# Context manager: stronger pruning and earlier compaction

## Status

Approved for planning.

## Problem

`pi-context-manager` loads, but it does not materially cap live context growth.

Observed behavior:
- Context usage keeps climbing during long sessions.
- The extension mostly distills old bulky `toolResult` output, but it keeps most older user and assistant turns intact.
- Compaction still waits for Pi core's later reserve-token threshold.
- Raw compaction and branch-summary artifacts can remain in live context even though the extension already persists and replays the same information through its own ledger.
- Footer status noise is unnecessary for this package.

## Goals

1. Keep live context flatter during long sessions.
2. Trigger compaction earlier than Pi core's default reserve-token threshold.
3. Use lean resume injection, not raw latest compaction or branch-summary blobs.
4. Remove the persistent footer status line.
5. Preserve deterministic memory merging behavior, including stable exact-timestamp tie resolution.

## Non-goals

- Adding new LLM-facing tools.
- Broad extractor or summarizer redesign unrelated to live-context pressure.
- Editing `docs/extensions.md`.
- Changing the existing deterministic same-slot tie-break contract in `src/ledger.ts`.

## Constraints

- Do not wait for Pi core's much later reserve-token threshold.
- Resume injection should use lean packet text, not raw latest compaction or branch-summary blobs.
- Exact-timestamp ties must resolve the same way regardless of candidate processing order.
- Keep changes package-local and minimal.

## Design

### 1. Pre-filter raw summary artifacts from live context

Before turn-aware pruning runs, the extension should drop raw `compactionSummary` and `branchSummary` messages from the `context` event payload.

Reason:
- These artifacts are already persisted.
- They are already replayed into the extension ledger.
- A separate hidden packet/resume message is already injected.
- Keeping the raw artifacts in live context duplicates information and allows context growth even when the extension is active.

Effect:
- The model sees one lean synthesized checkpoint, not both the raw Pi summary artifacts and the extension's packet.
- The screenshot symptom of noisy summary text leaking into context should disappear.

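As a rough illustration of the pre-filter, the sketch below drops the two raw summary roles before any other pruning happens. The `Message` shape and the `dropRawSummaryArtifacts` name are assumptions made for this example; the real `context` event payload in Pi may carry different fields.

```typescript
// Hypothetical message shape for illustration; the real payload type may differ.
type Role =
  | "user"
  | "assistant"
  | "toolResult"
  | "compactionSummary"
  | "branchSummary";

interface Message {
  role: Role;
  content: string;
}

// Only these two roles are filtered; everything else passes through untouched.
const RAW_SUMMARY_ROLES: ReadonlySet<Role> = new Set<Role>([
  "compactionSummary",
  "branchSummary",
]);

// Returns a new array without raw summary artifacts, preserving message order.
function dropRawSummaryArtifacts(messages: readonly Message[]): Message[] {
  return messages.filter((m) => !RAW_SUMMARY_ROLES.has(m.role));
}
```

Filtering by role only, rather than by content heuristics, keeps the step easy to test and matches the mitigation listed under "Risks and mitigations".
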
### 2. Replace message-level pruning with turn-aware pruning

`src/prune.ts` should stop acting like a bulky-tool-result filter with a weak recent-turn suffix. It should prune by conversation turn.

#### Turn model

A turn is the contiguous slice beginning at a `user` message and including the assistant reply plus any following `toolResult` messages until the next `user` message.

Pruning rules:
- Keep only the most recent turn suffix, with the exact turn count controlled by policy and zone.
- Drop entire older turns instead of keeping their user and assistant messages forever.
- Within kept but older turns, distill bulky `toolResult` messages to short summaries.
- Keep the newest active turn lossless.
- Never keep a `toolResult` without the surrounding kept turn.

Why this shape:
- It preserves tool ordering.
- It prevents stale planning chatter from accumulating.
- It makes packet injection the main mechanism for carrying older context.

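The turn model and pruning rules above can be sketched as follows. This is a minimal sketch under stated assumptions, not the actual `src/prune.ts`: the `Message` shape, the `splitIntoTurns`, `distill`, and `pruneByTurn` names, and the character-based `bulkyChars` threshold are all hypothetical.

```typescript
// Hypothetical message shape for illustration only.
type Role = "user" | "assistant" | "toolResult";

interface Message {
  role: Role;
  content: string;
}

// Group messages into turns: each turn starts at a `user` message.
// Messages that appear before the first `user` message form a leading turn.
function splitIntoTurns(messages: readonly Message[]): Message[][] {
  const turns: Message[][] = [];
  let current: Message[] | undefined;
  for (const m of messages) {
    if (m.role === "user" || current === undefined) {
      current = [m];
      turns.push(current);
    } else {
      current.push(m);
    }
  }
  return turns;
}

// Placeholder distiller: keeps a size note plus a short prefix of the output.
function distill(content: string): string {
  return `[distilled toolResult, ${content.length} chars] ${content.slice(0, 80)}`;
}

// Keep the last `keepTurns` turns. The newest turn stays lossless; bulky
// tool results in older kept turns are distilled. Whole turns are dropped,
// so a `toolResult` never survives without its surrounding turn.
function pruneByTurn(
  messages: readonly Message[],
  keepTurns: number,
  bulkyChars = 2000,
): Message[] {
  const kept = splitIntoTurns(messages).slice(-Math.max(1, keepTurns));
  const out: Message[] = [];
  kept.forEach((turn, i) => {
    const isNewest = i === kept.length - 1;
    for (const m of turn) {
      const bulky = m.role === "toolResult" && m.content.length > bulkyChars;
      out.push(!isNewest && bulky ? { ...m, content: distill(m.content) } : m);
    }
  });
  return out;
}
```

Because pruning operates on whole turns, assistant and tool-result ordering inside each kept turn is never rearranged.
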
#### Policy shape

Current `Policy.recentUserTurns` remains the main knob, but zone adjustments become stronger.

Recommended practical behavior:
- Green: keep a small suffix of recent turns.
- Yellow: keep fewer turns and distill more aggressively.
- Red/compact: keep only the newest turn or the smallest safe suffix.

The exact numbers can be set during planning, but the direction is fixed: fewer full turns than today, with stronger tightening once pressure reaches yellow/red.

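One possible shape for the zone adjustment is sketched below. The halving in yellow and the single-turn floor in red/compact are placeholder numbers chosen for this example, since the exact values are left to planning; the `turnsToKeep` name and the reduced `Policy` interface are likewise assumptions.

```typescript
type Zone = "green" | "yellow" | "red" | "compact";

// Only the field this sketch needs; the real Policy type has more to it.
interface Policy {
  recentUserTurns: number;
}

// Map policy plus pressure zone to the number of full turns to keep.
// The concrete ratios here are illustrative placeholders.
function turnsToKeep(policy: Policy, zone: Zone): number {
  const base = Math.max(1, Math.floor(policy.recentUserTurns));
  if (zone === "green") return base;
  if (zone === "yellow") return Math.max(1, Math.ceil(base / 2));
  return 1; // red and compact: newest turn only
}
```

Whatever numbers are chosen, the function should be monotonic: a worse zone never keeps more turns than a better one.
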
### 3. Make resume injection lean

`src/runtime.ts` currently builds resume text by prepending raw `lastCompactionSummary` and `lastBranchSummary`, then appending the ledger-based resume packet.

That should change.

New behavior:
- `buildResumePacket()` returns only the lean ledger-rendered restart packet.
- Raw persisted summary blobs remain stored in the snapshot for inspection and recovery, but they are not injected into model context.

Why:
- The ledger already extracts active goal, constraints, decisions, tasks, and blockers from summaries.
- Re-injecting the full raw summaries duplicates content and defeats pruning.
- Lean packet text is easier to bound and test.

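A minimal sketch of the lean behavior, assuming a simplified ledger with one list per category. The `Ledger` and `Snapshot` shapes, the section titles, and this signature for `buildResumePacket` are illustrative guesses rather than the actual `src/runtime.ts` API; only the `lastCompactionSummary` and `lastBranchSummary` field names come from the text above.

```typescript
// Hypothetical, simplified shapes for illustration.
interface Ledger {
  activeGoal?: string;
  constraints: string[];
  decisions: string[];
  tasks: string[];
  blockers: string[];
}

interface Snapshot {
  ledger: Ledger;
  // Raw blobs stay in the snapshot for inspection and recovery only.
  lastCompactionSummary?: string;
  lastBranchSummary?: string;
}

// Render one section as a heading plus bullets; empty sections are omitted.
function renderSection(title: string, items: readonly string[]): string[] {
  if (items.length === 0) return [];
  return [`## ${title}`, ...items.map((item) => `- ${item}`)];
}

// Build the resume text from the ledger alone. The raw summary fields on
// the snapshot are deliberately never read here.
function buildResumePacket(snapshot: Snapshot): string {
  const { ledger } = snapshot;
  return [
    ...renderSection("Goal", ledger.activeGoal ? [ledger.activeGoal] : []),
    ...renderSection("Constraints", ledger.constraints),
    ...renderSection("Decisions", ledger.decisions),
    ...renderSection("Tasks", ledger.tasks),
    ...renderSection("Blockers", ledger.blockers),
  ].join("\n");
}
```

Since the packet depends only on ledger contents, its size can be bounded by capping ledger entries, independent of how large the raw summaries grow.
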
### 4. Trigger compaction earlier from the extension

The extension should request compaction on its own once context pressure reaches the extension's red zone instead of waiting for Pi core's later `contextWindow - reserveTokens` check.

#### Trigger point

Use `turn_end` after usage is observed, because it provides the newest context measurement before the next model call.

#### Gate behavior

Compaction should trigger when:
- `ctx.getContextUsage()?.tokens` is known, and
- the runtime zone reaches `red` or worse under the extension's model-aware policy, and
- a local latch/cooldown says a compaction request is not already in flight for the same pressure episode.

Compaction should not spam:
- only fire on threshold crossing or after a clear reset,
- clear the latch after successful compaction or after tokens fall back below the trigger zone.

Reason:
- This is the earliest safe extension-controlled point that uses real usage data.
- It directly satisfies the requirement to compact earlier than Pi core.

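The gate conditions above can be sketched as a small latch. The `CompactionGate` class, its method names, and the zone ranking (treating `compact` as worse than `red`) are assumptions for illustration; how the extension actually requests compaction from Pi is not shown.

```typescript
type Zone = "green" | "yellow" | "red" | "compact";

// Assumed ordering of pressure zones, from least to most severe.
const ZONE_RANK: Record<Zone, number> = {
  green: 0,
  yellow: 1,
  red: 2,
  compact: 3,
};

class CompactionGate {
  private latched = false;

  // Call on each `turn_end` with the observed tokens and current zone.
  // Returns true at most once per pressure episode.
  shouldRequest(tokens: number | undefined, zone: Zone): boolean {
    if (tokens === undefined) return false; // usage unknown: do nothing
    if (ZONE_RANK[zone] < ZONE_RANK.red) {
      this.latched = false; // pressure fell below the trigger zone
      return false;
    }
    if (this.latched) return false; // request already in flight
    this.latched = true;
    return true;
  }

  // Call after a successful compaction to allow a future request.
  onCompacted(): void {
    this.latched = false;
  }
}
```

A caller would pass `ctx.getContextUsage()?.tokens` and the runtime zone into `shouldRequest` and issue the compaction request only when it returns `true`.
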
### 5. Remove footer status noise

Remove `ctx.ui.setStatus("context-manager", ...)` updates.

Keep:
- `/ctx-status` for explicit inspection.
- Snapshot persistence and internal pressure tracking.

Do not keep:
- automatic footer text like `ctx green/yellow/red/compact`.

## Data flow after the change

1. `turn_end`
   - Sync model context window.
   - Observe actual context tokens.
   - Persist snapshot.
   - If the extension's red-zone compaction gate trips, request compaction immediately.

2. `session_compact` / `session_tree`
   - Persist summary artifacts to snapshot state.
   - Re-ingest them into the ledger.
   - Arm one-shot resume injection.

3. `context`
   - Remove raw `compactionSummary` and `branchSummary` messages from the outgoing message list.
   - Prune remaining conversation by recent turn suffix.
   - Distill bulky kept tool results when allowed by zone.
   - Inject exactly one hidden lean packet or lean resume packet.

## Testing

### `src/prune.test.ts`

Add coverage for:
- dropping entire older turns instead of retaining their user and assistant messages,
- preserving the newest turn intact,
- distilling bulky tool results only inside older kept turns,
- stronger suffix tightening as zone worsens,
- preserving ordering safety for assistant/tool-result groups.

### `src/runtime.test.ts`

Add coverage for:
- `buildResumePacket()` no longer containing raw `## Latest compaction handoff` or `## Latest branch handoff` sections,
- lean resume output still containing current goal, task, constraints, decisions, and blockers extracted into the ledger.

### `src/extension.test.ts`

Add coverage for:
- filtering raw `compactionSummary` and `branchSummary` messages from the `context` payload,
- triggering extension-driven compaction once pressure reaches the red zone without waiting for Pi core's later threshold,
- not repeatedly triggering compaction while a latch/cooldown is active,
- no `context-manager` footer status writes,
- `/ctx-status` still reporting mode, zone, packet size, and summary presence.

## Acceptance criteria

- Live context no longer grows monotonically just because older user and assistant turns remain in place.
- The extension can compact before Pi core's reserve-token threshold.
- Hidden injected context no longer includes raw full compaction or branch-summary blobs.
- Raw summary artifact messages do not remain in live context after the extension has folded them into its own ledger.
- Footer status line from this package is gone.
- Existing deterministic ledger tie behavior remains unchanged.

## Risks and mitigations

### Risk: pruning drops too much recent context

Mitigation:
- keep the newest turn lossless,
- keep packet generation focused on active goal, constraints, decisions, tasks, blockers,
- cover turn-suffix behavior with targeted tests.

### Risk: early compaction triggers too often

Mitigation:
- use threshold-crossing plus latch/cooldown behavior,
- clear the latch only after compaction or a meaningful pressure drop,
- test repeated `turn_end` events around the boundary.

### Risk: summary filtering removes needed information

Mitigation:
- filter only raw `compactionSummary` and `branchSummary` message roles,
- keep summary-derived facts in the ledger,
- keep persisted raw summaries in snapshots for debug and recovery.