diff --git a/docs/specs/2026-04-12-context-manager-design.md b/docs/specs/2026-04-12-context-manager-design.md
new file mode 100644
index 0000000..ece57e3
--- /dev/null
+++ b/docs/specs/2026-04-12-context-manager-design.md
@@ -0,0 +1,211 @@
+# Context manager: stronger pruning and earlier compaction
+
+## Status
+
+Approved for planning.
+
+## Problem
+
+`pi-context-manager` loads, but it does not materially cap live context growth.
+
+Observed behavior:
+- Context usage keeps climbing during long sessions.
+- The extension mostly distills old bulky `toolResult` output, but it keeps most older user and assistant turns intact.
+- Compaction still waits for Pi core's later reserve-token threshold.
+- Raw compaction and branch-summary artifacts can remain in live context even though the extension already persists and replays the same information through its own ledger.
+- Footer status noise is unnecessary for this package.
+
+## Goals
+
+1. Keep live context flatter during long sessions.
+2. Trigger compaction earlier than Pi core's default reserve-token threshold.
+3. Use lean resume injection, not raw latest compaction or branch-summary blobs.
+4. Remove the persistent footer status line.
+5. Preserve deterministic memory merging behavior, including stable exact-timestamp tie resolution.
+
+## Non-goals
+
+- Adding new LLM-facing tools.
+- Broad extractor or summarizer redesign unrelated to live-context pressure.
+- Editing `docs/extensions.md`.
+- Changing the existing deterministic same-slot tie-break contract in `src/ledger.ts`.
+
+## Constraints
+
+- Do not wait for Pi core's much later reserve-token threshold.
+- Resume injection should use lean packet text, not raw latest compaction or branch-summary blobs.
+- Exact-timestamp ties must resolve the same way regardless of candidate processing order.
+- Keep changes package-local and minimal.
+
+## Design
+
+### 1. Pre-filter raw summary artifacts from live context
+
+Before turn-aware pruning runs, the extension should drop raw `compactionSummary` and `branchSummary` messages from the `context` event payload.
+
+Reason:
+- These artifacts are already persisted.
+- They are already replayed into the extension ledger.
+- A separate hidden packet/resume message is already injected.
+- Keeping the raw artifacts in live context duplicates information and allows context growth even when the extension is active.
+
+Effect:
+- The model sees one lean synthesized checkpoint, not both the raw Pi summary artifacts and the extension's packet.
+- The screenshot symptom of noisy summary text leaking into context should disappear.
+
+### 2. Replace message-level pruning with turn-aware pruning
+
+`src/prune.ts` should stop acting like a bulky-tool-result filter with a weak recent-turn suffix. It should prune by conversation turn.
+
+#### Turn model
+
+A turn is the contiguous slice beginning at a `user` message and including the assistant reply plus any following `toolResult` messages until the next `user` message.
+
+Pruning rules:
+- Keep only the most recent turn suffix, with the exact turn count controlled by policy and zone.
+- Drop entire older turns instead of keeping their user and assistant messages forever.
+- Within kept but older turns, distill bulky `toolResult` messages to short summaries.
+- Keep the newest active turn lossless.
+- Never keep a `toolResult` without the surrounding kept turn.
+
+Why this shape:
+- It preserves tool ordering.
+- It prevents stale planning chatter from accumulating.
+- It makes packet injection the main mechanism for carrying older context.
+
+#### Policy shape
+
+Current `Policy.recentUserTurns` remains the main knob, but zone adjustments become stronger.
+
+Recommended practical behavior:
+- Green: keep a small suffix of recent turns.
+- Yellow: keep fewer turns and distill more aggressively.
+- Red/compact: keep only the newest turn or the smallest safe suffix.
+
+The exact numbers can be set during planning, but the direction is fixed: fewer full turns than today, with stronger tightening once pressure reaches yellow/red.
+
+### 3. Make resume injection lean
+
+`src/runtime.ts` currently builds resume text by prepending raw `lastCompactionSummary` and `lastBranchSummary`, then appending the ledger-based resume packet.
+
+That should change.
+
+New behavior:
+- `buildResumePacket()` returns only the lean ledger-rendered restart packet.
+- Raw persisted summary blobs remain stored in the snapshot for inspection and recovery, but they are not injected into model context.
+
+Why:
+- The ledger already extracts active goal, constraints, decisions, tasks, and blockers from summaries.
+- Re-injecting the full raw summaries duplicates content and defeats pruning.
+- Lean packet text is easier to bound and test.
+
+### 4. Trigger compaction earlier from the extension
+
+The extension should request compaction on its own once context pressure reaches the extension's red zone instead of waiting for Pi core's later `contextWindow - reserveTokens` check.
+
+#### Trigger point
+
+Use `turn_end` after usage is observed, because it provides the newest context measurement before the next model call.
+
+#### Gate behavior
+
+Compaction should trigger when:
+- `ctx.getContextUsage()?.tokens` is known, and
+- the runtime zone reaches `red` or worse under the extension's model-aware policy, and
+- a local latch/cooldown says a compaction request is not already in flight for the same pressure episode.
+
+Compaction should not spam:
+- only fire on threshold crossing or after a clear reset,
+- clear the latch after successful compaction or after tokens fall back below the trigger zone.
+
+Reason:
+- This is the earliest safe extension-controlled point that uses real usage data.
+- It directly satisfies the requirement to compact earlier than Pi core.
+
+### 5. Remove footer status noise
+
+Remove `ctx.ui.setStatus("context-manager", ...)` updates.
+
+Keep:
+- `/ctx-status` for explicit inspection.
+- Snapshot persistence and internal pressure tracking.
+
+Do not keep:
+- automatic footer text like `ctx green/yellow/red/compact`.
+
+## Data flow after the change
+
+1. `turn_end`
+   - Sync model context window.
+   - Observe actual context tokens.
+   - Persist snapshot.
+   - If the extension's red-zone compaction gate trips, request compaction immediately.
+
+2. `session_compact` / `session_tree`
+   - Persist summary artifacts to snapshot state.
+   - Re-ingest them into the ledger.
+   - Arm one-shot resume injection.
+
+3. `context`
+   - Remove raw `compactionSummary` and `branchSummary` messages from the outgoing message list.
+   - Prune remaining conversation by recent turn suffix.
+   - Distill bulky kept tool results when allowed by zone.
+   - Inject exactly one hidden lean packet or lean resume packet.
+
+## Testing
+
+### `src/prune.test.ts`
+
+Add coverage for:
+- dropping entire older turns instead of retaining their user and assistant messages,
+- preserving the newest turn intact,
+- distilling bulky tool results only inside older kept turns,
+- stronger suffix tightening as zone worsens,
+- preserving ordering safety for assistant/tool-result groups.
+
+### `src/runtime.test.ts`
+
+Add coverage for:
+- `buildResumePacket()` no longer containing raw `## Latest compaction handoff` or `## Latest branch handoff` sections,
+- lean resume output still containing current goal, task, constraints, decisions, and blockers extracted into the ledger.
+
+### `src/extension.test.ts`
+
+Add coverage for:
+- filtering raw `compactionSummary` and `branchSummary` messages from the `context` payload,
+- triggering extension-driven compaction once pressure reaches the red zone without waiting for Pi core's later threshold,
+- not repeatedly triggering compaction while a latch/cooldown is active,
+- no `context-manager` footer status writes,
+- `/ctx-status` still reporting mode, zone, packet size, and summary presence.
+
+## Acceptance criteria
+
+- Live context no longer grows monotonically just because older user and assistant turns remain in place.
+- The extension can compact before Pi core's reserve-token threshold.
+- Hidden injected context no longer includes raw full compaction or branch-summary blobs.
+- Raw summary artifact messages do not remain in live context after the extension has folded them into its own ledger.
+- Footer status line from this package is gone.
+- Existing deterministic ledger tie behavior remains unchanged.
+
+## Risks and mitigations
+
+### Risk: pruning drops too much recent context
+
+Mitigation:
+- keep the newest turn lossless,
+- keep packet generation focused on active goal, constraints, decisions, tasks, blockers,
+- cover turn-suffix behavior with targeted tests.
+
+### Risk: early compaction triggers too often
+
+Mitigation:
+- use threshold-crossing plus latch/cooldown behavior,
+- clear the latch only after compaction or a meaningful pressure drop,
+- test repeated `turn_end` events around the boundary.
+
+### Risk: summary filtering removes needed information
+
+Mitigation:
+- filter only raw `compactionSummary` and `branchSummary` message roles,
+- keep summary-derived facts in the ledger,
+- keep persisted raw summaries in snapshots for debug and recovery.
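
The threshold-crossing latch named in the early-compaction mitigation above could look roughly like the following. This is a minimal sketch and not the package's code: `GateState`, `zoneFor`, `shouldCompact`, `onCompacted`, and the `YELLOW_AT` / `RED_AT` ratios are all hypothetical names and values chosen for illustration, and the spec's `compact` zone is folded into `red` to keep the example small.

```typescript
// Hypothetical sketch of the red-zone compaction gate; no name here is taken
// from the real package.
type Zone = "green" | "yellow" | "red";

interface GateState {
  // True while a compaction request is outstanding for the current pressure episode.
  latched: boolean;
}

// Assumed thresholds, as fractions of the model context window.
const YELLOW_AT = 0.5;
const RED_AT = 0.7;

function zoneFor(tokens: number, contextWindow: number): Zone {
  const ratio = tokens / contextWindow;
  if (ratio >= RED_AT) return "red";
  if (ratio >= YELLOW_AT) return "yellow";
  return "green";
}

// Called on each turn_end. Returns true only when compaction should be requested now.
function shouldCompact(
  state: GateState,
  tokens: number | undefined,
  contextWindow: number,
): boolean {
  if (tokens === undefined) return false; // usage unknown: never fire
  if (zoneFor(tokens, contextWindow) !== "red") {
    state.latched = false; // pressure fell below the trigger zone: re-arm
    return false;
  }
  if (state.latched) return false; // already requested for this episode
  state.latched = true;
  return true;
}

// Called after a successful compaction to clear the latch.
function onCompacted(state: GateState): void {
  state.latched = false;
}

// Usage: six consecutive turn_end observations against a 1000-token window.
const gate: GateState = { latched: false };
const observed: Array<number | undefined> = [undefined, 400, 750, 800, 600, 720];
const fired = observed.map((t) => shouldCompact(gate, t, 1000));
console.log(fired); // [false, false, true, false, false, true]
```

The gate fires once on entering the red zone, stays quiet while pressure remains high, and re-arms only after usage drops below the trigger zone or `onCompacted` runs. That is the behavior the repeated-`turn_end` boundary tests are meant to pin down.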