# Context manager: stronger pruning and earlier compaction

## Status

Approved for planning.

## Problem

`pi-context-manager` loads, but it does not materially cap live context growth.

Observed behavior:

- Context usage keeps climbing during long sessions.
- The extension mostly distills old bulky `toolResult` output, but it keeps most older user and assistant turns intact.
- Compaction still waits for Pi core's later reserve-token threshold.
- Raw compaction and branch-summary artifacts can remain in live context even though the extension already persists and replays the same information through its own ledger.
- Footer status noise is unnecessary for this package.

## Goals

1. Keep live context flatter during long sessions.
2. Trigger compaction earlier than Pi core's default reserve-token threshold.
3. Use lean resume injection, not raw latest compaction or branch-summary blobs.
4. Remove the persistent footer status line.
5. Preserve deterministic memory merging behavior, including stable exact-timestamp tie resolution.

## Non-goals

- Adding new LLM-facing tools.
- Broad extractor or summarizer redesign unrelated to live-context pressure.
- Editing `docs/extensions.md`.
- Changing the existing deterministic same-slot tie-break contract in `src/ledger.ts`.

## Constraints

- Do not wait for Pi core's much later reserve-token threshold.
- Resume injection should use lean packet text, not raw latest compaction or branch-summary blobs.
- Exact-timestamp ties must resolve the same way regardless of candidate processing order.
- Keep changes package-local and minimal.

## Design

### 1. Pre-filter raw summary artifacts from live context

Before turn-aware pruning runs, the extension should drop raw `compactionSummary` and `branchSummary` messages from the `context` event payload.

Reason:

- These artifacts are already persisted.
- They are already replayed into the extension ledger.
- A separate hidden packet/resume message is already injected.
- Keeping the raw artifacts in live context duplicates information and allows context growth even when the extension is active.

Effect:

- The model sees one lean synthesized checkpoint, not both the raw Pi summary artifacts and the extension's packet.
- The screenshot symptom of noisy summary text leaking into context should disappear.

### 2. Replace message-level pruning with turn-aware pruning

`src/prune.ts` should stop acting like a bulky-tool-result filter with a weak recent-turn suffix. It should prune by conversation turn.

#### Turn model

A turn is the contiguous slice beginning at a `user` message and including the assistant reply plus any following `toolResult` messages until the next `user` message.

Pruning rules:

- Keep only the most recent turn suffix, with the exact turn count controlled by policy and zone.
- Drop entire older turns instead of keeping their user and assistant messages forever.
- Within kept but older turns, distill bulky `toolResult` messages to short summaries.
- Keep the newest active turn lossless.
- Never keep a `toolResult` without the surrounding kept turn.

Why this shape:

- It preserves tool ordering.
- It prevents stale planning chatter from accumulating.
- It makes packet injection the main mechanism for carrying older context.

#### Policy shape

Current `Policy.recentUserTurns` remains the main knob, but zone adjustments become stronger.

Recommended practical behavior:

- Green: keep a small suffix of recent turns.
- Yellow: keep fewer turns and distill more aggressively.
- Red/compact: keep only the newest turn or the smallest safe suffix.

The exact numbers can be set during planning, but the direction is fixed: fewer full turns than today, with stronger tightening once pressure reaches yellow/red.

### 3. Make resume injection lean

`src/runtime.ts` currently builds resume text by prepending raw `lastCompactionSummary` and `lastBranchSummary`, then appending the ledger-based resume packet. That should change.

New behavior:

- `buildResumePacket()` returns only the lean ledger-rendered restart packet.
- Raw persisted summary blobs remain stored in the snapshot for inspection and recovery, but they are not injected into model context.

Why:

- The ledger already extracts active goal, constraints, decisions, tasks, and blockers from summaries.
- Re-injecting the full raw summaries duplicates content and defeats pruning.
- Lean packet text is easier to bound and test.

### 4. Trigger compaction earlier from the extension

The extension should request compaction on its own once context pressure reaches the extension's red zone instead of waiting for Pi core's later `contextWindow - reserveTokens` check.

#### Trigger point

Use `turn_end` after usage is observed, because it provides the newest context measurement before the next model call.

#### Gate behavior

Compaction should trigger when:

- `ctx.getContextUsage()?.tokens` is known, and
- the runtime zone reaches `red` or worse under the extension's model-aware policy, and
- a local latch/cooldown says a compaction request is not already in flight for the same pressure episode.

Compaction should not spam:

- only fire on threshold crossing or after a clear reset,
- clear the latch after successful compaction or after tokens fall back below the trigger zone.

Reason:

- This is the earliest safe extension-controlled point that uses real usage data.
- It directly satisfies the requirement to compact earlier than Pi core.

### 5. Remove footer status noise

Remove `ctx.ui.setStatus("context-manager", ...)` updates.

Keep:

- `/ctx-status` for explicit inspection.
- Snapshot persistence and internal pressure tracking.

Do not keep:

- automatic footer text like `ctx green/yellow/red/compact`.

## Data flow after the change

1. `turn_end`
   - Sync model context window.
   - Observe actual context tokens.
   - Persist snapshot.
   - If the extension's red-zone compaction gate trips, request compaction immediately.
2. `session_compact` / `session_tree`
   - Persist summary artifacts to snapshot state.
   - Re-ingest them into the ledger.
   - Arm one-shot resume injection.
3. `context`
   - Remove raw `compactionSummary` and `branchSummary` messages from the outgoing message list.
   - Prune remaining conversation by recent turn suffix.
   - Distill bulky kept tool results when allowed by zone.
   - Inject exactly one hidden lean packet or lean resume packet.

## Testing

### `src/prune.test.ts`

Add coverage for:

- dropping entire older turns instead of retaining their user and assistant messages,
- preserving the newest turn intact,
- distilling bulky tool results only inside older kept turns,
- stronger suffix tightening as zone worsens,
- preserving ordering safety for assistant/tool-result groups.

### `src/runtime.test.ts`

Add coverage for:

- `buildResumePacket()` no longer containing raw `## Latest compaction handoff` or `## Latest branch handoff` sections,
- lean resume output still containing current goal, task, constraints, decisions, and blockers extracted into the ledger.

### `src/extension.test.ts`

Add coverage for:

- filtering raw `compactionSummary` and `branchSummary` messages from the `context` payload,
- triggering extension-driven compaction once pressure reaches the red zone without waiting for Pi core's later threshold,
- not repeatedly triggering compaction while a latch/cooldown is active,
- no `context-manager` footer status writes,
- `/ctx-status` still reporting mode, zone, packet size, and summary presence.

## Acceptance criteria

- Live context no longer grows monotonically just because older user and assistant turns remain in place.
- The extension can compact before Pi core's reserve-token threshold.
- Hidden injected context no longer includes raw full compaction or branch-summary blobs.
- Raw summary artifact messages do not remain in live context after the extension has folded them into its own ledger.
- Footer status line from this package is gone.
- Existing deterministic ledger tie behavior remains unchanged.

## Risks and mitigations

### Risk: pruning drops too much recent context

Mitigation:

- keep the newest turn lossless,
- keep packet generation focused on active goal, constraints, decisions, tasks, blockers,
- cover turn-suffix behavior with targeted tests.

### Risk: early compaction triggers too often

Mitigation:

- use threshold-crossing plus latch/cooldown behavior,
- clear the latch only after compaction or a meaningful pressure drop,
- test repeated `turn_end` events around the boundary.

### Risk: summary filtering removes needed information

Mitigation:

- filter only raw `compactionSummary` and `branchSummary` message roles,
- keep summary-derived facts in the ledger,
- keep persisted raw summaries in snapshots for debug and recovery.
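
## Appendix: turn-aware pruning sketch

The pre-filter from Design section 1 and the turn model from Design section 2 can be illustrated with a minimal sketch. It is not the actual `src/prune.ts` implementation. The `Message` shape, the `BULKY_CHARS` threshold, the distilled-summary text, and the names `splitIntoTurns`, `distill`, and `pruneByTurns` are all assumptions made for illustration; the real Pi message types and policy values are not shown in this document.

```typescript
// Hypothetical message shape; the real Pi types may differ.
type Role = "user" | "assistant" | "toolResult" | "compactionSummary" | "branchSummary";

interface Message {
  role: Role;
  text: string;
}

// Illustrative threshold only; the design leaves exact numbers to planning.
const BULKY_CHARS = 200;

// Group messages into turns: each turn starts at a `user` message and runs
// until the next `user` message. Messages before the first `user` message
// form a leading group of their own.
function splitIntoTurns(messages: Message[]): Message[][] {
  const turns: Message[][] = [];
  for (const msg of messages) {
    if (msg.role === "user" || turns.length === 0) {
      turns.push([]);
    }
    turns[turns.length - 1].push(msg);
  }
  return turns;
}

// Replace a bulky tool result with a short placeholder summary.
function distill(msg: Message): Message {
  if (msg.role !== "toolResult" || msg.text.length <= BULKY_CHARS) {
    return msg;
  }
  return { role: msg.role, text: `[distilled tool result, ${msg.text.length} chars]` };
}

// Keep only the last `keepTurns` turns (at least one). Older kept turns have
// bulky tool results distilled; the newest turn is returned untouched.
function pruneByTurns(messages: Message[], keepTurns: number): Message[] {
  // Drop raw summary artifacts before turn-aware pruning runs.
  const filtered = messages.filter(
    (m) => m.role !== "compactionSummary" && m.role !== "branchSummary",
  );
  const kept = splitIntoTurns(filtered).slice(-Math.max(1, keepTurns));
  const out: Message[] = [];
  kept.forEach((turn, i) => {
    const isNewest = i === kept.length - 1;
    for (const msg of turn) {
      out.push(isNewest ? msg : distill(msg));
    }
  });
  return out;
}
```

Because whole turns are kept or dropped together, a `toolResult` can never survive without its surrounding turn, and message order inside a kept turn is preserved. In this sketch a zone-aware policy would only need to choose `keepTurns`, for example a smaller value in yellow than in green and `1` in red/compact.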
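
## Appendix: compaction gate sketch

The gate behavior from Design section 4 can be sketched as a small latch. This is only an illustration under stated assumptions: the `CompactionGate` class and its method names are hypothetical, the zone is assumed to be computed elsewhere by the extension's model-aware policy, and the caller is assumed to pass in the value of `ctx.getContextUsage()?.tokens` on each `turn_end`.

```typescript
// Zone names follow the green/yellow/red/compact labels used in this document.
type Zone = "green" | "yellow" | "red" | "compact";

// Hypothetical latch: fires once per pressure episode.
class CompactionGate {
  private latched = false;

  // Call on every turn_end. Returns true when compaction should be requested.
  onTurnEnd(tokens: number | undefined, zone: Zone): boolean {
    // Without a real usage measurement, do nothing and leave the latch alone.
    if (tokens === undefined) {
      return false;
    }
    const inTriggerZone = zone === "red" || zone === "compact";
    if (!inTriggerZone) {
      // Pressure fell back below the trigger zone: reset for the next episode.
      this.latched = false;
      return false;
    }
    if (this.latched) {
      // A request is already in flight for this pressure episode.
      return false;
    }
    this.latched = true;
    return true;
  }

  // Call after a successful compaction to re-arm the gate.
  onCompacted(): void {
    this.latched = false;
  }
}
```

With this shape, repeated `turn_end` events inside the red zone produce a single request, and the gate re-arms only after compaction succeeds or pressure drops out of the trigger zone. A time-based cooldown could be layered on top, but it is not modeled here.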