docs: add presets/background agents spec

2026-04-12 10:58:44 +01:00
parent 86335c2971
commit 0438a7b384
2 changed files with 1738 additions and 0 deletions
--- a/docs/superpowers/specs/2026-04-12-subagent-presets-and-background-agents-design.md
+++ b/docs/superpowers/specs/2026-04-12-subagent-presets-and-background-agents-design.md
@@ -0,0 +1,440 @@
+# Subagent presets and background agents design
+
+Date: 2026-04-12
+Status: approved for planning
+
+## Summary
+
+Evolve `pi-subagents` from generic foreground-only child runs into a preset-driven delegation package with three tools:
+- `subagent` for single or parallel foreground runs
+- `background_agent` for detached process-backed runs
+- `background_agent_status` for polling detached-run status and counts
+
+Named customization comes back as markdown presets, but **without** bundled built-in roles. Presets live in:
+- global: `~/.pi/agent/subagents/*.md`
+- project: nearest `.pi/subagents/*.md` found by walking up from the current `cwd`
+
+Project presets override global presets by name. Every call must name a preset. The only preset setting a call can override is `model`; prompt text and tool access always come from the preset.
+
+## Current state
+
+Today the package has:
+- one `subagent` tool with `task | tasks | chain`
+- no background-run registry or polling tool
+- no preset discovery
+- no wrapper support for preset-owned prompt/tool restrictions
+- prompt templates that still assume `chain`
+
+The old role system was removed entirely, including markdown discovery, `--tools`, and `--append-system-prompt` wiring. This change brings back the useful customization pieces without reintroducing bundled roles or old role-specific behavior.
+
+## Goals
+
+- Remove `chain` mode from `subagent`.
+- Keep foreground single-run and parallel-run delegation.
+- Add named preset discovery from global and project markdown directories.
+- Let presets define appended prompt text, built-in tool allowlist, and optional default model.
+- Add `background_agent` as a process-only detached launcher.
+- Add `background_agent_status` so the parent agent can poll one run or many runs.
+- Track background-run counts and surface completion through UI notification plus a visible session message.
+- Preserve existing runner behavior for foreground runs and keep runner-specific changes minimal.
+
+## Non-goals
+
+- No bundled built-in presets (`scout`, `planner`, `reviewer`, `worker`, etc.).
+- No markdown discovery from `.pi/agents`.
+- No inline prompt or tool overrides on tool calls.
+- No background tmux mode. `background_agent` always uses the process runner.
+- No automatic follow-up turn when a background run finishes.
+- No attempt to restrict third-party extension tools beyond what Pi CLI already supports.
+
+## Chosen approach
+
+Add a small preset layer and a small background-run state layer on top of the current runner/wrapper core.
+
+- foreground `subagent` becomes preset-aware and loses `chain`
+- wrapper/artifacts regain preset-owned prompt/tool support
+- background runs reuse process-launch mechanics but skip `monitorRun()` in the calling tool
+- extension-owned registry watches detached runs, persists state to session custom entries, updates footer status text, emits UI notifications, and injects visible completion messages into session history
+
+This keeps most existing code paths intact while restoring customization and adding detached orchestration.
+
+## Public API design
+
+### `subagent`
+
+Supported modes:
+
+#### Single mode
+
+Required:
+- `preset`
+- `task`
+
+Optional:
+- `model`
+- `cwd`
+
+#### Parallel mode
+
+Required:
+- `tasks: Array<{ preset: string; task: string; model?: string; cwd?: string }>`
+
+Notes:
+- each parallel item names its own preset
+- there is no top-level default preset
+- there is no top-level required model
+
+Removed:
+- `chain`
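
The parameter shapes above could be validated roughly as follows. This is a minimal sketch: `RunSpec`, `ParsedCall`, `toRunSpec`, and `parseSubagentCall` are illustrative names assumed for this example, not the package's actual `src/schema.ts`.

```typescript
// Hypothetical shapes for `subagent` calls; names are illustrative, not the real schema.
type RunSpec = { preset: string; task: string; model?: string; cwd?: string };
type ParsedCall =
  | { mode: "single"; run: RunSpec }
  | { mode: "parallel"; runs: RunSpec[] };

// Validate one run item: `preset` and `task` are required, `model` and `cwd` are optional.
function toRunSpec(value: unknown): RunSpec {
  const v = (typeof value === "object" && value !== null ? value : {}) as Record<string, unknown>;
  const preset = v["preset"];
  const task = v["task"];
  if (typeof preset !== "string" || typeof task !== "string") {
    throw new Error("each run requires a preset and a task");
  }
  const spec: RunSpec = { preset, task };
  const model = v["model"];
  const cwd = v["cwd"];
  if (typeof model === "string") spec.model = model;
  if (typeof cwd === "string") spec.cwd = cwd;
  return spec;
}

// Classify a raw call as single or parallel and reject the removed `chain` mode.
function parseSubagentCall(raw: Record<string, unknown>): ParsedCall {
  if ("chain" in raw) throw new Error("chain mode has been removed");
  const tasks = raw["tasks"];
  if (Array.isArray(tasks)) {
    // Parallel mode: every item names its own preset; there is no top-level default.
    return { mode: "parallel", runs: tasks.map((item) => toRunSpec(item)) };
  }
  return { mode: "single", run: toRunSpec(raw) };
}
```

Returning a discriminated union lets the caller branch on `mode` without re-checking which fields are present.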
+
+Model resolution order per run:
+1. call-level `model`
+2. preset `model`
+3. error: no model resolved
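
The resolution order above can be sketched as a single function. `resolveModel` is a hypothetical helper name; the sketch assumes an absent value is represented as `undefined`.

```typescript
// Resolve the effective model for one run:
// 1. call-level `model`, 2. preset `model`, 3. error.
function resolveModel(callModel: string | undefined, presetModel: string | undefined): string {
  const resolved = callModel ?? presetModel;
  if (resolved === undefined) {
    throw new Error("no model resolved: neither the call nor the preset supplies a model");
  }
  return resolved;
}
```

In parallel mode the same function would run once per item, since each task resolves its own preset and model.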
+
+### `background_agent`
+
+Single-run only.
+
+Required:
+- `preset`
+- `task`
+
+Optional:
+- `model`
+- `cwd`
+
+Behavior:
+- always launches with the process runner, ignoring tmux config
+- returns immediately after spawn request
+- returns run handle metadata plus counts snapshot
+- does not wait for completion
+
+### `background_agent_status`
+
+Purpose:
+- let the main agent poll background runs and inspect counts
+
+Parameters:
+- `runId?` — inspect one run
+- `includeCompleted?` — default `false`; when omitted, only active runs are listed unless `runId` is provided
+
+Returns:
+- counts: `running`, `completed`, `failed`, `aborted`, `total`
+- per-run rows with preset, task, cwd, model info, timestamps, artifact paths, and status
+- final result fields when a run is terminal
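
The listing rules for `runId` and `includeCompleted` might look like the sketch below. The row type is trimmed to three fields, and treating `includeCompleted` as covering every terminal status (`completed`, `failed`, `aborted`) is an assumption this spec does not spell out.

```typescript
type RunStatus = "running" | "completed" | "failed" | "aborted";
type RunRow = { runId: string; preset: string; status: RunStatus };
type StatusQuery = { runId?: string; includeCompleted?: boolean };

// Pick the rows a status call should return.
function selectRows(rows: RunRow[], query: StatusQuery): RunRow[] {
  // An explicit runId is always returned, whatever its status.
  if (query.runId !== undefined) return rows.filter((r) => r.runId === query.runId);
  // Assumption: includeCompleted lists every terminal run, not only `completed` ones.
  if (query.includeCompleted === true) return rows;
  // Default: active runs only.
  return rows.filter((r) => r.status === "running");
}
```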
+
+## Preset design
+
+### Discovery
+
+Load presets from two sources:
+- global: `join(getAgentDir(), "subagents")`
+- project: nearest ancestor directory containing `.pi/subagents`
+
+Merge order:
+1. global presets
+2. project presets override global presets with the same `name`
+
+No confirmation gate for project presets.
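
A minimal sketch of the project-side lookup and the merge order. It assumes POSIX-style absolute paths and takes the directory check as an injected `dirExists` callback so the example needs no filesystem imports; `findProjectPresetDir` and `mergePresets` are hypothetical names.

```typescript
type Preset = { name: string; description: string; source: "global" | "project" };

// Walk up from `cwd` and return the nearest `.pi/subagents` directory, if any.
function findProjectPresetDir(cwd: string, dirExists: (path: string) => boolean): string | undefined {
  let dir = cwd.replace(/\/+$/, ""); // "/" becomes "", so the root candidate is "/.pi/subagents"
  for (;;) {
    const candidate = `${dir}/.pi/subagents`;
    if (dirExists(candidate)) return candidate;
    const idx = dir.lastIndexOf("/");
    if (idx < 0) return undefined; // the root was just checked
    dir = dir.slice(0, idx);
  }
}

// Global presets first, then project presets override entries with the same name.
function mergePresets(globalPresets: Preset[], projectPresets: Preset[]): Map<string, Preset> {
  const merged = new Map<string, Preset>();
  for (const p of globalPresets) merged.set(p.name, p);
  for (const p of projectPresets) merged.set(p.name, p);
  return merged;
}
```

Injecting `dirExists` also keeps the walk easy to test against a fake directory tree.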
+
+### File format
+
+Each preset is one markdown file with frontmatter and body.
+
+Required frontmatter:
+- `name`
+- `description`
+
+Optional frontmatter:
+- `model`
+- `tools`
+
+Body:
+- appended system prompt text
+
+Example:
+
+```md
+---
+name: repo-scout
+description: Fast repo exploration
+model: github-copilot/gpt-4o
+tools: read,grep,find,ls
+---
+You are a scout. Explore quickly, summarize clearly, and avoid implementation.
+```
+
+### Preset semantics
+
+- `model` is the default model when the call does not provide one.
+- `tools` is optional; omitting it leaves normal child-tool behavior in place (no built-in tool restriction).
+- when `tools` is present, pass it through Pi CLI `--tools`, which limits built-in tools only
+- prompt text comes only from the markdown body; no inline prompt override
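
A preset file in the format above could be read with a sketch like this one. It handles only flat `key: value` frontmatter, which is all the example uses, and is not a full YAML parser; `parsePreset` and `ParsedPreset` are hypothetical names.

```typescript
type ParsedPreset = {
  name: string;
  description: string;
  model?: string;
  tools?: string[];
  systemPrompt: string;
};

// Parse one preset file: flat frontmatter between `---` lines, then the prompt body.
function parsePreset(text: string): ParsedPreset {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(text);
  if (match === null) throw new Error("preset is missing frontmatter");
  const fields = new Map<string, string>();
  for (const line of (match[1] ?? "").split(/\r?\n/)) {
    const colon = line.indexOf(":");
    if (colon > 0) fields.set(line.slice(0, colon).trim(), line.slice(colon + 1).trim());
  }
  const presetName = fields.get("name");
  const description = fields.get("description");
  if (!presetName || !description) throw new Error("preset requires name and description");
  const parsed: ParsedPreset = {
    name: presetName,
    description,
    systemPrompt: (match[2] ?? "").trim(),
  };
  const model = fields.get("model");
  if (model) parsed.model = model;
  // Omitted `tools` stays undefined, meaning no built-in tool restriction.
  const tools = fields.get("tools");
  if (tools) parsed.tools = tools.split(",").map((t) => t.trim()).filter((t) => t.length > 0);
  return parsed;
}
```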
+
+## Runtime design
+
+### Foreground subagent execution
+
+`src/tool.ts` becomes preset-aware.
+
+For each run:
+1. discover presets
+2. resolve the named preset
+3. normalize explicit `model` override against available models if present
+4. normalize preset default model against available models if used
+5. compute effective model from call override or preset default
+6. pass runner metadata including preset, prompt text, built-in tool allowlist, and model selection
+
+Parallel behavior stays the same apart from:
+- no `chain`
+- each task resolving its own preset/model
+- summary lines identifying tasks by index and/or preset, not old role names
+
+### Background execution
+
+`background_agent` launches the wrapper via process-runner primitives but returns immediately.
+
+Flow:
+1. resolve preset + effective model
+2. create run artifacts
+3. spawn wrapper process
+4. register run in extension background registry as `running`
+5. append persistent session entry for the new run
+6. start detached watcher on that run’s `result.json` / `events.jsonl`
+7. return handle metadata and counts snapshot
+
+Background runs are process-only even when normal `subagent` foreground runs are configured for tmux.
+
+### Background registry
+
+The extension owns a session-scoped registry keyed by `runId`.
+
+Stored metadata per run:
+- `runId`
+- `preset`
+- `task`
+- `cwd`
+- `requestedModel`
+- `resolvedModel`
+- artifact paths
+- timestamps
+- terminal result fields when available
+- status: `running | completed | failed | aborted`
+
+The registry also computes counts:
+- `running`
+- `completed`
+- `failed`
+- `aborted`
+- `total`
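
A registry with these counts could be as small as the sketch below. The stored metadata is trimmed to three fields, and the class and method names are assumptions rather than the planned `src/background-registry.ts` API.

```typescript
type BgStatus = "running" | "completed" | "failed" | "aborted";
type BgRun = { runId: string; preset: string; task: string; status: BgStatus };
type BgCounts = Record<BgStatus, number> & { total: number };

class BackgroundRegistry {
  private readonly runs = new Map<string, BgRun>();

  // New runs always start as `running`.
  register(run: Omit<BgRun, "status">): void {
    this.runs.set(run.runId, { ...run, status: "running" });
  }

  // Record a terminal status; returns false for unknown or already-terminal runs.
  finish(runId: string, status: Exclude<BgStatus, "running">): boolean {
    const run = this.runs.get(runId);
    if (run === undefined || run.status !== "running") return false;
    run.status = status;
    return true;
  }

  // Counts are derived from the stored runs on demand, so they cannot drift.
  counts(): BgCounts {
    const counts: BgCounts = { running: 0, completed: 0, failed: 0, aborted: 0, total: 0 };
    this.runs.forEach((run) => {
      counts[run.status] += 1;
      counts.total += 1;
    });
    return counts;
  }
}
```

Having `finish` refuse a second terminal transition is one way to guarantee a single completion notification per run.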
+
+### Persistence and reload behavior
+
+Persist background state with session custom entries.
+
+Custom entry types:
+- `pi-subagents:bg-run` — initial launch metadata
+- `pi-subagents:bg-update` — later status/result transitions
+
+On `session_start`, rebuild the in-memory registry by scanning `ctx.sessionManager.getEntries()`.
+
+For rebuilt runs that are still non-terminal:
+- if `result.json` already exists, ingest it immediately
+- otherwise reattach a watcher so completion still updates counts, notifications, and session messages after reload/resume
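
Rebuilding amounts to replaying the two entry types in order. In the sketch below the entry shape (`customType` plus `data`) is a hypothetical stand-in, since this spec does not define what `ctx.sessionManager.getEntries()` returns, and a real scan would first filter out unrelated entries.

```typescript
// Hypothetical entry shapes; the real session entry type may differ.
type BgEntry =
  | { customType: "pi-subagents:bg-run"; data: { runId: string; preset: string; task: string } }
  | { customType: "pi-subagents:bg-update"; data: { runId: string; status: "completed" | "failed" | "aborted" } };

type RebuiltRun = {
  runId: string;
  preset: string;
  task: string;
  status: "running" | "completed" | "failed" | "aborted";
};

// Replay launch and update entries in session order to rebuild the registry state.
function rebuildRuns(entries: BgEntry[]): Map<string, RebuiltRun> {
  const runs = new Map<string, RebuiltRun>();
  for (const entry of entries) {
    if (entry.customType === "pi-subagents:bg-run") {
      runs.set(entry.data.runId, { ...entry.data, status: "running" });
    } else {
      const run = runs.get(entry.data.runId);
      if (run !== undefined) run.status = entry.data.status;
    }
  }
  return runs;
}

// Runs still `running` after replay need `result.json` ingestion or a reattached watcher.
function pendingRunIds(runs: Map<string, RebuiltRun>): string[] {
  const pending: string[] = [];
  runs.forEach((run) => {
    if (run.status === "running") pending.push(run.runId);
  });
  return pending;
}
```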
+
+## Notification and polling design
+
+### Completion notification
+
+When a detached run becomes terminal:
+1. update registry and counts
+2. append `pi-subagents:bg-update`
+3. update footer status text, e.g. `bg: 2 running / 5 total`
+4. emit UI notification if UI is available
+5. inject a visible custom session message describing completion
+
+The completion message must **not** trigger a new agent turn automatically.
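
Steps 2 to 5 above can be sketched with the side effects injected as callbacks. `CompletionSinks` and its members are hypothetical stand-ins for whatever the extension API provides, and the caller is assumed to have already updated the registry and counts (step 1).

```typescript
type FooterCounts = { running: number; total: number };

// Compact footer text, e.g. `bg: 2 running / 5 total`.
function formatFooter(counts: FooterCounts): string {
  return `bg: ${counts.running} running / ${counts.total} total`;
}

// Hypothetical sinks for the side effects of a terminal transition.
type CompletionSinks = {
  appendEntry: (customType: string, data: unknown) => void;
  setFooter: (text: string) => void;
  notify?: (text: string) => void; // absent when no UI is available
  addMessage: (text: string) => void; // visible message; must not start a new agent turn
};

function onRunTerminal(runId: string, outcome: string, counts: FooterCounts, sinks: CompletionSinks): void {
  sinks.appendEntry("pi-subagents:bg-update", { runId, status: outcome });
  sinks.setFooter(formatFooter(counts));
  const text = `background run ${runId} ${outcome}`;
  sinks.notify?.(text);
  sinks.addMessage(text);
}
```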
+
+### Polling
+
+The parent agent polls with `background_agent_status`.
+
+Typical use:
+- ask for current running count
+- list active background runs
+- inspect one `runId`
+- fetch terminal result summary after notification arrives
+
+## Wrapper and artifact design
+
+### Artifact layout
+
+Keep run directories under:
+- `.pi/subagents/runs/<runId>/`
+
+Keep existing files and restore the removed prompt artifact, `system-prompt.md`, which is written only when the preset supplies prompt text:
+- `meta.json`
+- `events.jsonl`
+- `result.json`
+- `stdout.log`
+- `stderr.log`
+- `transcript.log`
+- `child-session.jsonl`
+- `system-prompt.md`
+
+### Metadata
+
+Keep existing generic bookkeeping and add preset-specific fields:
+- `preset`
+- `presetSource`
+- `tools`
+- `systemPrompt`
+- `systemPromptPath`
+
+Do not reintroduce old bundled-role concepts or role-only behavior.
+
+### Child wrapper
+
+`src/wrapper/cli.mjs` should again support:
+- `--append-system-prompt <path>` when preset prompt text exists
+- `--tools <csv>` when preset `tools` exists
+
+Keep:
+- `PI_SUBAGENTS_CHILD=1`
+- github-copilot initiator behavior based on effective model
+- best-effort artifact appends that must never block writing `result.json`
+- semantic-completion exit handling
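
The two conditional flags and the child marker could be assembled as in this sketch. Only `--append-system-prompt`, `--tools`, and `PI_SUBAGENTS_CHILD=1` come from this spec; the function names and the `ChildLaunch` shape are assumptions.

```typescript
type ChildLaunch = { systemPromptPath?: string; tools?: string[] };

// Add preset-owned flags only when the preset actually defines them.
function buildPresetArgs(launch: ChildLaunch): string[] {
  const args: string[] = [];
  if (launch.systemPromptPath !== undefined) {
    args.push("--append-system-prompt", launch.systemPromptPath);
  }
  if (launch.tools !== undefined && launch.tools.length > 0) {
    args.push("--tools", launch.tools.join(","));
  }
  return args;
}

// Mark the child environment so the package skips tool registration inside child runs.
function childEnv(base: Record<string, string | undefined>): Record<string, string | undefined> {
  return { ...base, PI_SUBAGENTS_CHILD: "1" };
}
```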
+
+## Extension registration behavior
+
+Keep existing model-registration behavior for model-dependent tools:
+- preserve current available-model order for schema enums
+- do not mutate available-model arrays when deduping cache keys
+- re-register when model set changes
+- do not re-register when model set is the same in different order
+- if the first observed set is empty, a later non-empty set must still register
+- skip tool registration entirely when `PI_SUBAGENTS_CHILD=1`
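
The re-registration rules above reduce to comparing an order-insensitive key per model set. The sketch below is one possible shape, with `modelSetKey` and `RegistrationGate` as hypothetical names; it sorts a copy so the caller's array, and therefore the schema enum order, is left untouched.

```typescript
// Order-insensitive key for a model set; works on a copy so the input is never mutated.
function modelSetKey(models: readonly string[]): string {
  return Array.from(new Set(models)).sort().join("\n");
}

class RegistrationGate {
  private lastKey: string | undefined;

  // Returns true when tools should be registered or re-registered for this model list.
  shouldRegister(models: readonly string[], env: Record<string, string | undefined> = {}): boolean {
    if (env["PI_SUBAGENTS_CHILD"] === "1") return false; // child sessions never register
    // An empty set registers nothing and is not remembered, so a later non-empty set still registers.
    if (models.length === 0) return false;
    const key = modelSetKey(models);
    if (key === this.lastKey) return false; // same set, possibly in a different order
    this.lastKey = key;
    return true;
  }
}
```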
+
+Register:
+- `subagent`
+- `background_agent`
+- `background_agent_status`
+
+`background_agent_status` does not need model parameters, but registration still follows the package’s existing model-availability gate for consistency.
+
+## Prompt and documentation design
+
+Rewrite shipped prompts so they no longer mention `chain` mode.
+
+They should instead describe repeated `subagent` calls or `subagent.tasks` parallel calls, for example:
+- inspect with a preset
+- turn findings into a plan with another preset
+- implement or review in separate follow-up calls
+
+README and docs should describe:
+- preset directories and markdown format
+- `background_agent`
+- `background_agent_status`
+- background completion notification behavior
+- background count tracking
+- process-only behavior for detached runs
+- built-in-tool-only semantics of preset `tools`
+
+They should continue to avoid claiming bundled built-in roles.
+
+## File-level impact
+
+Expected new files:
+- `src/presets.ts`
+- `src/presets.test.ts`
+- `src/background-registry.ts`
+- `src/background-registry.test.ts`
+- `src/background-schema.ts`
+- `src/background-tool.ts`
+- `src/background-tool.test.ts`
+- `src/background-status-tool.ts`
+- `src/background-status-tool.test.ts`
+
+Expected modified files:
+- `index.ts`
+- `src/schema.ts`
+- `src/tool.ts`
+- `src/models.ts`
+- `src/runner.ts` and/or `src/process-runner.ts`
+- `src/artifacts.ts`
+- `src/process-runner.test.ts`
+- `src/extension.test.ts`
+- `src/artifacts.test.ts`
+- `src/wrapper/cli.mjs`
+- `src/wrapper/cli.test.ts`
+- `src/wrapper/render.mjs`
+- `src/wrapper/render.test.ts`
+- `src/prompts.test.ts`
+- `README.md`
+- `prompts/*.md`
+- `AGENTS.md`
+
+Expected removals:
+- `src/tool-chain.test.ts`
+
+## Testing plan
+
+Implementation should follow TDD.
+
+### New coverage
+
+Add tests for:
+- preset discovery from global and project directories
+- project preset override by name
+- required preset selection in single and parallel mode
+- model resolution from call override vs preset default
+- error when neither call nor preset supplies a valid model
+- chain removal from schema and runtime
+- detached background launch returning immediately
+- background registry counts
+- session-entry persistence and reload reconstruction
+- completion notification emitting UI notify + visible session message without auto-turn
+- polling one background run and many background runs
+
+### Preserved coverage
+
+Keep regression coverage for:
+- child sessions skipping subagent tool registration when `PI_SUBAGENTS_CHILD=1`
+- no tool registration when no models are available
+- later registration when a non-empty model list appears
+- no re-registration for the same model set in different order
+- re-registration when the model set changes
+- github-copilot initiator behavior
+- best-effort artifact logging never preventing `result.json` writes
+- effective model using the resolved model when the requested and resolved models differ
+
+## Risks and mitigations
+
+### Risk: preset `tools` sounds broader than Pi can enforce
+
+Mitigation:
+- document that preset `tools` maps to Pi CLI `--tools`
+- treat it as a built-in tool allowlist only
+- keep `PI_SUBAGENTS_CHILD=1` so this package never re-registers its own subagent tools inside child runs
+
+### Risk: detached watcher state lost on reload
+
+Mitigation:
+- persist launch/update events as session custom entries
+- rebuild registry on `session_start`
+- reattach watchers for non-terminal runs
+
+### Risk: background notifications spam the user
+
+Mitigation:
+- emit one terminal notification per run
+- keep footer count compact
+- require explicit polling for detailed inspection
+
+### Risk: larger extension entrypoint
+
+Mitigation:
+- keep preset discovery, registry/state, and tool definitions in separate focused modules
+- keep runner-specific logic in existing runner files with minimal changes