docs: add generic subagents design spec

2026-04-12 06:38:34 +01:00
parent 8f69569b8f
commit b5474d064b
1 changed files with 286 additions and 0 deletions
--- a/docs/superpowers/specs/2026-04-12-generic-subagents-design.md
+++ b/docs/superpowers/specs/2026-04-12-generic-subagents-design.md
@@ -0,0 +1,286 @@
+# Generic subagents design
+
+Date: 2026-04-12
+Status: approved for planning
+
+## Summary
+
+Simplify `pi-subagents` from named, specialized child agents into a single generic `subagent` concept.
+
+A subagent run should be a normal Pi child session that receives:
+- a delegated `task`
+- a selected `model`
+- an optional `cwd`
+
+It should **not** receive a role-specific system prompt, role-specific tool restrictions, built-in role behavior, or markdown-discovered agent definitions.
+
+This is an intentional breaking API cleanup, not a compatibility shim.
+
+## Current state
+
+Today the package supports:
+- built-in named roles in `src/builtin-agents.ts` (`scout`, `planner`, `reviewer`, `worker`)
+- optional markdown-discovered user/project agents in `src/agents.ts`
+- role-derived `systemPrompt`, `tools`, and optional default `model`
+- prompt templates in `prompts/` that explicitly chain those named roles
+
+That surface no longer matches the desired product behavior.
+
+## Goals
+
+- Make every child run a generic Pi agent.
+- Remove named/specialized agent concepts from the package API and runtime.
+- Preserve the existing runner, wrapper, model-registration, and artifact mechanics where they are still valid.
+- Keep parallel and chain execution modes.
+- Keep explicit model selection and validation.
+
+## Non-goals
+
+- No compatibility layer that silently accepts old role names.
+- No replacement specialization layer.
+- No change to process-vs-tmux runner selection.
+- No change to package identity or env names (`PI_SUBAGENTS_*`).
+
+## Chosen approach
+
+Use a hard simplification:
+- remove named agent definitions entirely
+- remove markdown agent discovery entirely
+- make child runs task/model/cwd only
+- rewrite workflow prompts so they describe generic multi-step delegation rather than named roles
+
+This matches the intended behavior exactly and removes misleading concepts instead of preserving dead abstractions.
+
+## Public API design
+
+### Tool modes
+
+#### Single mode
+
+Required:
+- `task`
+- `model`
+
+Optional:
+- `cwd`
+
+#### Parallel mode
+
+Required:
+- top-level `model`
+- `tasks: Array<{ task: string; model?: string; cwd?: string }>`
+
+The top-level `model` remains the default for tasks that do not specify `model`.
+
+#### Chain mode
+
+Required:
+- top-level `model`
+- `chain: Array<{ task: string; model?: string; cwd?: string }>`
+
+`{previous}` substitution remains supported.
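+
+As a rough illustration of the `{previous}` substitution, here is a minimal sketch. The helper name `substitutePrevious` and the choice to replace every occurrence are assumptions for illustration; this spec only states that the placeholder stays supported.
+
+```typescript
+// Hypothetical helper: the spec only says `{previous}` remains supported.
+const PREVIOUS_PLACEHOLDER = "{previous}";
+
+function substitutePrevious(task: string, previousOutput: string): string {
+  // split/join inserts the previous output literally, so sequences such as
+  // `$&` inside it are not treated as replacement patterns.
+  return task.split(PREVIOUS_PLACEHOLDER).join(previousOutput);
+}
+
+// Example chain: the second step consumes the first step's output.
+const steps = ["List the TODO comments", "Summarize these findings: {previous}"];
+const secondTask = substitutePrevious(steps[1], "3 TODOs in src/tool.ts");
+```
+
+A literal replacement matters here because child output is arbitrary text and may contain characters that a pattern-based replace would interpret.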
+
+### Removed fields
+
+Remove from the tool schema:
+- `agent`
+- `tasks[].agent`
+- `chain[].agent`
+- `agentScope`
+- `confirmProjectAgents`
+
+### Model fields
+
+Keep:
+- required top-level `model`
+- optional per-task/per-step `model`
+- optional per-task/per-step `cwd`
+
+Keep the current schema descriptions that explain models must come from the currently available models.
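+
+The parameter shapes above could be written down roughly as follows. This is a sketch, not the actual contents of `src/schema.ts`: the type names, the `findRemovedFields` helper, and the example model id are all hypothetical.
+
+```typescript
+// Illustrative types for the three tool modes described in this spec.
+interface TaskItem {
+  task: string;
+  model?: string;
+  cwd?: string;
+}
+
+type SubagentParams =
+  | { task: string; model: string; cwd?: string } // single mode
+  | { model: string; tasks: TaskItem[] } // parallel mode
+  | { model: string; chain: TaskItem[] }; // chain mode
+
+// Top-level field names this spec removes from the tool schema.
+const REMOVED_TOP_LEVEL = ["agent", "agentScope", "confirmProjectAgents"];
+
+// Hypothetical check: lists removed fields still present in raw input,
+// so callers get an explicit error instead of a silent compatibility path.
+function findRemovedFields(input: Record<string, unknown>): string[] {
+  const found = REMOVED_TOP_LEVEL.filter((key) => key in input);
+  for (const listKey of ["tasks", "chain"]) {
+    const items = input[listKey];
+    if (!Array.isArray(items)) continue;
+    items.forEach((item, index) => {
+      if (item !== null && typeof item === "object" && "agent" in item) {
+        found.push(`${listKey}[${index}].agent`);
+      }
+    });
+  }
+  return found;
+}
+
+// "provider/model-a" is a placeholder, not a real model id.
+const example: SubagentParams = {
+  model: "provider/model-a",
+  tasks: [{ task: "inspect the build scripts" }],
+};
+```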
+
+## Runtime design
+
+### Tool execution
+
+`src/tool.ts` should stop discovering agents entirely.
+
+Instead, each run is built directly from the requested task item:
+- resolve the effective model from `task.model ?? params.model`
+- pass generic run metadata to the configured runner
+- do not attach `systemPrompt`
+- do not attach `tools`
+
+Parallel and chain execution semantics stay the same apart from the simplified payload shape.
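+
+The per-item model resolution described above might look like this. It is a sketch under stated assumptions: `RunRequest` and `buildRunRequests` are hypothetical names, and the real payload in `src/tool.ts` may carry more bookkeeping fields.
+
+```typescript
+interface TaskItem {
+  task: string;
+  model?: string;
+  cwd?: string;
+}
+
+// Hypothetical generic run request: note there is no systemPrompt and no tools.
+interface RunRequest {
+  taskIndex: number;
+  task: string;
+  cwd?: string;
+  requestedModel: string;
+}
+
+function buildRunRequests(defaultModel: string, items: TaskItem[]): RunRequest[] {
+  return items.map((item, taskIndex) => ({
+    taskIndex,
+    task: item.task,
+    cwd: item.cwd,
+    // The per-item model wins; otherwise fall back to the top-level model.
+    requestedModel: item.model ?? defaultModel,
+  }));
+}
+
+// Placeholder model ids, used only for illustration.
+const requests = buildRunRequests("provider/model-a", [
+  { task: "inspect" },
+  { task: "review", model: "provider/model-b" },
+]);
+```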
+
+### Child wrapper
+
+`src/wrapper/cli.mjs` remains a generic Pi launcher.
+
+Keep:
+- `pi --mode json --session ... --model ... <task>`
+- effective model selection that uses the resolved model when one is present and otherwise falls back to the requested model
+- `PI_SUBAGENTS_CHILD=1` on every child run
+- GitHub Copilot initiator behavior only for effective models under `github-copilot/*`
+- best-effort artifact appends that must never block writing `result.json`
+
+Remove:
+- `--append-system-prompt`
+- any dependence on agent-role metadata
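+
+A minimal sketch of how the wrapper could assemble the child invocation after this change. The `ChildRun` shape and `buildChildInvocation` are hypothetical, and passing `sessionPath` as the `--session` value is an assumption, since the command line above elides that argument. The Copilot initiator mechanism itself is not shown because this spec does not describe it.
+
+```typescript
+// Hypothetical shape; field names mirror the bookkeeping fields in this spec.
+interface ChildRun {
+  task: string;
+  sessionPath: string;
+  requestedModel: string;
+  resolvedModel?: string;
+}
+
+function effectiveModel(run: ChildRun): string {
+  // Prefer the resolved model when present.
+  return run.resolvedModel ?? run.requestedModel;
+}
+
+function buildChildInvocation(run: ChildRun) {
+  const model = effectiveModel(run);
+  return {
+    command: "pi",
+    // Assumption: the session path is the value given to --session.
+    // There is no --append-system-prompt argument any more.
+    args: ["--mode", "json", "--session", run.sessionPath, "--model", model, run.task],
+    // Set on every child run so the child skips subagent tool registration.
+    env: { PI_SUBAGENTS_CHILD: "1" },
+    // Copilot-specific behavior applies only to github-copilot/* models.
+    isCopilot: model.startsWith("github-copilot/"),
+  };
+}
+```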
+
+### Runner behavior
+
+Keep process runner and tmux runner behavior unchanged except for the smaller metadata payload.
+
+No change to:
+- process runner as default
+- tmux as opt-in
+- tmux requiring `tmux` on `PATH` only when the tmux runner is selected
+- result monitoring and artifact paths
+
+## Metadata and artifact design
+
+Keep stable run bookkeeping fields such as:
+- `runId`
+- `mode`
+- `taskIndex`
+- `step`
+- `task`
+- `cwd`
+- `requestedModel`
+- `resolvedModel`
+- `sessionPath`
+- artifact paths
+- exit status and stop-reason fields
+
+Remove role-derived fields from runtime payloads and outputs:
+- `agent`
+- `agentSource`
+- `systemPrompt`
+- `systemPromptPath`
+- any role-only transcript rendering
+
+Transcript headers should stay generic and continue to show useful run metadata, but should no longer print `Agent: ...`.
+
+Artifact directory structure should stay stable under `.pi/subagents/runs/<runId>/`; the only expected change is dropping role-specific files that are no longer needed.
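+
+For illustration, a generic transcript header could be rendered along these lines. The header layout, the `RunMetadata` subset, and the mode string values are assumptions; the actual format lives in `src/wrapper/render.mjs` and is not reproduced here.
+
+```typescript
+// Hypothetical subset of the bookkeeping fields listed above.
+interface RunMetadata {
+  runId: string;
+  mode: "single" | "parallel" | "chain"; // assumed string values
+  task: string;
+  cwd?: string;
+  requestedModel: string;
+  resolvedModel?: string;
+}
+
+function renderHeader(meta: RunMetadata): string {
+  const lines = [
+    `Run: ${meta.runId}`,
+    `Mode: ${meta.mode}`,
+    `Model: ${meta.resolvedModel ?? meta.requestedModel}`,
+  ];
+  // cwd is optional, so only print it when it was provided.
+  if (meta.cwd) lines.push(`Cwd: ${meta.cwd}`);
+  lines.push(`Task: ${meta.task}`);
+  // Deliberately no "Agent: ..." line.
+  return lines.join("\n");
+}
+```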
+
+## Extension registration behavior
+
+Keep the current model-registration behavior exactly:
+- do not register the tool when no models are available
+- preserve original available-model order for schema enums
+- dedupe registration using an order-insensitive lowercase-sorted copy for the cache key
+- re-register when the model set changes
+- do not re-register when the model set is the same in a different order
+- if the first observed set is empty, a later non-empty set must still register
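+
+The registration rules above can be sketched as a small tracker. This is an illustrative sketch rather than the existing implementation: `modelSetKey` and `RegistrationTracker` are hypothetical names, and the behavior after a non-empty set is followed by an empty one is not specified here.
+
+```typescript
+// Order-insensitive cache key: lowercase each id, sort a copy, then join.
+// The input array is left untouched so schema enums keep their original order.
+function modelSetKey(models: string[]): string {
+  return models.map((m) => m.toLowerCase()).sort().join("\n");
+}
+
+class RegistrationTracker {
+  private lastKey: string | undefined;
+
+  // Returns true when the caller should register (or re-register) the tool.
+  shouldRegister(models: string[]): boolean {
+    // Empty set: do not register and do not store a key, so a later
+    // non-empty set still triggers registration.
+    if (models.length === 0) return false;
+    const key = modelSetKey(models);
+    if (key === this.lastKey) return false; // same set, any order
+    this.lastKey = key;
+    return true;
+  }
+}
+```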
+
+Keep the child-session guard in `index.ts`:
+- provider override may still register first
+- subagent tool registration must still be skipped when `PI_SUBAGENTS_CHILD=1`
+
+## Prompt and documentation design
+
+Keep the shipped workflow prompt templates, but rewrite them as generic orchestration helpers.
+
+They must no longer reference:
+- `scout`
+- `planner`
+- `reviewer`
+- `worker`
+
+Instead, they should instruct the parent agent to run generic subagents whose tasks describe the intended step, for example:
+- inspect
+- plan
+- implement
+- review
+- revise
+
+README and package-facing docs should describe:
+- generic child Pi sessions
+- the unchanged runner options
+- the simplified subagent concept
+
+They should stop describing specialized built-in roles or markdown agent discovery.
+
+## File-level impact
+
+Expected primary changes (runtime, docs, and prompts):
+- modify `src/schema.ts`
+- modify `src/tool.ts`
+- modify `src/artifacts.ts`
+- modify `src/process-runner.ts`
+- modify `src/wrapper/cli.mjs`
+- modify `src/wrapper/render.mjs`
+- modify `README.md`
+- modify `prompts/*.md`
+
+Expected removals if implementation chooses the full cleanup path:
+- `src/builtin-agents.ts`
+- `src/agents.ts`
+- agent-discovery-specific tests
+
+Beyond the metadata changes above, `index.ts`, `src/process-runner.ts`, and `src/tmux-runner.ts` should receive only the smallest runner-specific changes needed.
+
+## Testing plan
+
+Implementation should follow TDD:
+1. write or update the failing tests first
+2. verify that each test fails for the intended reason
+3. implement the minimal runtime changes
+4. re-run targeted tests
+5. run full `npm test`
+
+### Tests to preserve
+
+Keep dedicated coverage for:
+- every child run setting `PI_SUBAGENTS_CHILD=1`
+- non-copilot models not receiving the copilot initiator flag
+- the effective model using the resolved model when requested/resolved differ
+- best-effort artifact logging never preventing `result.json` from being written
+- no registration on empty model lists
+- later registration on a non-empty model list
+- no re-registration for the same model set in different orders
+- re-registration when the model set changes
+
+### Tests to rewrite
+
+- `src/tool.test.ts`: generic task payloads, no named agents
+- `src/wrapper/render.test.ts`: no `Agent: scout` assertion
+- `src/artifacts.test.ts` and `src/process-runner.test.ts`: no role/system-prompt assumptions
+- `src/prompts.test.ts`: prompts still ship but no longer rely on named roles
+- README/package tests: documentation still accurate after the simplification
+
+### Tests to remove if modules are removed
+
+- `src/agents.test.ts`
+
+## Risks and mitigations
+
+### Risk: breaking downstream callers
+
+Mitigation:
+- treat the change as explicit API breakage
+- update shipped prompts/docs/tests in the same change
+- avoid a fake compatibility layer that hides the break
+
+### Risk: over-editing runner code
+
+Mitigation:
+- keep changes focused on schema, tool execution, prompt handling, and artifact metadata
+- preserve shared runner behavior unless role metadata is the only reason a field exists
+
+### Risk: stale role wording in tests/docs
+
+Mitigation:
+- explicitly scrub `scout`, `planner`, `reviewer`, and `worker` from prompts and user-facing tests
+- update transcript/header expectations to generic wording
+
+## Acceptance criteria
+
+The design is satisfied when:
+- the `subagent` tool no longer requires or accepts agent names
+- child runs are launched as normal Pi sessions with task/model/cwd only
+- no custom system prompt is injected into child runs
+- no agent discovery logic remains in the runtime path
+- model validation and registration behavior still pass all existing regression tests
+- shipped prompts/docs describe generic subagents only
+- tests no longer rely on the removed role names or prompt files for behavior