diff --git a/docs/superpowers/specs/2026-04-12-generic-subagents-design.md b/docs/superpowers/specs/2026-04-12-generic-subagents-design.md new file mode 100644 index 0000000..21c4425 --- /dev/null +++ b/docs/superpowers/specs/2026-04-12-generic-subagents-design.md @@ -0,0 +1,286 @@ +# Generic subagents design + +Date: 2026-04-12 +Status: approved for planning + +## Summary + +Simplify `pi-subagents` from named, specialized child agents into a single generic `subagent` concept. + +A subagent run should be a normal Pi child session that receives: +- a delegated `task` +- a selected `model` +- an optional `cwd` + +It should **not** receive a role-specific system prompt, role-specific tool restrictions, built-in role behavior, or markdown-discovered agent definitions. + +This is an intentional breaking API cleanup, not a compatibility shim. + +## Current state + +Today the package supports: +- built-in named roles in `src/builtin-agents.ts` (`scout`, `planner`, `reviewer`, `worker`) +- optional markdown-discovered user/project agents in `src/agents.ts` +- role-derived `systemPrompt`, `tools`, and optional default `model` +- prompt templates in `prompts/` that explicitly chain those named roles + +That surface no longer matches the desired product behavior. + +## Goals + +- Make every child run a generic Pi agent. +- Remove named/specialized agent concepts from the package API and runtime. +- Preserve the existing runner, wrapper, model-registration, and artifact mechanics where they are still valid. +- Keep parallel and chain execution modes. +- Keep explicit model selection and validation. + +## Non-goals + +- No compatibility layer that silently accepts old role names. +- No replacement specialization layer. +- No change to process-vs-tmux runner selection. +- No change to package identity or env names (`PI_SUBAGENTS_*`). + +## Chosen approach + +Use a hard simplification: +- remove named agent definitions entirely +- remove markdown agent discovery entirely +- make child runs task/model/cwd only +- rewrite workflow prompts so they describe generic multi-step delegation rather than named roles + +This matches the intended behavior exactly and removes misleading concepts instead of preserving dead abstractions. + +## Public API design + +### Tool modes + +#### Single mode + +Required: +- `task` +- `model` + +Optional: +- `cwd` + +#### Parallel mode + +Required: +- top-level `model` +- `tasks: Array<{ task: string; model?: string; cwd?: string }>` + +The top-level `model` remains the default for tasks that do not specify `model`. + +#### Chain mode + +Required: +- top-level `model` +- `chain: Array<{ task: string; model?: string; cwd?: string }>` + +`{previous}` substitution remains supported. + +### Removed fields + +Remove from the tool schema: +- `agent` +- `tasks[].agent` +- `chain[].agent` +- `agentScope` +- `confirmProjectAgents` + +### Model fields + +Keep: +- required top-level `model` +- optional per-task/per-step `model` +- optional per-task/per-step `cwd` + +Keep the current schema descriptions that explain models must come from the currently available models. + +## Runtime design + +### Tool execution + +`src/tool.ts` should stop discovering agents entirely. + +Instead, each run is built directly from the requested task item: +- resolve the effective model from `task.model ?? params.model` +- pass generic run metadata to the configured runner +- do not attach `systemPrompt` +- do not attach `tools` + +Parallel and chain execution semantics stay the same apart from the simplified payload shape. + +### Child wrapper + +`src/wrapper/cli.mjs` remains a generic Pi launcher. + +Keep: +- `pi --mode json --session ... --model ... ` +- effective model selection based on resolved model when present +- `PI_SUBAGENTS_CHILD=1` on every child run +- GitHub Copilot initiator behavior only for effective models under `github-copilot/*` +- best-effort artifact appends that must never block writing `result.json` + +Remove: +- `--append-system-prompt` +- any dependence on agent-role metadata + +### Runner behavior + +Keep process runner and tmux runner behavior unchanged except for the smaller metadata payload. + +No change to: +- process runner as default +- tmux as opt-in +- tmux requiring `tmux` on `PATH` only when the tmux runner is selected +- result monitoring and artifact paths + +## Metadata and artifact design + +Keep stable run bookkeeping fields such as: +- `runId` +- `mode` +- `taskIndex` +- `step` +- `task` +- `cwd` +- `requestedModel` +- `resolvedModel` +- `sessionPath` +- artifact paths +- exit status and stop-reason fields + +Remove role-derived fields from runtime payloads and outputs: +- `agent` +- `agentSource` +- `systemPrompt` +- `systemPromptPath` +- any role-only transcript rendering + +Transcript headers should stay generic and continue to show useful run metadata, but should no longer print `Agent: ...`. + +Artifact directory structure should stay stable under `.pi/subagents/runs//` unless a removed role-specific file is no longer needed. + +## Extension registration behavior + +Keep the current model-registration behavior exactly: +- do not register the tool when no models are available +- preserve original available-model order for schema enums +- dedupe registration using an order-insensitive lowercase-sorted copy for the cache key +- re-register when the model set changes +- do not re-register when the model set is the same in a different order +- if the first observed set is empty, a later non-empty set must still register + +Keep the child-session guard in `index.ts`: +- provider override may still register first +- subagent tool registration must still be skipped when `PI_SUBAGENTS_CHILD=1` + +## Prompt and documentation design + +Keep the shipped workflow prompt templates, but rewrite them as generic orchestration helpers. + +They must no longer reference: +- `scout` +- `planner` +- `reviewer` +- `worker` + +Instead, they should instruct the parent agent to run generic subagents whose tasks describe the intended step, for example: +- inspect +- plan +- implement +- review +- revise + +README and package-facing docs should describe: +- generic child Pi sessions +- the unchanged runner options +- the simplified subagent concept + +They should stop describing specialized built-in roles or markdown agent discovery. + +## File-level impact + +Expected primary runtime changes: +- modify `src/schema.ts` +- modify `src/tool.ts` +- modify `src/artifacts.ts` +- modify `src/process-runner.ts` +- modify `src/wrapper/cli.mjs` +- modify `src/wrapper/render.mjs` +- modify `README.md` +- modify `prompts/*.md` + +Expected removals if implementation chooses the full cleanup path: +- `src/builtin-agents.ts` +- `src/agents.ts` +- agent-discovery-specific tests + +`index.ts`, `src/process-runner.ts`, and `src/tmux-runner.ts` should otherwise keep the smallest possible runner-specific changes. + +## Testing plan + +Implementation should follow TDD: +1. write or update the failing tests first +2. verify the failures are for the intended reason +3. implement the minimal runtime changes +4. re-run targeted tests +5. run full `npm test` + +### Tests to preserve + +Keep dedicated coverage for: +- every child run setting `PI_SUBAGENTS_CHILD=1` +- non-copilot models not receiving the copilot initiator flag +- the effective model using the resolved model when requested/resolved differ +- best-effort artifact logging never preventing `result.json` from being written +- no registration on empty model lists +- later registration on a non-empty model list +- no re-registration for the same model set in different orders +- re-registration when the model set changes + +### Tests to rewrite + +- `src/tool.test.ts`: generic task payloads, no named agents +- `src/wrapper/render.test.ts`: no `Agent: scout` assertion +- `src/artifacts.test.ts` and `src/process-runner.test.ts`: no role/system-prompt assumptions +- `src/prompts.test.ts`: prompts still ship but no longer rely on named roles +- README/package tests: documentation still accurate after the simplification + +### Tests to remove if modules are removed + +- `src/agents.test.ts` + +## Risks and mitigations + +### Risk: breaking downstream callers + +Mitigation: +- treat the change as explicit API breakage +- update shipped prompts/docs/tests in the same change +- avoid a fake compatibility layer that hides the break + +### Risk: over-editing runner code + +Mitigation: +- keep changes focused on schema, tool execution, prompt handling, and artifact metadata +- preserve shared runner behavior unless role metadata is the only reason a field exists + +### Risk: stale role wording in tests/docs + +Mitigation: +- explicitly scrub `scout`, `planner`, `reviewer`, and `worker` from prompts and user-facing tests +- update transcript/header expectations to generic wording + +## Acceptance criteria + +The design is satisfied when: +- the `subagent` tool no longer requires or accepts agent names +- child runs are launched as normal Pi sessions with task/model/cwd only +- no custom system prompt is injected into child runs +- no agent discovery logic remains in the runtime path +- model validation and registration behavior still passes all existing regressions +- shipped prompts/docs describe generic subagents only +- tests no longer rely on the removed role names or prompt files for behavior