Files
pi-subagents/docs/superpowers/specs/2026-04-12-generic-subagents-design.md

287 lines
8.6 KiB
Markdown

# Generic subagents design
Date: 2026-04-12
Status: approved for planning
## Summary
Simplify `pi-subagents` from named, specialized child agents into a single generic `subagent` concept.
A subagent run should be a normal Pi child session that receives:
- a delegated `task`
- a selected `model`
- an optional `cwd`
It should **not** receive a role-specific system prompt, role-specific tool restrictions, built-in role behavior, or markdown-discovered agent definitions.
This is an intentional breaking API cleanup, not a compatibility shim.
## Current state
Today the package supports:
- built-in named roles in `src/builtin-agents.ts` (`scout`, `planner`, `reviewer`, `worker`)
- optional markdown-discovered user/project agents in `src/agents.ts`
- role-derived `systemPrompt`, `tools`, and optional default `model`
- prompt templates in `prompts/` that explicitly chain those named roles
That surface no longer matches the desired product behavior.
## Goals
- Make every child run a generic Pi agent.
- Remove named/specialized agent concepts from the package API and runtime.
- Preserve the existing runner, wrapper, model-registration, and artifact mechanics where they are still valid.
- Keep parallel and chain execution modes.
- Keep explicit model selection and validation.
## Non-goals
- No compatibility layer that silently accepts old role names.
- No replacement specialization layer.
- No change to process-vs-tmux runner selection.
- No change to package identity or env names (`PI_SUBAGENTS_*`).
## Chosen approach
Use a hard simplification:
- remove named agent definitions entirely
- remove markdown agent discovery entirely
- make child runs task/model/cwd only
- rewrite workflow prompts so they describe generic multi-step delegation rather than named roles
This matches the intended behavior exactly and removes misleading concepts instead of preserving dead abstractions.
## Public API design
### Tool modes
#### Single mode
Required:
- `task`
- `model`
Optional:
- `cwd`
#### Parallel mode
Required:
- top-level `model`
- `tasks: Array<{ task: string; model?: string; cwd?: string }>`
The top-level `model` remains the default for tasks that do not specify `model`.
#### Chain mode
Required:
- top-level `model`
- `chain: Array<{ task: string; model?: string; cwd?: string }>`
`{previous}` substitution remains supported.
### Removed fields
Remove from the tool schema:
- `agent`
- `tasks[].agent`
- `chain[].agent`
- `agentScope`
- `confirmProjectAgents`
### Model fields
Keep:
- required top-level `model`
- optional per-task/per-step `model`
- optional per-task/per-step `cwd`
Keep the current schema descriptions that explain models must come from the currently available models.
## Runtime design
### Tool execution
`src/tool.ts` should stop discovering agents entirely.
Instead, each run is built directly from the requested task item:
- resolve the effective model from `task.model ?? params.model`
- pass generic run metadata to the configured runner
- do not attach `systemPrompt`
- do not attach `tools`
Parallel and chain execution semantics stay the same apart from the simplified payload shape.
### Child wrapper
`src/wrapper/cli.mjs` remains a generic Pi launcher.
Keep:
- `pi --mode json --session ... --model ... <task>`
- effective model selection based on resolved model when present
- `PI_SUBAGENTS_CHILD=1` on every child run
- GitHub Copilot initiator behavior only for effective models under `github-copilot/*`
- best-effort artifact appends that must never block writing `result.json`
Remove:
- `--append-system-prompt`
- any dependence on agent-role metadata
### Runner behavior
Keep process runner and tmux runner behavior unchanged except for the smaller metadata payload.
No change to:
- process runner as default
- tmux as opt-in
- tmux requiring `tmux` on `PATH` only when the tmux runner is selected
- result monitoring and artifact paths
## Metadata and artifact design
Keep stable run bookkeeping fields such as:
- `runId`
- `mode`
- `taskIndex`
- `step`
- `task`
- `cwd`
- `requestedModel`
- `resolvedModel`
- `sessionPath`
- artifact paths
- exit status and stop-reason fields
Remove role-derived fields from runtime payloads and outputs:
- `agent`
- `agentSource`
- `systemPrompt`
- `systemPromptPath`
- any role-only transcript rendering
Transcript headers should stay generic and continue to show useful run metadata, but should no longer print `Agent: ...`.
Artifact directory structure should stay stable under `.pi/subagents/runs/<runId>/` unless a removed role-specific file is no longer needed.
## Extension registration behavior
Keep the current model-registration behavior exactly:
- do not register the tool when no models are available
- preserve original available-model order for schema enums
- dedupe registration using an order-insensitive lowercase-sorted copy for the cache key
- re-register when the model set changes
- do not re-register when the model set is the same in a different order
- if the first observed set is empty, a later non-empty set must still register
Keep the child-session guard in `index.ts`:
- provider override may still register first
- subagent tool registration must still be skipped when `PI_SUBAGENTS_CHILD=1`
## Prompt and documentation design
Keep the shipped workflow prompt templates, but rewrite them as generic orchestration helpers.
They must no longer reference:
- `scout`
- `planner`
- `reviewer`
- `worker`
Instead, they should instruct the parent agent to run generic subagents whose tasks describe the intended step, for example:
- inspect
- plan
- implement
- review
- revise
README and package-facing docs should describe:
- generic child Pi sessions
- the unchanged runner options
- the simplified subagent concept
They should stop describing specialized built-in roles or markdown agent discovery.
## File-level impact
Expected primary runtime changes:
- modify `src/schema.ts`
- modify `src/tool.ts`
- modify `src/artifacts.ts`
- modify `src/process-runner.ts`
- modify `src/wrapper/cli.mjs`
- modify `src/wrapper/render.mjs`
- modify `README.md`
- modify `prompts/*.md`
Expected removals if implementation chooses the full cleanup path:
- `src/builtin-agents.ts`
- `src/agents.ts`
- agent-discovery-specific tests
`index.ts`, `src/process-runner.ts`, and `src/tmux-runner.ts` should otherwise keep the smallest possible runner-specific changes.
## Testing plan
Implementation should follow TDD:
1. write or update the failing tests first
2. verify the failures are for the intended reason
3. implement the minimal runtime changes
4. re-run targeted tests
5. run full `npm test`
### Tests to preserve
Keep dedicated coverage for:
- every child run setting `PI_SUBAGENTS_CHILD=1`
- non-copilot models not receiving the copilot initiator flag
- the effective model using the resolved model when requested/resolved differ
- best-effort artifact logging never preventing `result.json` from being written
- no registration on empty model lists
- later registration on a non-empty model list
- no re-registration for the same model set in different orders
- re-registration when the model set changes
### Tests to rewrite
- `src/tool.test.ts`: generic task payloads, no named agents
- `src/wrapper/render.test.ts`: no `Agent: scout` assertion
- `src/artifacts.test.ts` and `src/process-runner.test.ts`: no role/system-prompt assumptions
- `src/prompts.test.ts`: prompts still ship but no longer rely on named roles
- README/package tests: documentation still accurate after the simplification
### Tests to remove if modules are removed
- `src/agents.test.ts`
## Risks and mitigations
### Risk: breaking downstream callers
Mitigation:
- treat the change as explicit API breakage
- update shipped prompts/docs/tests in the same change
- avoid a fake compatibility layer that hides the break
### Risk: over-editing runner code
Mitigation:
- keep changes focused on schema, tool execution, prompt handling, and artifact metadata
- preserve shared runner behavior unless role metadata is the only reason a field exists
### Risk: stale role wording in tests/docs
Mitigation:
- explicitly scrub `scout`, `planner`, `reviewer`, and `worker` from prompts and user-facing tests
- update transcript/header expectations to generic wording
## Acceptance criteria
The design is satisfied when:
- the `subagent` tool no longer requires or accepts agent names
- child runs are launched as normal Pi sessions with task/model/cwd only
- no custom system prompt is injected into child runs
- no agent discovery logic remains in the runtime path
- model validation and registration behavior still passes all existing regressions
- shipped prompts/docs describe generic subagents only
- tests no longer rely on the removed role names or prompt files for behavior