docs: add generic subagents design spec
This commit is contained in:
286
docs/superpowers/specs/2026-04-12-generic-subagents-design.md
Normal file
286
docs/superpowers/specs/2026-04-12-generic-subagents-design.md
Normal file
@@ -0,0 +1,286 @@
|
||||
# Generic subagents design
|
||||
|
||||
Date: 2026-04-12
|
||||
Status: approved for planning
|
||||
|
||||
## Summary
|
||||
|
||||
Simplify `pi-subagents` from named, specialized child agents into a single generic `subagent` concept.
|
||||
|
||||
A subagent run should be a normal Pi child session that receives:
|
||||
- a delegated `task`
|
||||
- a selected `model`
|
||||
- an optional `cwd`
|
||||
|
||||
It should **not** receive a role-specific system prompt, role-specific tool restrictions, built-in role behavior, or markdown-discovered agent definitions.
|
||||
|
||||
This is an intentional breaking API cleanup, not a compatibility shim.
|
||||
|
||||
## Current state
|
||||
|
||||
Today the package supports:
|
||||
- built-in named roles in `src/builtin-agents.ts` (`scout`, `planner`, `reviewer`, `worker`)
|
||||
- optional markdown-discovered user/project agents in `src/agents.ts`
|
||||
- role-derived `systemPrompt`, `tools`, and optional default `model`
|
||||
- prompt templates in `prompts/` that explicitly chain those named roles
|
||||
|
||||
That surface no longer matches the desired product behavior.
|
||||
|
||||
## Goals
|
||||
|
||||
- Make every child run a generic Pi agent.
|
||||
- Remove named/specialized agent concepts from the package API and runtime.
|
||||
- Preserve the existing runner, wrapper, model-registration, and artifact mechanics where they are still valid.
|
||||
- Keep parallel and chain execution modes.
|
||||
- Keep explicit model selection and validation.
|
||||
|
||||
## Non-goals
|
||||
|
||||
- No compatibility layer that silently accepts old role names.
|
||||
- No replacement specialization layer.
|
||||
- No change to process-vs-tmux runner selection.
|
||||
- No change to package identity or env names (`PI_SUBAGENTS_*`).
|
||||
|
||||
## Chosen approach
|
||||
|
||||
Use a hard simplification:
|
||||
- remove named agent definitions entirely
|
||||
- remove markdown agent discovery entirely
|
||||
- make child runs task/model/cwd only
|
||||
- rewrite workflow prompts so they describe generic multi-step delegation rather than named roles
|
||||
|
||||
This matches the intended behavior exactly and removes misleading concepts instead of preserving dead abstractions.
|
||||
|
||||
## Public API design
|
||||
|
||||
### Tool modes
|
||||
|
||||
#### Single mode
|
||||
|
||||
Required:
|
||||
- `task`
|
||||
- `model`
|
||||
|
||||
Optional:
|
||||
- `cwd`
|
||||
|
||||
#### Parallel mode
|
||||
|
||||
Required:
|
||||
- top-level `model`
|
||||
- `tasks: Array<{ task: string; model?: string; cwd?: string }>`
|
||||
|
||||
The top-level `model` remains the default for tasks that do not specify `model`.
|
||||
|
||||
#### Chain mode
|
||||
|
||||
Required:
|
||||
- top-level `model`
|
||||
- `chain: Array<{ task: string; model?: string; cwd?: string }>`
|
||||
|
||||
`{previous}` substitution remains supported.
|
||||
|
||||
### Removed fields
|
||||
|
||||
Remove from the tool schema:
|
||||
- `agent`
|
||||
- `tasks[].agent`
|
||||
- `chain[].agent`
|
||||
- `agentScope`
|
||||
- `confirmProjectAgents`
|
||||
|
||||
### Model fields
|
||||
|
||||
Keep:
|
||||
- required top-level `model`
|
||||
- optional per-task/per-step `model`
|
||||
- optional per-task/per-step `cwd`
|
||||
|
||||
Keep the current schema descriptions that explain models must come from the currently available models.
|
||||
|
||||
## Runtime design
|
||||
|
||||
### Tool execution
|
||||
|
||||
`src/tool.ts` should stop discovering agents entirely.
|
||||
|
||||
Instead, each run is built directly from the requested task item:
|
||||
- resolve the effective model from `task.model ?? params.model`
|
||||
- pass generic run metadata to the configured runner
|
||||
- do not attach `systemPrompt`
|
||||
- do not attach `tools`
|
||||
|
||||
Parallel and chain execution semantics stay the same apart from the simplified payload shape.
|
||||
|
||||
### Child wrapper
|
||||
|
||||
`src/wrapper/cli.mjs` remains a generic Pi launcher.
|
||||
|
||||
Keep:
|
||||
- `pi --mode json --session ... --model ... <task>`
|
||||
- effective model selection based on resolved model when present
|
||||
- `PI_SUBAGENTS_CHILD=1` on every child run
|
||||
- GitHub Copilot initiator behavior only for effective models under `github-copilot/*`
|
||||
- best-effort artifact appends that must never block writing `result.json`
|
||||
|
||||
Remove:
|
||||
- `--append-system-prompt`
|
||||
- any dependence on agent-role metadata
|
||||
|
||||
### Runner behavior
|
||||
|
||||
Keep process runner and tmux runner behavior unchanged except for the smaller metadata payload.
|
||||
|
||||
No change to:
|
||||
- process runner as default
|
||||
- tmux as opt-in
|
||||
- tmux requiring `tmux` on `PATH` only when the tmux runner is selected
|
||||
- result monitoring and artifact paths
|
||||
|
||||
## Metadata and artifact design
|
||||
|
||||
Keep stable run bookkeeping fields such as:
|
||||
- `runId`
|
||||
- `mode`
|
||||
- `taskIndex`
|
||||
- `step`
|
||||
- `task`
|
||||
- `cwd`
|
||||
- `requestedModel`
|
||||
- `resolvedModel`
|
||||
- `sessionPath`
|
||||
- artifact paths
|
||||
- exit status and stop-reason fields
|
||||
|
||||
Remove role-derived fields from runtime payloads and outputs:
|
||||
- `agent`
|
||||
- `agentSource`
|
||||
- `systemPrompt`
|
||||
- `systemPromptPath`
|
||||
- any role-only transcript rendering
|
||||
|
||||
Transcript headers should stay generic and continue to show useful run metadata, but should no longer print `Agent: ...`.
|
||||
|
||||
Artifact directory structure should stay stable under `.pi/subagents/runs/<runId>/` unless a removed role-specific file is no longer needed.
|
||||
|
||||
## Extension registration behavior
|
||||
|
||||
Keep the current model-registration behavior exactly:
|
||||
- do not register the tool when no models are available
|
||||
- preserve original available-model order for schema enums
|
||||
- dedupe registration using an order-insensitive lowercase-sorted copy for the cache key
|
||||
- re-register when the model set changes
|
||||
- do not re-register when the model set is the same in a different order
|
||||
- if the first observed set is empty, a later non-empty set must still register
|
||||
|
||||
Keep the child-session guard in `index.ts`:
|
||||
- provider override may still register first
|
||||
- subagent tool registration must still be skipped when `PI_SUBAGENTS_CHILD=1`
|
||||
|
||||
## Prompt and documentation design
|
||||
|
||||
Keep the shipped workflow prompt templates, but rewrite them as generic orchestration helpers.
|
||||
|
||||
They must no longer reference:
|
||||
- `scout`
|
||||
- `planner`
|
||||
- `reviewer`
|
||||
- `worker`
|
||||
|
||||
Instead, they should instruct the parent agent to run generic subagents whose tasks describe the intended step, for example:
|
||||
- inspect
|
||||
- plan
|
||||
- implement
|
||||
- review
|
||||
- revise
|
||||
|
||||
README and package-facing docs should describe:
|
||||
- generic child Pi sessions
|
||||
- the unchanged runner options
|
||||
- the simplified subagent concept
|
||||
|
||||
They should stop describing specialized built-in roles or markdown agent discovery.
|
||||
|
||||
## File-level impact
|
||||
|
||||
Expected primary runtime changes:
|
||||
- modify `src/schema.ts`
|
||||
- modify `src/tool.ts`
|
||||
- modify `src/artifacts.ts`
|
||||
- modify `src/process-runner.ts`
|
||||
- modify `src/wrapper/cli.mjs`
|
||||
- modify `src/wrapper/render.mjs`
|
||||
- modify `README.md`
|
||||
- modify `prompts/*.md`
|
||||
|
||||
Expected removals if implementation chooses the full cleanup path:
|
||||
- `src/builtin-agents.ts`
|
||||
- `src/agents.ts`
|
||||
- agent-discovery-specific tests
|
||||
|
||||
`index.ts`, `src/process-runner.ts`, and `src/tmux-runner.ts` should otherwise keep the smallest possible runner-specific changes.
|
||||
|
||||
## Testing plan
|
||||
|
||||
Implementation should follow TDD:
|
||||
1. write or update the failing tests first
|
||||
2. verify the failures are for the intended reason
|
||||
3. implement the minimal runtime changes
|
||||
4. re-run targeted tests
|
||||
5. run full `npm test`
|
||||
|
||||
### Tests to preserve
|
||||
|
||||
Keep dedicated coverage for:
|
||||
- every child run setting `PI_SUBAGENTS_CHILD=1`
|
||||
- non-copilot models not receiving the copilot initiator flag
|
||||
- the effective model using the resolved model when requested/resolved differ
|
||||
- best-effort artifact logging never preventing `result.json` from being written
|
||||
- no registration on empty model lists
|
||||
- later registration on a non-empty model list
|
||||
- no re-registration for the same model set in different orders
|
||||
- re-registration when the model set changes
|
||||
|
||||
### Tests to rewrite
|
||||
|
||||
- `src/tool.test.ts`: generic task payloads, no named agents
|
||||
- `src/wrapper/render.test.ts`: no `Agent: scout` assertion
|
||||
- `src/artifacts.test.ts` and `src/process-runner.test.ts`: no role/system-prompt assumptions
|
||||
- `src/prompts.test.ts`: prompts still ship but no longer rely on named roles
|
||||
- README/package tests: documentation still accurate after the simplification
|
||||
|
||||
### Tests to remove if modules are removed
|
||||
|
||||
- `src/agents.test.ts`
|
||||
|
||||
## Risks and mitigations
|
||||
|
||||
### Risk: breaking downstream callers
|
||||
|
||||
Mitigation:
|
||||
- treat the change as explicit API breakage
|
||||
- update shipped prompts/docs/tests in the same change
|
||||
- avoid a fake compatibility layer that hides the break
|
||||
|
||||
### Risk: over-editing runner code
|
||||
|
||||
Mitigation:
|
||||
- keep changes focused on schema, tool execution, prompt handling, and artifact metadata
|
||||
- preserve shared runner behavior unless role metadata is the only reason a field exists
|
||||
|
||||
### Risk: stale role wording in tests/docs
|
||||
|
||||
Mitigation:
|
||||
- explicitly scrub `scout`, `planner`, `reviewer`, and `worker` from prompts and user-facing tests
|
||||
- update transcript/header expectations to generic wording
|
||||
|
||||
## Acceptance criteria
|
||||
|
||||
The design is satisfied when:
|
||||
- the `subagent` tool no longer requires or accepts agent names
|
||||
- child runs are launched as normal Pi sessions with task/model/cwd only
|
||||
- no custom system prompt is injected into child runs
|
||||
- no agent discovery logic remains in the runtime path
|
||||
- model validation and registration behavior still passes all existing regressions
|
||||
- shipped prompts/docs describe generic subagents only
|
||||
- tests no longer rely on the removed role names or prompt files for behavior
|
||||
Reference in New Issue
Block a user