diff --git a/docs/superpowers/specs/2026-04-11-subagent-stop-reason-normalization-design.md b/docs/superpowers/specs/2026-04-11-subagent-stop-reason-normalization-design.md new file mode 100644 index 0000000..31012a3 --- /dev/null +++ b/docs/superpowers/specs/2026-04-11-subagent-stop-reason-normalization-design.md @@ -0,0 +1,170 @@ +# Subagent Stop-Reason Normalization Design + +**Date:** 2026-04-11 +**Package:** `pi-subagents` + +## Goal + +Fix regression where subagents now exit almost immediately after being called, while preserving the previous fix for lingering child processes that already reached real completion. + +## Root cause + +`src/wrapper/cli.mjs` currently starts semantic completion cleanup on any normalized assistant event with a `stopReason`. + +That assumption is wrong. + +Evidence gathered during debugging: + +1. `pi` emits `message_end` for assistant messages in general, not only for final overall completion. +2. Pi stop reasons include non-terminal tool-use states. +3. The wrapper currently treats those non-terminal assistant messages as terminal. +4. After the first assistant `message_end`, the wrapper waits ~250 ms and then sends `SIGTERM` / `SIGKILL`. +5. Real subagent runs therefore get killed before their tool work or later assistant turns complete. + +A local repro confirmed this with a fake `pi` binary that emitted: + +- an early assistant `message_end` with `stopReason: "toolUse"`, +- later tool/final events, +- and no immediate process exit. + +The wrapper exited early and wrote `result.json` with the first assistant text instead of the true final result. + +## Chosen approach + +Use **normalized semantic completion**. + +Instead of treating every raw `message_end.stopReason` as terminal, introduce a canonical stop-reason layer and make wrapper completion decisions from that semantic category. + +## Scope + +### Modify + +- `src/wrapper/normalize.mjs` +- `src/wrapper/normalize.test.ts` +- `src/wrapper/cli.mjs` +- `src/wrapper/cli.test.ts` + +### Do not modify + +- `src/monitor.ts` +- `src/process-runner.ts` +- `src/tmux-runner.ts` +- `src/tool.ts` +- `src/schema.ts` + +## Design + +### 1. Canonical stop reasons + +Normalize raw provider/session stop reasons into canonical values at the wrapper boundary. + +Expected canonical values: + +- `stop` — normal terminal completion +- `length` — terminal length limit +- `toolUse` — non-terminal assistant turn that is handing off to tools +- `aborted` — terminal abort +- `error` — terminal failure + +Representative raw mappings: + +- `toolUse`, `tool_use` -> `toolUse` +- `stop`, `end_turn`, `endTurn` -> `stop` +- `length` -> `length` +- `aborted` -> `aborted` +- `error` -> `error` + +Unknown values should be preserved conservatively instead of guessed away. The wrapper should not treat an unknown stop reason as definitely terminal unless explicitly mapped that way. + +### 2. Preserve raw reason for debugging + +Normalized assistant events should keep both: + +- canonical `stopReason` +- auxiliary `rawStopReason` + +This gives stable internal semantics without losing evidence when debugging provider-specific behavior. + +### 3. Terminal vs non-terminal assistant messages + +Wrapper semantic completion should start only for canonical **terminal** stop reasons: + +- `stop` +- `length` +- `aborted` +- `error` + +Wrapper semantic completion must **not** start for: + +- `toolUse` +- missing stop reason +- unknown unmapped stop reason + +This preserves the previous lingering-child fix while no longer killing subagents during normal tool orchestration. + +### 4. Result semantics + +`result.json` should use canonical `stopReason` so downstream logic sees stable values. + +`rawStopReason` should be included as auxiliary debug information when available. + +If a true terminal assistant message was already captured and the child lingers, the wrapper may still force cleanup and write a successful result. + +If the most recent assistant message was non-terminal (`toolUse`), forced cleanup must not begin. + +### 5. File responsibilities + +#### `src/wrapper/normalize.mjs` + +Owns translation from raw Pi JSON events into wrapper events with canonical stop-reason semantics. + +#### `src/wrapper/cli.mjs` + +Owns semantic completion policy: + +- consume normalized events, +- track latest terminal assistant result, +- trigger grace/cleanup only for terminal semantic stop reasons, +- preserve best-effort artifact behavior. + +#### `src/wrapper/*.test.ts` + +Own regression coverage for normalization and wrapper lifecycle behavior. + +## Testing strategy + +Follow TDD. + +### New failing tests + +1. **Normalization test** + - verifies raw stop reasons map to canonical values correctly + - verifies raw value is preserved for debugging + +2. **Early-exit regression test** + - fake `pi` emits early assistant `message_end` with `toolUse` / `tool_use` + - later emits tool/final events + - wrapper must not exit early + - final `result.json` must reflect the later true terminal completion + +### Existing tests to preserve + +- lingering final-message child still force-completes +- spawn failure still writes `result.json` +- artifact write failures still do not block `result.json` +- initiator/child environment tests stay green + +## Non-goals + +- switching the wrapper from JSON mode to RPC mode +- redesigning the monitor contract +- changing runner selection behavior +- broader tool/schema changes + +## Expected outcome + +After this change: + +- subagents will no longer be killed immediately on non-terminal tool-use assistant messages, +- lingering child processes that already reached real terminal completion will still be cleaned up, +- result artifacts will carry stable canonical stop-reason semantics plus raw debugging data. \ No newline at end of file