docs: add stop-reason normalization design spec

2026-04-11 01:40:14 +01:00
parent 620fb66358
commit 1d426e82f2
1 changed files with 170 additions and 0 deletions
--- a/docs/superpowers/specs/2026-04-11-subagent-stop-reason-normalization-design.md
+++ b/docs/superpowers/specs/2026-04-11-subagent-stop-reason-normalization-design.md
@@ -0,0 +1,170 @@
+# Subagent Stop-Reason Normalization Design
+
+**Date:** 2026-04-11  
+**Package:** `pi-subagents`
+
+## Goal
+
+Fix regression where subagents now exit almost immediately after being called, while preserving the previous fix for lingering child processes that already reached real completion.
+
+## Root cause
+
+`src/wrapper/cli.mjs` currently starts semantic completion cleanup on any normalized assistant event with a `stopReason`.
+
+That assumption is wrong.
+
+Evidence gathered during debugging:
+
+1. `pi` emits `message_end` for assistant messages in general, not only for final overall completion.
+2. Pi stop reasons include non-terminal tool-use states.
+3. The wrapper currently treats those non-terminal assistant messages as terminal.
+4. After the first assistant `message_end`, the wrapper waits ~250 ms and then sends `SIGTERM` / `SIGKILL`.
+5. Real subagent runs therefore get killed before their tool work or later assistant turns complete.
+
+A local repro confirmed this with a fake `pi` binary that emitted:
+
+- an early assistant `message_end` with `stopReason: "toolUse"`,
+- later tool/final events,
+- and no immediate process exit.
+
+The wrapper exited early and wrote `result.json` with the first assistant text instead of the true final result.
+
+## Chosen approach
+
+Use **normalized semantic completion**.
+
+Instead of treating every raw `message_end.stopReason` as terminal, introduce a canonical stop-reason layer and make wrapper completion decisions from that semantic category.
+
+## Scope
+
+### Modify
+
+- `src/wrapper/normalize.mjs`
+- `src/wrapper/normalize.test.ts`
+- `src/wrapper/cli.mjs`
+- `src/wrapper/cli.test.ts`
+
+### Do not modify
+
+- `src/monitor.ts`
+- `src/process-runner.ts`
+- `src/tmux-runner.ts`
+- `src/tool.ts`
+- `src/schema.ts`
+
+## Design
+
+### 1. Canonical stop reasons
+
+Normalize raw provider/session stop reasons into canonical values at the wrapper boundary.
+
+Expected canonical values:
+
+- `stop` — normal terminal completion
+- `length` — terminal length limit
+- `toolUse` — non-terminal assistant turn that is handing off to tools
+- `aborted` — terminal abort
+- `error` — terminal failure
+
+Representative raw mappings:
+
+- `toolUse`, `tool_use` -> `toolUse`
+- `stop`, `end_turn`, `endTurn` -> `stop`
+- `length` -> `length`
+- `aborted` -> `aborted`
+- `error` -> `error`
+
+Unknown values should be preserved conservatively instead of guessed away. The wrapper should not treat an unknown stop reason as definitely terminal unless explicitly mapped that way.
+
+### 2. Preserve raw reason for debugging
+
+Normalized assistant events should keep both:
+
+- canonical `stopReason`
+- auxiliary `rawStopReason`
+
+This gives stable internal semantics without losing evidence when debugging provider-specific behavior.
+
+### 3. Terminal vs non-terminal assistant messages
+
+Wrapper semantic completion should start only for canonical **terminal** stop reasons:
+
+- `stop`
+- `length`
+- `aborted`
+- `error`
+
+Wrapper semantic completion must **not** start for:
+
+- `toolUse`
+- missing stop reason
+- unknown unmapped stop reason
+
+This preserves the previous lingering-child fix while no longer killing subagents during normal tool orchestration.
+
+### 4. Result semantics
+
+`result.json` should use canonical `stopReason` so downstream logic sees stable values.
+
+`rawStopReason` should be included as auxiliary debug information when available.
+
+If a true terminal assistant message was already captured and the child lingers, the wrapper may still force cleanup and write a successful result.
+
+If the most recent assistant message was non-terminal (`toolUse`), forced cleanup must not begin.
+
+### 5. File responsibilities
+
+#### `src/wrapper/normalize.mjs`
+
+Owns translation from raw Pi JSON events into wrapper events with canonical stop-reason semantics.
+
+#### `src/wrapper/cli.mjs`
+
+Owns semantic completion policy:
+
+- consume normalized events,
+- track latest terminal assistant result,
+- trigger grace/cleanup only for terminal semantic stop reasons,
+- preserve best-effort artifact behavior.
+
+#### `src/wrapper/*.test.ts`
+
+Own regression coverage for normalization and wrapper lifecycle behavior.
+
+## Testing strategy
+
+Follow TDD.
+
+### New failing tests
+
+1. **Normalization test**
+   - verifies raw stop reasons map to canonical values correctly
+   - verifies raw value is preserved for debugging
+
+2. **Early-exit regression test**
+   - fake `pi` emits early assistant `message_end` with `toolUse` / `tool_use`
+   - later emits tool/final events
+   - wrapper must not exit early
+   - final `result.json` must reflect the later true terminal completion
+
+### Existing tests to preserve
+
+- lingering final-message child still force-completes
+- spawn failure still writes `result.json`
+- artifact write failures still do not block `result.json`
+- initiator/child environment tests stay green
+
+## Non-goals
+
+- switching the wrapper from JSON mode to RPC mode
+- redesigning the monitor contract
+- changing runner selection behavior
+- broader tool/schema changes
+
+## Expected outcome
+
+After this change:
+
+- subagents will no longer be killed immediately on non-terminal tool-use assistant messages,
+- lingering child processes that already reached real terminal completion will still be cleaned up,
+- result artifacts will carry stable canonical stop-reason semantics plus raw debugging data.