# Subagent Stop-Reason Normalization Design

**Date:** 2026-04-11  
**Package:** `pi-subagents`

## Goal

Fix regression where subagents now exit almost immediately after being called, while preserving the previous fix for lingering child processes that already reached real completion.

## Root cause

`src/wrapper/cli.mjs` currently starts semantic completion cleanup on any normalized assistant event with a `stopReason`.

That assumption is wrong.

Evidence gathered during debugging:

1. `pi` emits `message_end` for assistant messages in general, not only for final overall completion.
2. Pi stop reasons include non-terminal tool-use states.
3. The wrapper currently treats those non-terminal assistant messages as terminal.
4. After the first assistant `message_end`, the wrapper waits ~250 ms and then sends `SIGTERM` / `SIGKILL`.
5. Real subagent runs therefore get killed before their tool work or later assistant turns complete.

A local repro confirmed this with a fake `pi` binary that emitted:

- an early assistant `message_end` with `stopReason: "toolUse"`,
- later tool/final events,
- and no immediate process exit.

The wrapper exited early and wrote `result.json` with the first assistant text instead of the true final result.

## Chosen approach

Use **normalized semantic completion**.

Instead of treating every raw `message_end.stopReason` as terminal, introduce a canonical stop-reason layer and make wrapper completion decisions from that semantic category.

## Scope

### Modify

- `src/wrapper/normalize.mjs`
- `src/wrapper/normalize.test.ts`
- `src/wrapper/cli.mjs`
- `src/wrapper/cli.test.ts`

### Do not modify

- `src/monitor.ts`
- `src/process-runner.ts`
- `src/tmux-runner.ts`
- `src/tool.ts`
- `src/schema.ts`

## Design

### 1. Canonical stop reasons

Normalize raw provider/session stop reasons into canonical values at the wrapper boundary.

Expected canonical values:

- `stop` — normal terminal completion
- `length` — terminal length limit
- `toolUse` — non-terminal assistant turn that is handing off to tools
- `aborted` — terminal abort
- `error` — terminal failure

Representative raw mappings:

- `toolUse`, `tool_use` -> `toolUse`
- `stop`, `end_turn`, `endTurn` -> `stop`
- `length` -> `length`
- `aborted` -> `aborted`
- `error` -> `error`

Unknown values should be preserved conservatively instead of guessed away. The wrapper should not treat an unknown stop reason as definitely terminal unless explicitly mapped that way.

### 2. Preserve raw reason for debugging

Normalized assistant events should keep both:

- canonical `stopReason`
- auxiliary `rawStopReason`

This gives stable internal semantics without losing evidence when debugging provider-specific behavior.

### 3. Terminal vs non-terminal assistant messages

Wrapper semantic completion should start only for canonical **terminal** stop reasons:

- `stop`
- `length`
- `aborted`
- `error`

Wrapper semantic completion must **not** start for:

- `toolUse`
- missing stop reason
- unknown unmapped stop reason

This preserves the previous lingering-child fix while no longer killing subagents during normal tool orchestration.

### 4. Result semantics

`result.json` should use canonical `stopReason` so downstream logic sees stable values.

`rawStopReason` should be included as auxiliary debug information when available.

If a true terminal assistant message was already captured and the child lingers, the wrapper may still force cleanup and write a successful result.

If the most recent assistant message was non-terminal (`toolUse`), forced cleanup must not begin.

### 5. File responsibilities

#### `src/wrapper/normalize.mjs`

Owns translation from raw Pi JSON events into wrapper events with canonical stop-reason semantics.

#### `src/wrapper/cli.mjs`

Owns semantic completion policy:

- consume normalized events,
- track latest terminal assistant result,
- trigger grace/cleanup only for terminal semantic stop reasons,
- preserve best-effort artifact behavior.

#### `src/wrapper/*.test.ts`

Own regression coverage for normalization and wrapper lifecycle behavior.

## Testing strategy

Follow TDD.

### New failing tests

1. **Normalization test**
   - verifies raw stop reasons map to canonical values correctly
   - verifies raw value is preserved for debugging

2. **Early-exit regression test**
   - fake `pi` emits early assistant `message_end` with `toolUse` / `tool_use`
   - later emits tool/final events
   - wrapper must not exit early
   - final `result.json` must reflect the later true terminal completion

### Existing tests to preserve

- lingering final-message child still force-completes
- spawn failure still writes `result.json`
- artifact write failures still do not block `result.json`
- initiator/child environment tests stay green

## Non-goals

- switching the wrapper from JSON mode to RPC mode
- redesigning the monitor contract
- changing runner selection behavior
- broader tool/schema changes

## Expected outcome

After this change:

- subagents will no longer be killed immediately on non-terminal tool-use assistant messages,
- lingering child processes that already reached real terminal completion will still be cleaned up,
- result artifacts will carry stable canonical stop-reason semantics plus raw debugging data.