Files
pi-subagents/docs/superpowers/specs/2026-04-11-subagent-wrapper-semantic-completion-design.md

138 lines
4.7 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Subagent Wrapper Semantic Completion Design
**Date:** 2026-04-11
**Package:** `pi-subagents`
## Goal
Fix bug where a subagent has already produced its final assistant output, but the parent tool never reports completion and the main agent stays stuck waiting.
## Root cause
Current completion depends on `result.json` being written by `src/wrapper/cli.mjs`.
Shared failure mode across both runners:
1. Child `pi` process emits a terminal assistant event (`message_end` → normalized `assistant_text` with `stopReason`).
2. Wrapper records the final text in `events.jsonl` / transcript output.
3. Wrapper still waits for the child process to exit before writing `result.json`.
4. If the child process lingers after terminal output, `result.json` is never written.
5. `src/monitor.ts` waits forever for `result.json`, so the tool never finishes.
This is shared across process and tmux runners because both use the same wrapper and file-based monitor contract.
## Chosen approach
Implement **wrapper-level semantic completion** in `src/wrapper/cli.mjs`.
When the wrapper observes a terminal normalized `assistant_text` event:
- treat that event as semantic proof that the subagent conversation is complete,
- allow a short grace window for natural child-process exit,
- if the child still has not exited after the grace window, terminate it,
- preserve the captured final text / model / stop reason,
- write `result.json` and let the parent tool complete.
## Scope
### Modify
- `src/wrapper/cli.mjs`
- `src/wrapper/cli.test.ts`
- `AGENTS.md` — document `docs/superpowers/specs/`
### Do not modify
- `src/tmux.ts` — keep tmux helpers only
- tool/schema registration behavior
- runner selection/config behavior
- model resolution behavior
- artifact file layout
## Detailed behavior
### 1. Completion trigger
Terminal completion is detected only from the wrappers normalized event stream:
- event type: `assistant_text`
- terminal signal: `stopReason` present
This keeps the rule aligned with the existing wrapper event model instead of adding a second completion protocol.
### 2. Grace window
After terminal assistant output is observed, wait a short internal grace period (target: about 250 ms) for the child `pi` process to exit on its own.
Reasoning:
- normal runs should still exit naturally,
- short lag after final output is acceptable,
- lingering processes should not block the parent forever.
### 3. Forced cleanup
If the child is still alive after the grace window:
1. send `SIGTERM`,
2. wait a second short cleanup window,
3. escalate to `SIGKILL` only if still required.
This cleanup is wrapper-internal and applies equally regardless of whether the top-level runner is process or tmux.
### 4. Result semantics
If terminal assistant output was already captured before cleanup:
- semantic outcome comes from the captured terminal event, not from the cleanup signal,
- `finalText` comes from captured assistant output,
- `resolvedModel` stays based on captured model/event data,
- `stopReason` stays the terminal assistant stop reason,
- forced cleanup after semantic completion must **not** downgrade a normal completion into an error.
This means:
- normal terminal completion stays successful even if cleanup was needed afterward,
- terminal error/aborted outcomes stay error/aborted instead of being rewritten.
If terminal assistant output was **not** captured, existing error behavior remains unchanged.
### 5. Existing guarantees to preserve
- best-effort artifact appends must still never block final `result.json` writing,
- child runs must still set `PI_SUBAGENTS_CHILD=1`,
- github-copilot initiator behavior must stay unchanged,
- no tmux-specific logic should leak into non-tmux helper files.
## Testing strategy
Follow TDD.
### First failing regression test
Add a wrapper regression test that:
1. creates a fake `pi` executable,
2. emits one terminal `message_end` JSON event with final text,
3. intentionally stays alive instead of exiting,
4. verifies the wrapper exits promptly,
5. verifies `result.json` is still written,
6. verifies a normal terminal completion still produces a successful result with preserved final text.
### Verification after implementation
Run at least:
- targeted wrapper test file
- full package test suite (`npm test`)
## Non-goals
- redesigning `src/monitor.ts`
- changing runner contracts or schema types
- adding user-configurable timeout settings
- broad refactors outside the wrapper regression fix
## Expected outcome
After the change, once a subagent has emitted its terminal assistant output, the parent tool will complete even if the spawned child `pi` process lingers instead of exiting promptly. This removes the stuck-tool behavior for both runner modes while keeping the fix on the smallest shared surface.