diff --git a/AGENTS.md b/AGENTS.md index d59bfb8..85ca368 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -21,6 +21,7 @@ Applies to entire `pi-subagents` package in this repo root. - `src/tool.ts` / `src/schema.ts` — tool contract and parameter/result types. - `src/wrapper/cli.mjs` — child-session wrapper that writes artifacts/results. - `src/*.test.ts`, `src/wrapper/*.test.ts` — regression tests. +- `docs/superpowers/specs/` — design docs created during brainstorming before implementation. - `docs/superpowers/plans/` — implementation plans and migration notes. ## Working rules diff --git a/docs/superpowers/specs/2026-04-11-subagent-wrapper-semantic-completion-design.md b/docs/superpowers/specs/2026-04-11-subagent-wrapper-semantic-completion-design.md new file mode 100644 index 0000000..4dba3b9 --- /dev/null +++ b/docs/superpowers/specs/2026-04-11-subagent-wrapper-semantic-completion-design.md @@ -0,0 +1,138 @@ +# Subagent Wrapper Semantic Completion Design + +**Date:** 2026-04-11 +**Package:** `pi-subagents` + +## Goal + +Fix bug where a subagent has already produced its final assistant output, but the parent tool never reports completion and the main agent stays stuck waiting. + +## Root cause + +Current completion depends on `result.json` being written by `src/wrapper/cli.mjs`. + +Shared failure mode across both runners: + +1. Child `pi` process emits a terminal assistant event (`message_end` → normalized `assistant_text` with `stopReason`). +2. Wrapper records the final text in `events.jsonl` / transcript output. +3. Wrapper still waits for the child process to exit before writing `result.json`. +4. If the child process lingers after terminal output, `result.json` is never written. +5. `src/monitor.ts` waits forever for `result.json`, so the tool never finishes. + +This is shared across process and tmux runners because both use the same wrapper and file-based monitor contract. + +## Chosen approach + +Implement **wrapper-level semantic completion** in `src/wrapper/cli.mjs`. + +When the wrapper observes a terminal normalized `assistant_text` event: + +- treat that event as semantic proof that the subagent conversation is complete, +- allow a short grace window for natural child-process exit, +- if the child still has not exited after the grace window, terminate it, +- preserve the captured final text / model / stop reason, +- write `result.json` and let the parent tool complete. + +## Scope + +### Modify + +- `src/wrapper/cli.mjs` +- `src/wrapper/cli.test.ts` +- `AGENTS.md` — document `docs/superpowers/specs/` + +### Do not modify + +- `src/tmux.ts` — keep tmux helpers only +- tool/schema registration behavior +- runner selection/config behavior +- model resolution behavior +- artifact file layout + +## Detailed behavior + +### 1. Completion trigger + +Terminal completion is detected only from the wrapper’s normalized event stream: + +- event type: `assistant_text` +- terminal signal: `stopReason` present + +This keeps the rule aligned with the existing wrapper event model instead of adding a second completion protocol. + +### 2. Grace window + +After terminal assistant output is observed, wait a short internal grace period (target: about 250 ms) for the child `pi` process to exit on its own. + +Reasoning: + +- normal runs should still exit naturally, +- short lag after final output is acceptable, +- lingering processes should not block the parent forever. + +### 3. Forced cleanup + +If the child is still alive after the grace window: + +1. send `SIGTERM`, +2. wait a second short cleanup window, +3. escalate to `SIGKILL` only if still required. + +This cleanup is wrapper-internal and applies equally regardless of whether the top-level runner is process or tmux. + +### 4. Result semantics + +If terminal assistant output was already captured before cleanup: + +- semantic outcome comes from the captured terminal event, not from the cleanup signal, +- `finalText` comes from captured assistant output, +- `resolvedModel` stays based on captured model/event data, +- `stopReason` stays the terminal assistant stop reason, +- forced cleanup after semantic completion must **not** downgrade a normal completion into an error. + +This means: + +- normal terminal completion stays successful even if cleanup was needed afterward, +- terminal error/aborted outcomes stay error/aborted instead of being rewritten. + +If terminal assistant output was **not** captured, existing error behavior remains unchanged. + +### 5. Existing guarantees to preserve + +- best-effort artifact appends must still never block final `result.json` writing, +- child runs must still set `PI_SUBAGENTS_CHILD=1`, +- github-copilot initiator behavior must stay unchanged, +- no tmux-specific logic should leak into non-tmux helper files. + +## Testing strategy + +Follow TDD. + +### First failing regression test + +Add a wrapper regression test that: + +1. creates a fake `pi` executable, +2. emits one terminal `message_end` JSON event with final text, +3. intentionally stays alive instead of exiting, +4. verifies the wrapper exits promptly, +5. verifies `result.json` is still written, +6. verifies a normal terminal completion still produces a successful result with preserved final text. + +### Verification after implementation + +Run at least: + +- targeted wrapper test file +- full package test suite (`npm test`) + +## Non-goals + +- redesigning `src/monitor.ts` +- changing runner contracts or schema types +- adding user-configurable timeout settings +- broad refactors outside the wrapper regression fix + +## Expected outcome + +After the change, once a subagent has emitted its terminal assistant output, the parent tool will complete even if the spawned child `pi` process lingers instead of exiting promptly. This removes the stuck-tool behavior for both runner modes while keeping the fix on the smallest shared surface. \ No newline at end of file