diff --git a/docs/superpowers/plans/2026-04-12-subagent-presets-and-background-agents.md b/docs/superpowers/plans/2026-04-12-subagent-presets-and-background-agents.md new file mode 100644 index 0000000..990e28f --- /dev/null +++ b/docs/superpowers/plans/2026-04-12-subagent-presets-and-background-agents.md @@ -0,0 +1,1298 @@ +# Subagent Presets and Background Agents Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Add markdown-discovered subagent presets, remove `chain`, add `background_agent` plus `background_agent_status`, and restore preset-owned prompt/tool wiring while preserving the existing runner and artifact guarantees. + +**Architecture:** Keep the current foreground runner/wrapper pipeline, but insert preset discovery before each run so foreground and background launches share one metadata shape. Add a small background-run registry in the extension layer that persists launch/update entries to the session, rebuilds state on reload, watches detached process results, updates footer counts, and emits completion notifications without auto-triggering a new turn. + +**Tech Stack:** TypeScript, Node.js, `node:test`, `@sinclair/typebox`, `@mariozechner/pi-coding-agent`, wrapper scripts in `src/wrapper/` + +--- + +## File structure and responsibilities + +- `src/presets.ts` — discover and parse named preset markdown files from global and project directories +- `src/presets.test.ts` — preset discovery and override behavior +- `src/schema.ts` — `subagent` schema; remove `chain`, require `preset` +- `src/background-schema.ts` — schemas and types for `background_agent` and `background_agent_status` +- `src/models.ts` — resolve effective model from call override or preset default +- `src/tool.ts` — preset-aware foreground `subagent` execution for single and parallel modes +- `src/background-registry.ts` — background run state, counts, persistence payload shapes, reload reconstruction helpers +- `src/background-registry.test.ts` — registry counts, replay/rebuild, completion transitions +- `src/background-tool.ts` — detached process launch, registry registration, immediate handle response +- `src/background-tool.test.ts` — detached launch behavior and return shape +- `src/background-status-tool.ts` — poll one/all background runs and format counts +- `src/background-status-tool.test.ts` — polling behavior and filtering +- `src/process-runner.ts` — shared process launch primitive for waited vs detached runs +- `src/artifacts.ts` — restore `system-prompt.md` and serialize preset metadata into `meta.json` +- `src/wrapper/cli.mjs` — restore `--append-system-prompt` and `--tools` support +- `index.ts` — register all tools, bootstrap registry, rebuild watchers, footer updates, notifications +- `README.md` / `prompts/*.md` / `AGENTS.md` — package-facing docs and workflow prompt updates + +--- + +### Task 1: Add preset discovery and effective-model resolution + +**Files:** +- Create: `src/presets.ts` +- Create: `src/presets.test.ts` +- Modify: `src/models.ts` +- Modify: `src/models.test.ts` +- Test: `src/presets.test.ts` +- Test: `src/models.test.ts` + +- [ ] **Step 1: Write the failing preset-discovery tests** + +`src/presets.test.ts` + +```ts +import test from "node:test"; +import assert from "node:assert/strict"; +import { mkdir, mkdtemp, writeFile } from "node:fs/promises"; +import { join } from "node:path"; +import { tmpdir } from "node:os"; +import { discoverSubagentPresets } from "./presets.ts"; + +test("discoverSubagentPresets loads global presets and lets nearest project presets override by name", async () => { + const root = await mkdtemp(join(tmpdir(), "pi-subagents-presets-")); + const homeDir = join(root, "home"); + const repo = join(root, "repo", "apps", "web"); + const globalDir = join(homeDir, ".pi", "agent", "subagents"); + const projectDir = join(root, "repo", ".pi", "subagents"); + + await mkdir(globalDir, { recursive: true }); + await mkdir(projectDir, { recursive: true }); + await mkdir(repo, { recursive: true }); + + await writeFile( + join(globalDir, "reviewer.md"), + `---\nname: reviewer\ndescription: Global reviewer\nmodel: openai/gpt-5\ntools: read,grep\n---\nGlobal prompt`, + "utf8", + ); + await writeFile( + join(projectDir, "reviewer.md"), + `---\nname: reviewer\ndescription: Project reviewer\nmodel: anthropic/claude-sonnet-4-5\n---\nProject prompt`, + "utf8", + ); + + const result = discoverSubagentPresets(repo, { homeDir }); + const reviewer = result.presets.find((preset) => preset.name === "reviewer"); + + assert.equal(reviewer?.source, "project"); + assert.equal(reviewer?.description, "Project reviewer"); + assert.equal(reviewer?.model, "anthropic/claude-sonnet-4-5"); + assert.equal(reviewer?.tools, undefined); + assert.equal(reviewer?.systemPrompt, "Project prompt"); + assert.match(result.projectPresetsDir ?? "", /\.pi\/subagents$/); +}); + +test("discoverSubagentPresets ignores markdown files without required frontmatter", async () => { + const root = await mkdtemp(join(tmpdir(), "pi-subagents-presets-")); + const homeDir = join(root, "home"); + const repo = join(root, "repo"); + const globalDir = join(homeDir, ".pi", "agent", "subagents"); + + await mkdir(globalDir, { recursive: true }); + await mkdir(repo, { recursive: true }); + await writeFile(join(globalDir, "broken.md"), `---\nname: broken\n---\nMissing description`, "utf8"); + + const result = discoverSubagentPresets(repo, { homeDir }); + assert.deepEqual(result.presets, []); +}); +``` + +`src/models.test.ts` + +```ts +test("resolveChildModel prefers explicit call model over preset default model", () => { + const selection = resolveChildModel({ + callModel: "openai/gpt-5", + presetModel: "anthropic/claude-sonnet-4-5", + }); + + assert.equal(selection.requestedModel, "openai/gpt-5"); + assert.equal(selection.resolvedModel, "openai/gpt-5"); +}); + +test("resolveChildModel falls back to preset default model", () => { + const selection = resolveChildModel({ + callModel: undefined, + presetModel: "anthropic/claude-sonnet-4-5", + }); + + assert.equal(selection.requestedModel, "anthropic/claude-sonnet-4-5"); + assert.equal(selection.resolvedModel, "anthropic/claude-sonnet-4-5"); +}); +``` + +- [ ] **Step 2: Run tests to verify they fail for the intended reason** + +Run: + +```bash +npx tsx --test src/presets.test.ts src/models.test.ts +``` + +Expected: +- FAIL because `discoverSubagentPresets` does not exist yet +- FAIL because `resolveChildModel` still expects `topLevelModel` + +- [ ] **Step 3: Write the minimal preset-discovery and model-resolution code** + +`src/presets.ts` + +```ts +import { existsSync, readdirSync, readFileSync, statSync, type Dirent } from "node:fs"; +import { dirname, join } from "node:path"; +import { getAgentDir, parseFrontmatter } from "@mariozechner/pi-coding-agent"; + +export interface SubagentPreset { + name: string; + description: string; + model?: string; + tools?: string[]; + systemPrompt: string; + source: "global" | "project"; + filePath: string; +} + +export interface SubagentPresetDiscoveryResult { + presets: SubagentPreset[]; + projectPresetsDir: string | null; +} + +function findNearestProjectPresetsDir(cwd: string): string | null { + let current = cwd; + while (true) { + const candidate = join(current, ".pi", "subagents"); + try { + if (statSync(candidate).isDirectory()) return candidate; + } catch {} + const parent = dirname(current); + if (parent === current) return null; + current = parent; + } +} + +function loadPresetDir(dir: string, source: "global" | "project"): SubagentPreset[] { + if (!existsSync(dir)) return []; + const entries = readdirSync(dir, { withFileTypes: true }); + const presets: SubagentPreset[] = []; + + for (const entry of entries) { + if (!entry.name.endsWith(".md")) continue; + if (!entry.isFile() && !entry.isSymbolicLink()) continue; + const filePath = join(dir, entry.name); + const content = readFileSync(filePath, "utf8"); + const { frontmatter, body } = parseFrontmatter>(content); + if (typeof frontmatter.name !== "string" || typeof frontmatter.description !== "string") continue; + + const tools = typeof frontmatter.tools === "string" + ? frontmatter.tools.split(",").map((value) => value.trim()).filter(Boolean) + : undefined; + + presets.push({ + name: frontmatter.name, + description: frontmatter.description, + model: typeof frontmatter.model === "string" ? frontmatter.model : undefined, + tools: tools && tools.length > 0 ? tools : undefined, + systemPrompt: body.trim(), + source, + filePath, + }); + } + + return presets; +} + +export function discoverSubagentPresets(cwd: string, options: { homeDir?: string } = {}): SubagentPresetDiscoveryResult { + const globalDir = join(options.homeDir ?? getAgentDir(), "subagents"); + const projectPresetsDir = findNearestProjectPresetsDir(cwd); + const map = new Map(); + + for (const preset of loadPresetDir(globalDir, "global")) map.set(preset.name, preset); + if (projectPresetsDir) { + for (const preset of loadPresetDir(projectPresetsDir, "project")) map.set(preset.name, preset); + } + + return { presets: [...map.values()], projectPresetsDir }; +} +``` + +`src/models.ts` + +```ts +export function resolveChildModel(input: { + callModel?: string; + presetModel?: string; +}): ModelSelection { + const requestedModel = input.callModel ?? input.presetModel; + return { + requestedModel, + resolvedModel: requestedModel, + }; +} +``` + +- [ ] **Step 4: Run tests to verify they pass** + +Run: + +```bash +npx tsx --test src/presets.test.ts src/models.test.ts +``` + +Expected: +- PASS for all preset/model tests + +- [ ] **Step 5: Commit** + +```bash +git add src/presets.ts src/presets.test.ts src/models.ts src/models.test.ts +git commit -m "feat: add subagent preset discovery" +``` + +--- + +### Task 2: Rework `subagent` around presets and remove `chain` + +**Files:** +- Modify: `src/schema.ts` +- Modify: `src/tool.ts` +- Modify: `src/tool.test.ts` +- Modify: `src/tool-parallel.test.ts` +- Remove: `src/tool-chain.test.ts` +- Modify: `src/extension.test.ts` +- Test: `src/tool.test.ts` +- Test: `src/tool-parallel.test.ts` +- Test: `src/extension.test.ts` + +- [ ] **Step 1: Write failing schema/runtime tests for preset-only single and parallel modes** + +`src/tool.test.ts` + +```ts +test("single-mode subagent resolves preset prompt/tools/model and emits progress", async () => { + const updates: string[] = []; + + const tool = createSubagentTool({ + discoverSubagentPresets: () => ({ + presets: [{ + name: "repo-scout", + description: "Scout", + model: "anthropic/claude-sonnet-4-5", + tools: ["read", "grep"], + systemPrompt: "Explore quickly", + source: "global", + filePath: "/home/dev/.pi/agent/subagents/repo-scout.md", + }], + projectPresetsDir: null, + }), + runSingleTask: async ({ onEvent, meta }: any) => { + onEvent?.({ type: "tool_call", toolName: "read", args: { path: "src/auth.ts" } }); + return { + runId: "run-1", + preset: meta.preset, + task: meta.task, + requestedModel: meta.requestedModel, + resolvedModel: meta.resolvedModel, + exitCode: 0, + finalText: "done", + }; + }, + } as any); + + const result = await tool.execute( + "tool-1", + { preset: "repo-scout", task: "inspect auth" }, + undefined, + (partial: any) => { + const first = partial.content?.[0]; + if (first?.type === "text") updates.push(first.text); + }, + { + cwd: "/repo", + modelRegistry: { + getAvailable: () => [{ provider: "anthropic", id: "claude-sonnet-4-5" }], + }, + hasUI: false, + } as any, + ); + + assert.equal(result.isError, false); + assert.equal(result.details.mode, "single"); + assert.equal(result.details.results[0]?.requestedModel, "anthropic/claude-sonnet-4-5"); + assert.match(updates.join("\n"), /Running subagent/); +}); + +test("single-mode subagent errors when neither call nor preset supplies a model", async () => { + const tool = createSubagentTool({ + discoverSubagentPresets: () => ({ + presets: [{ + name: "repo-scout", + description: "Scout", + systemPrompt: "Explore quickly", + source: "global", + filePath: "/tmp/repo-scout.md", + }], + projectPresetsDir: null, + }), + } as any); + + const result = await tool.execute( + "tool-1", + { preset: "repo-scout", task: "inspect auth" }, + undefined, + undefined, + { + cwd: "/repo", + modelRegistry: { + getAvailable: () => [{ provider: "anthropic", id: "claude-sonnet-4-5" }], + }, + hasUI: false, + } as any, + ); + + assert.equal(result.isError, true); + assert.match(result.content[0]?.type === "text" ? result.content[0].text : "", /requires a model/i); +}); +``` + +`src/tool-parallel.test.ts` + +```ts +test("parallel mode requires each task to name a preset and lets each task resolve its own model", async () => { + const seenModels: string[] = []; + + const tool = createSubagentTool({ + discoverSubagentPresets: () => ({ + presets: [ + { name: "repo-scout", description: "Scout", model: "openai/gpt-5", systemPrompt: "Scout", source: "global", filePath: "/tmp/repo-scout.md" }, + { name: "repo-review", description: "Review", model: "anthropic/claude-opus-4-5", systemPrompt: "Review", source: "global", filePath: "/tmp/repo-review.md" }, + ], + projectPresetsDir: null, + }), + runSingleTask: async ({ meta }: any) => { + seenModels.push(meta.requestedModel); + return { runId: `run-${seenModels.length}`, task: meta.task, exitCode: 0, finalText: meta.task }; + }, + } as any); + + const result = await tool.execute( + "tool-1", + { + tasks: [ + { preset: "repo-scout", task: "inspect auth" }, + { preset: "repo-review", task: "review auth", model: "anthropic/claude-opus-4-5" }, + ], + }, + undefined, + undefined, + { + cwd: "/repo", + modelRegistry: { + getAvailable: () => [ + { provider: "openai", id: "gpt-5" }, + { provider: "anthropic", id: "claude-opus-4-5" }, + ], + }, + hasUI: false, + } as any, + ); + + assert.equal(result.isError, false); + assert.deepEqual(seenModels, ["openai/gpt-5", "anthropic/claude-opus-4-5"]); +}); +``` + +`src/extension.test.ts` + +```ts +assert.deepEqual(registeredTools[0]?.parameters.required, []); +assert.equal("preset" in registeredTools[0]?.parameters.properties, true); +assert.equal("chain" in registeredTools[0]?.parameters.properties, false); +assert.equal("preset" in registeredTools[0]?.parameters.properties.tasks.items.properties, true); +``` + +- [ ] **Step 2: Run tests to verify they fail** + +Run: + +```bash +npx tsx --test src/tool.test.ts src/tool-parallel.test.ts src/extension.test.ts +``` + +Expected: +- FAIL because schema still requires `model` +- FAIL because `preset` fields do not exist yet +- FAIL because runtime still expects `taskModel/topLevelModel` and supports `chain` + +- [ ] **Step 3: Write the minimal schema/runtime changes** + +`src/schema.ts` + +```ts +function createTaskItemSchema(availableModels: readonly string[]) { + return Type.Object({ + preset: Type.String({ description: "Name of the preset to invoke" }), + task: Type.String({ description: "Task to delegate to the child agent" }), + model: createTaskModelSchema(availableModels), + cwd: Type.Optional(Type.String({ description: "Optional working directory override" })), + }); +} + +export function createSubagentParamsSchema(availableModels: readonly string[]) { + return Type.Object({ + preset: Type.Optional(Type.String({ description: "Single-mode preset name" })), + task: Type.Optional(Type.String({ description: "Single-mode delegated task" })), + model: Type.Optional( + StringEnum(availableModels, { + description: "Optional child model override. Must be one of the currently available models.", + }), + ), + tasks: Type.Optional(Type.Array(createTaskItemSchema(availableModels), { description: "Parallel tasks" })), + cwd: Type.Optional(Type.String({ description: "Single-mode working directory override" })), + }); +} +``` + +`src/tool.ts` + +```ts +const hasSingle = Boolean(params.preset && params.task); +const hasParallel = Boolean(params.tasks?.length); +const modeCount = Number(hasSingle) + Number(hasParallel); +const mode = hasParallel ? "parallel" : "single"; + +const discovery = (deps.discoverSubagentPresets ?? discoverSubagentPresets)(ctx.cwd); +const resolvePreset = (name: string) => { + const preset = discovery.presets.find((candidate) => candidate.name === name); + if (!preset) throw new Error(`Unknown preset: ${name}`); + return preset; +}; + +const resolveEffectiveModel = (callModel: string | undefined, presetModel: string | undefined) => { + const requested = normalizeModelReference(callModel ?? presetModel); + if (!requested) { + throw new Error(`Preset run requires a model chosen from the available models: ${availableModelsText}`); + } + return requested; +}; + +const runTask = async (input: { presetName: string; task: string; cwd?: string; taskModel?: string; taskIndex?: number; mode: "single" | "parallel" }) => { + const preset = resolvePreset(input.presetName); + const requestedModel = resolveEffectiveModel(input.taskModel, preset.model); + return deps.runSingleTask?.({ + cwd: input.cwd ?? ctx.cwd, + onEvent(event) { + const text = progressFormatter.format(event); + if (!text) return; + onUpdate?.({ content: [{ type: "text", text }], details: makeDetails(input.mode, []) }); + }, + meta: { + mode: input.mode, + taskIndex: input.taskIndex, + preset: preset.name, + presetSource: preset.source, + task: input.task, + cwd: input.cwd ?? ctx.cwd, + requestedModel, + resolvedModel: requestedModel, + systemPrompt: preset.systemPrompt, + tools: preset.tools, + }, + }) as Promise; +}; +``` + +- [ ] **Step 4: Run tests to verify they pass** + +Run: + +```bash +npx tsx --test src/tool.test.ts src/tool-parallel.test.ts src/extension.test.ts +``` + +Expected: +- PASS for single/parallel preset flows +- PASS for extension schema assertions + +- [ ] **Step 5: Commit** + +```bash +git add src/schema.ts src/tool.ts src/tool.test.ts src/tool-parallel.test.ts src/extension.test.ts +git rm src/tool-chain.test.ts +git commit -m "feat: make subagent preset-driven" +``` + +--- + +### Task 3: Restore preset-owned prompt/tool artifacts in the wrapper path + +**Files:** +- Modify: `src/artifacts.ts` +- Modify: `src/artifacts.test.ts` +- Modify: `src/process-runner.ts` +- Modify: `src/process-runner.test.ts` +- Modify: `src/wrapper/cli.mjs` +- Modify: `src/wrapper/cli.test.ts` +- Modify: `src/wrapper/render.mjs` +- Modify: `src/wrapper/render.test.ts` +- Test: `src/artifacts.test.ts` +- Test: `src/process-runner.test.ts` +- Test: `src/wrapper/cli.test.ts` +- Test: `src/wrapper/render.test.ts` + +- [ ] **Step 1: Write failing tests for restored prompt/tool wiring** + +Add these assertions first. + +`src/artifacts.test.ts` + +```ts +const artifacts = await createRunArtifacts(cwd, { + runId: "run-1", + preset: "repo-scout", + systemPrompt: "Explore quickly", + tools: ["read", "grep"], +}); + +const meta = JSON.parse(await readFile(artifacts.metaPath, "utf8")); +assert.equal(meta.preset, "repo-scout"); +assert.deepEqual(meta.tools, ["read", "grep"]); +assert.equal(meta.systemPromptPath, join(artifacts.dir, "system-prompt.md")); +assert.equal(await readFile(artifacts.systemPromptPath, "utf8"), "Explore quickly"); +``` + +`src/wrapper/cli.test.ts` + +```ts +test("wrapper passes preset tool allowlist and system prompt file to pi when present", async () => { + const captured = await runWrapperWithFakePi("anthropic/claude-sonnet-4-5", undefined, { + tools: ["read", "grep"], + systemPrompt: "Explore quickly", + }); + + assert.equal(captured.flags.argv.includes("--tools"), true); + assert.equal(captured.flags.argv.includes("read,grep"), true); + assert.equal(captured.flags.argv.includes("--append-system-prompt"), true); +}); +``` + +`src/process-runner.test.ts` + +```ts +assert.equal(saved.preset, "repo-scout"); +assert.equal(saved.exitCode, 1); +``` + +- [ ] **Step 2: Run tests to verify they fail** + +Run: + +```bash +npx tsx --test src/artifacts.test.ts src/process-runner.test.ts src/wrapper/cli.test.ts src/wrapper/render.test.ts +``` + +Expected: +- FAIL because `systemPromptPath` and restored wrapper argv are missing + +- [ ] **Step 3: Write the minimal artifact/wrapper changes** + +`src/artifacts.ts` + +```ts +export interface RunArtifacts { + runId: string; + dir: string; + metaPath: string; + eventsPath: string; + resultPath: string; + stdoutPath: string; + stderrPath: string; + transcriptPath: string; + sessionPath: string; + systemPromptPath: string; +} + +export async function createRunArtifacts( + cwd: string, + meta: Record & { runId?: string; systemPrompt?: string }, +): Promise { + const artifacts: RunArtifacts = { + runId, + dir, + metaPath: join(dir, "meta.json"), + eventsPath: join(dir, "events.jsonl"), + resultPath: join(dir, "result.json"), + stdoutPath: join(dir, "stdout.log"), + stderrPath: join(dir, "stderr.log"), + transcriptPath: join(dir, "transcript.log"), + sessionPath: join(dir, "child-session.jsonl"), + systemPromptPath: join(dir, "system-prompt.md"), + }; + + await writeFile(artifacts.systemPromptPath, typeof meta.systemPrompt === "string" ? meta.systemPrompt : "", "utf8"); + // keep best-effort log file initialization unchanged +} +``` + +`src/wrapper/cli.mjs` + +```js +const args = ["--mode", "json", "--session", meta.sessionPath]; +if (effectiveModel) args.push("--model", effectiveModel); +if (Array.isArray(meta.tools) && meta.tools.length > 0) args.push("--tools", meta.tools.join(",")); +if (meta.systemPromptPath) args.push("--append-system-prompt", meta.systemPromptPath); +args.push(meta.task); +``` + +`src/wrapper/render.mjs` + +```js +export function renderHeader(meta) { + return [ + "=== subagent ===", + `Preset: ${meta.preset ?? "(none)"}`, + `Task: ${meta.task}`, + `CWD: ${meta.cwd}`, + `Requested model: ${meta.requestedModel ?? "(default)"}`, + `Resolved model: ${meta.resolvedModel ?? "(pending)"}`, + `Session: ${meta.sessionPath}`, + "---------------------", + ].join("\n"); +} +``` + +- [ ] **Step 4: Run tests to verify they pass** + +Run: + +```bash +npx tsx --test src/artifacts.test.ts src/process-runner.test.ts src/wrapper/cli.test.ts src/wrapper/render.test.ts +``` + +Expected: +- PASS for restored prompt/tool wiring and artifact metadata + +- [ ] **Step 5: Commit** + +```bash +git add src/artifacts.ts src/artifacts.test.ts src/process-runner.ts src/process-runner.test.ts src/wrapper/cli.mjs src/wrapper/cli.test.ts src/wrapper/render.mjs src/wrapper/render.test.ts +git commit -m "feat: restore preset wrapper metadata" +``` + +--- + +### Task 4: Add background registry and polling tool + +**Files:** +- Create: `src/background-schema.ts` +- Create: `src/background-registry.ts` +- Create: `src/background-registry.test.ts` +- Create: `src/background-status-tool.ts` +- Create: `src/background-status-tool.test.ts` +- Test: `src/background-registry.test.ts` +- Test: `src/background-status-tool.test.ts` + +- [ ] **Step 1: Write failing tests for counts, replay, and polling** + +`src/background-registry.test.ts` + +```ts +import test from "node:test"; +import assert from "node:assert/strict"; +import { createBackgroundRegistry } from "./background-registry.ts"; + +test("background registry tracks counts and applies terminal updates", () => { + const registry = createBackgroundRegistry(); + + registry.recordLaunch({ runId: "run-1", preset: "repo-scout", task: "inspect auth", status: "running" }); + registry.recordLaunch({ runId: "run-2", preset: "repo-review", task: "review auth", status: "running" }); + registry.recordUpdate({ runId: "run-2", status: "failed", finalText: "boom", exitCode: 1 }); + + assert.deepEqual(registry.getCounts(), { + running: 1, + completed: 0, + failed: 1, + aborted: 0, + total: 2, + }); +}); + +test("background registry rebuilds latest state from persisted entries", () => { + const registry = createBackgroundRegistry(); + registry.replay([ + { customType: "pi-subagents:bg-run", data: { runId: "run-1", preset: "repo-scout", task: "inspect auth", status: "running" } }, + { customType: "pi-subagents:bg-update", data: { runId: "run-1", status: "completed", exitCode: 0, finalText: "done" } }, + ] as any); + + const run = registry.getRun("run-1"); + assert.equal(run?.status, "completed"); + assert.equal(run?.finalText, "done"); +}); +``` + +`src/background-status-tool.test.ts` + +```ts +import test from "node:test"; +import assert from "node:assert/strict"; +import { createBackgroundAgentStatusTool } from "./background-status-tool.ts"; + +test("background_agent_status returns active runs by default and can target a single run", async () => { + const tool = createBackgroundAgentStatusTool({ + getSnapshot: () => ({ + counts: { running: 1, completed: 1, failed: 0, aborted: 0, total: 2 }, + runs: [ + { runId: "run-1", status: "running", preset: "repo-scout", task: "inspect auth" }, + { runId: "run-2", status: "completed", preset: "repo-review", task: "review auth", finalText: "done" }, + ], + }), + }); + + const active = await tool.execute("tool-1", {}, undefined, undefined, {} as any); + const single = await tool.execute("tool-2", { runId: "run-2" }, undefined, undefined, {} as any); + + assert.equal(active.isError, false); + assert.match(active.content[0]?.type === "text" ? active.content[0].text : "", /1 running \/ 2 total/); + assert.match(single.content[0]?.type === "text" ? single.content[0].text : "", /run-2/); + assert.match(single.content[0]?.type === "text" ? single.content[0].text : "", /done/); +}); +``` + +- [ ] **Step 2: Run tests to verify they fail** + +Run: + +```bash +npx tsx --test src/background-registry.test.ts src/background-status-tool.test.ts +``` + +Expected: +- FAIL because registry and status tool do not exist yet + +- [ ] **Step 3: Write the minimal registry and polling implementation** + +`src/background-schema.ts` + +```ts +import { StringEnum } from "@mariozechner/pi-ai"; +import { Type } from "@sinclair/typebox"; + +export function createBackgroundAgentParamsSchema(availableModels: readonly string[]) { + return Type.Object({ + preset: Type.String({ description: "Name of the preset to invoke" }), + task: Type.String({ description: "Task to delegate to the background agent" }), + model: Type.Optional( + StringEnum(availableModels, { + description: "Optional child model override. Must be one of the currently available models.", + }), + ), + cwd: Type.Optional(Type.String({ description: "Optional working directory override" })), + }); +} + +export const BackgroundAgentParamsSchema = createBackgroundAgentParamsSchema([]); + +export const BackgroundAgentStatusParamsSchema = Type.Object({ + runId: Type.Optional(Type.String({ description: "Inspect a specific background run" })), + includeCompleted: Type.Optional(Type.Boolean({ default: false })), +}); +``` + +`src/background-registry.ts` + +```ts +export interface BackgroundRunState { + runId: string; + preset: string; + task: string; + status: "running" | "completed" | "failed" | "aborted"; + requestedModel?: string; + resolvedModel?: string; + resultPath?: string; + eventsPath?: string; + stdoutPath?: string; + stderrPath?: string; + transcriptPath?: string; + sessionPath?: string; + startedAt?: string; + finishedAt?: string; + exitCode?: number; + stopReason?: string; + finalText?: string; + errorMessage?: string; +} + +export function createBackgroundRegistry() { + const runs = new Map(); + + const getCounts = () => { + const counts = { running: 0, completed: 0, failed: 0, aborted: 0, total: runs.size }; + for (const run of runs.values()) counts[run.status] += 1; + return counts; + }; + + return { + recordLaunch(run: BackgroundRunState) { runs.set(run.runId, run); }, + recordUpdate(update: Partial & { runId: string; status: BackgroundRunState["status"] }) { + const previous = runs.get(update.runId) ?? ({ runId: update.runId, preset: "(unknown)", task: "(unknown)" } as BackgroundRunState); + runs.set(update.runId, { ...previous, ...update }); + }, + replay(entries: Array<{ customType: string; data?: unknown }>) { + for (const entry of entries) { + if (entry.customType === "pi-subagents:bg-run") this.recordLaunch(entry.data as BackgroundRunState); + if (entry.customType === "pi-subagents:bg-update") this.recordUpdate(entry.data as any); + } + }, + getRun(runId: string) { return runs.get(runId); }, + getSnapshot(options: { includeCompleted?: boolean; runId?: string } = {}) { + const allRuns = [...runs.values()]; + const selected = options.runId + ? allRuns.filter((run) => run.runId === options.runId) + : options.includeCompleted + ? allRuns + : allRuns.filter((run) => run.status === "running"); + return { counts: getCounts(), runs: selected }; + }, + getCounts, + }; +} +``` + +`src/background-status-tool.ts` + +```ts +import { Text } from "@mariozechner/pi-tui"; +import { BackgroundAgentStatusParamsSchema } from "./background-schema.ts"; + +export function createBackgroundAgentStatusTool(deps: { + getSnapshot: (options?: { runId?: string; includeCompleted?: boolean }) => { + counts: { running: number; completed: number; failed: number; aborted: number; total: number }; + runs: Array>; + }; +}) { + return { + name: "background_agent_status", + label: "Background Agent Status", + description: "Poll background-agent counts and detached run status.", + parameters: BackgroundAgentStatusParamsSchema, + async execute(_toolCallId: string, params: any) { + const snapshot = deps.getSnapshot(params); + const lines = [ + `${snapshot.counts.running} running / ${snapshot.counts.total} total`, + ...snapshot.runs.map((run: any) => `${run.runId} ${run.status} ${run.preset}: ${run.task}${run.finalText ? ` => ${run.finalText}` : ""}`), + ]; + return { + content: [{ type: "text" as const, text: lines.join("\n") }], + details: snapshot, + isError: false, + }; + }, + renderCall() { + return new Text("background_agent_status", 0, 0); + }, + renderResult(result: { content: Array<{ type: string; text?: string }> }) { + const first = result.content[0]; + return new Text(first?.type === "text" ? first.text ?? "" : "", 0, 0); + }, + }; +} +``` + +- [ ] **Step 4: Run tests to verify they pass** + +Run: + +```bash +npx tsx --test src/background-registry.test.ts src/background-status-tool.test.ts +``` + +Expected: +- PASS for registry counting and polling behavior + +- [ ] **Step 5: Commit** + +```bash +git add src/background-schema.ts src/background-registry.ts src/background-registry.test.ts src/background-status-tool.ts src/background-status-tool.test.ts +git commit -m "feat: add background run registry" +``` + +--- + +### Task 5: Add detached launcher, extension integration, notifications, docs, and prompts + +**Files:** +- Create: `src/background-tool.ts` +- Create: `src/background-tool.test.ts` +- Modify: `src/process-runner.ts` +- Modify: `index.ts` +- Modify: `src/extension.test.ts` +- Modify: `README.md` +- Modify: `prompts/scout-and-plan.md` +- Modify: `prompts/implement.md` +- Modify: `prompts/implement-and-review.md` +- Modify: `src/prompts.test.ts` +- Modify: `AGENTS.md` +- Test: `src/background-tool.test.ts` +- Test: `src/extension.test.ts` +- Test: `src/prompts.test.ts` +- Test: `src/package-manifest.test.ts` + +- [ ] **Step 1: Write failing tests for detached launch and extension-side completion notifications** + +`src/background-tool.test.ts` + +```ts +import test from "node:test"; +import assert from "node:assert/strict"; +import { createBackgroundAgentTool } from "./background-tool.ts"; + +test("background_agent returns immediately with handle metadata and counts", async () => { + const appended: Array<{ type: string; data: any }> = []; + let watchCalled = false; + + const tool = createBackgroundAgentTool({ + discoverSubagentPresets: () => ({ + presets: [{ + name: "repo-scout", + description: "Scout", + model: "openai/gpt-5", + systemPrompt: "Scout", + source: "global", + filePath: "/tmp/repo-scout.md", + }], + projectPresetsDir: null, + }), + launchDetachedTask: async () => ({ + runId: "run-1", + task: "inspect auth", + requestedModel: "openai/gpt-5", + resolvedModel: "openai/gpt-5", + resultPath: "/repo/.pi/subagents/runs/run-1/result.json", + eventsPath: "/repo/.pi/subagents/runs/run-1/events.jsonl", + stdoutPath: "/repo/.pi/subagents/runs/run-1/stdout.log", + stderrPath: "/repo/.pi/subagents/runs/run-1/stderr.log", + transcriptPath: "/repo/.pi/subagents/runs/run-1/transcript.log", + sessionPath: "/repo/.pi/subagents/runs/run-1/child-session.jsonl", + }), + registerBackgroundRun(run: any) { + appended.push({ type: "run", data: run }); + return { running: 1, completed: 0, failed: 0, aborted: 0, total: 1 }; + }, + watchBackgroundRun() { + watchCalled = true; + }, + } as any); + + const result = await tool.execute( + "tool-1", + { preset: "repo-scout", task: "inspect auth" }, + undefined, + undefined, + { + cwd: "/repo", + modelRegistry: { getAvailable: () => [{ provider: "openai", id: "gpt-5" }] }, + hasUI: false, + } as any, + ); + + assert.equal(result.isError, false); + assert.equal(result.details.runId, "run-1"); + assert.deepEqual(result.details.counts, { running: 1, completed: 0, failed: 0, aborted: 0, total: 1 }); + assert.equal(watchCalled, true); + assert.equal(appended.length, 1); +}); +``` + +`src/extension.test.ts` + +```ts +test("background completion updates footer status, notifies UI, and injects a visible session message without auto-turn", async () => { + const sentMessages: any[] = []; + const appendedEntries: any[] = []; + const notifications: any[] = []; + const statuses: Array<{ key: string; text: string | undefined }> = []; + + // after extension boot + mocked watcher completion + assert.deepEqual(notifications, [{ message: "Background agent repo-scout finished: done", type: "info" }]); + assert.deepEqual(statuses.at(-1), { key: "pi-subagents", text: "bg: 0 running / 1 total" }); + assert.deepEqual(appendedEntries.at(-1), { + type: "pi-subagents:bg-update", + data: { runId: "run-1", status: "completed", finalText: "done", exitCode: 0 }, + }); + assert.deepEqual(sentMessages.at(-1), { + message: { + customType: "pi-subagents:bg-complete", + content: "Background agent repo-scout completed (run-1): done", + display: true, + details: { runId: "run-1", status: "completed" }, + }, + options: { triggerTurn: false }, + }); +}); +``` + +`src/prompts.test.ts` + +```ts +for (const name of ["implement.md", "scout-and-plan.md", "implement-and-review.md"]) { + const content = readFileSync(join(packageRoot, "prompts", name), "utf8"); + assert.doesNotMatch(content, /chain mode/i); + assert.match(content, /subagent/); +} +``` + +- [ ] **Step 2: Run tests to verify they fail** + +Run: + +```bash +npx tsx --test src/background-tool.test.ts src/extension.test.ts src/prompts.test.ts src/package-manifest.test.ts +``` + +Expected: +- FAIL because `background_agent` does not exist yet +- FAIL because extension does not maintain registry or notifications +- FAIL because shipped prompts still mention `chain` + +- [ ] **Step 3: Write the minimal detached-launch, extension, and docs changes** + +`src/process-runner.ts` + +```ts +export function createDetachedProcessLauncher(deps: { + createArtifacts: (cwd: string, meta: Record) => Promise; + buildWrapperSpawn: (metaPath: string) => { command: string; args: string[]; env?: NodeJS.ProcessEnv }; + spawnChild?: typeof spawn; +}) { + const spawnChild = deps.spawnChild ?? spawn; + + return async function launchDetachedTask(input: { cwd: string; meta: Record }) { + const artifacts = await deps.createArtifacts(input.cwd, input.meta); + const spawnSpec = deps.buildWrapperSpawn(artifacts.metaPath); + const child = spawnChild(spawnSpec.command, spawnSpec.args, { + cwd: input.cwd, + env: { ...process.env, ...(spawnSpec.env ?? {}) }, + stdio: ["ignore", "ignore", "ignore"], + }); + + child.once("error", async (error) => { + const result = makeLaunchFailureResult(artifacts, input.meta, input.cwd, error); + await writeFile(artifacts.resultPath, JSON.stringify(result, null, 2), "utf8"); + }); + + return { + runId: artifacts.runId, + sessionPath: artifacts.sessionPath, + stdoutPath: artifacts.stdoutPath, + stderrPath: artifacts.stderrPath, + transcriptPath: artifacts.transcriptPath, + resultPath: artifacts.resultPath, + eventsPath: artifacts.eventsPath, + ...input.meta, + }; + }; +} +``` + +`src/background-tool.ts` + +```ts +import { Text } from "@mariozechner/pi-tui"; +import { discoverSubagentPresets } from "./presets.ts"; +import { normalizeAvailableModelReference, listAvailableModelReferences, resolveChildModel } from "./models.ts"; +import { BackgroundAgentParamsSchema } from "./background-schema.ts"; + +export function createBackgroundAgentTool(deps: any) { + return { + name: "background_agent", + label: "Background Agent", + description: "Spawn a detached background child session and return its run handle immediately.", + parameters: deps.parameters ?? BackgroundAgentParamsSchema, + async execute(_toolCallId: string, params: any, _signal: AbortSignal | undefined, _onUpdate: any, ctx: any) { + const discovery = (deps.discoverSubagentPresets ?? discoverSubagentPresets)(ctx.cwd); + const preset = discovery.presets.find((candidate: any) => candidate.name === params.preset); + if (!preset) { + return { content: [{ type: "text" as const, text: `Unknown preset: ${params.preset}` }], isError: true }; + } + + const available = (deps.listAvailableModelReferences ?? listAvailableModelReferences)(ctx.modelRegistry); + const requestedModel = (deps.normalizeAvailableModelReference ?? normalizeAvailableModelReference)(params.model ?? preset.model, available); + if (!requestedModel) { + return { content: [{ type: "text" as const, text: `Background agent requires a model chosen from the available models: ${available.join(", ")}` }], isError: true }; + } + + const launched = await deps.launchDetachedTask({ + cwd: params.cwd ?? ctx.cwd, + meta: { + mode: "background", + preset: preset.name, + presetSource: preset.source, + task: params.task, + cwd: params.cwd ?? ctx.cwd, + requestedModel, + resolvedModel: requestedModel, + systemPrompt: preset.systemPrompt, + tools: preset.tools, + }, + }); + + const counts = deps.registerBackgroundRun({ ...launched, status: "running", preset: preset.name, task: params.task }); + deps.watchBackgroundRun(launched.runId); + + return { + content: [{ type: "text" as const, text: `Background agent started: ${launched.runId}` }], + details: { ...launched, counts }, + isError: false, + }; + }, + renderCall() { return new Text("background_agent", 0, 0); }, + renderResult(result: { content: Array<{ type: string; text?: string }> }) { + const first = result.content[0]; + return new Text(first?.type === "text" ? first.text ?? "" : "", 0, 0); + }, + }; +} +``` + +`index.ts` + +```ts +const registry = createBackgroundRegistry(); +let latestUi: { notify(message: string, type?: "info" | "warning" | "error"): void; setStatus(key: string, text: string | undefined): void } | undefined; + +function renderCounts() { + const counts = registry.getCounts(); + return counts.total === 0 ? undefined : `bg: ${counts.running} running / ${counts.total} total`; +} + +function updateStatus() { + latestUi?.setStatus("pi-subagents", renderCounts()); +} + +function watchBackgroundRun(runId: string) { + const run = registry.getRun(runId); + if (!run?.resultPath || !run.eventsPath) return; + + void monitorRun({ + eventsPath: run.eventsPath, + resultPath: run.resultPath, + }).then((result) => { + const status = result.stopReason === "aborted" ? "aborted" : result.exitCode === 0 ? "completed" : "failed"; + registry.recordUpdate({ runId, status, ...result }); + pi.appendEntry("pi-subagents:bg-update", { runId, status, finalText: result.finalText, exitCode: result.exitCode }); + updateStatus(); + latestUi?.notify(`Background agent ${run.preset} finished: ${result.finalText || status}`, status === "completed" ? "info" : "error"); + pi.sendMessage({ + customType: "pi-subagents:bg-complete", + content: `Background agent ${run.preset} completed (${runId}): ${result.finalText || status}`, + display: true, + details: { runId, status }, + }, { triggerTurn: false }); + }); +} + +pi.on("session_start", (_event, ctx) => { + latestUi = ctx.ui; + registry.replay(ctx.sessionManager.getEntries().filter((entry: any) => entry.type === "custom") as any); + updateStatus(); + for (const run of registry.getSnapshot({ includeCompleted: true }).runs) { + if (run.status === "running") watchBackgroundRun(run.runId as string); + } + syncTools(ctx); +}); + +pi.on("before_agent_start", (_event, ctx) => { + latestUi = ctx.ui; + syncTools(ctx); +}); + +pi.registerTool(createBackgroundAgentTool({ launchDetachedTask, registerBackgroundRun, watchBackgroundRun, parameters: createBackgroundAgentParamsSchema(availableModels) })); +pi.registerTool(createBackgroundAgentStatusTool({ getSnapshot: (options) => registry.getSnapshot(options) })); +``` + +`README.md` + +```md +## Presets + +Named presets live in: +- `~/.pi/agent/subagents/*.md` +- nearest project `.pi/subagents/*.md` + +Project presets override global presets by name. + +Preset frontmatter may define: +- `name` +- `description` +- `model` +- `tools` + +Preset body is appended to the child session system prompt. +`tools` maps to Pi CLI `--tools`, so it limits built-in tools only. + +## Tools + +- `subagent` — foreground single or parallel runs using named presets +- `background_agent` — detached process-backed run that returns immediately +- `background_agent_status` — poll background counts and run status +``` + +`prompts/scout-and-plan.md` + +```md +--- +description: Inspect the codebase, then produce a plan using preset-driven subagents +--- +Run `subagent` once to inspect the codebase with an appropriate preset. +Then run `subagent` again with a planning preset, using the first result as context. + +User request: $@ +``` + +- [ ] **Step 4: Run tests to verify they pass** + +Run: + +```bash +npx tsx --test src/background-tool.test.ts src/extension.test.ts src/prompts.test.ts src/package-manifest.test.ts +``` + +Expected: +- PASS for detached launch, notification behavior, and prompt/docs assertions + +- [ ] **Step 5: Run the full suite and commit** + +Run: + +```bash +npm test +``` + +Expected: +- PASS with 0 failures + +Then commit: + +```bash +git add index.ts src/background-tool.ts src/background-tool.test.ts src/process-runner.ts README.md prompts/*.md src/prompts.test.ts AGENTS.md +git commit -m "feat: add background preset subagents" +``` + +--- + +## Plan self-review + +- Spec coverage checked: + - preset discovery, override rules, and model defaults: Task 1 + - `subagent` preset API and chain removal: Task 2 + - restored wrapper prompt/tool support: Task 3 + - background counts, polling, persistence, reload: Task 4 + - detached launch, notifications, docs, and prompt rewrites: Task 5 +- Placeholder scan checked: + - no `TODO`, `TBD`, or “handle edge cases” placeholders remain +- Type consistency checked: + - tool names, entry types, and schema field names match across tasks diff --git a/docs/superpowers/specs/2026-04-12-subagent-presets-and-background-agents-design.md b/docs/superpowers/specs/2026-04-12-subagent-presets-and-background-agents-design.md new file mode 100644 index 0000000..4a71729 --- /dev/null +++ b/docs/superpowers/specs/2026-04-12-subagent-presets-and-background-agents-design.md @@ -0,0 +1,440 @@ +# Subagent presets and background agents design + +Date: 2026-04-12 +Status: approved for planning + +## Summary + +Evolve `pi-subagents` from generic foreground-only child runs into a preset-driven delegation package with three tools: +- `subagent` for single or parallel foreground runs +- `background_agent` for detached process-backed runs +- `background_agent_status` for polling detached-run status and counts + +Named customization comes back as markdown presets, but **without** bundled built-in roles. Presets live in: +- global: `~/.pi/agent/subagents/*.md` +- project: nearest `.pi/subagents/*.md` found by walking up from the current `cwd` + +Project presets override global presets by name. Calls must name a preset. Per-call overrides are limited to `model`; prompt and tool access come only from the preset. + +## Current state + +Today the package has: +- one `subagent` tool with `task | tasks | chain` +- no background-run registry or polling tool +- no preset discovery +- no wrapper support for preset-owned prompt/tool restrictions +- prompt templates that still assume `chain` + +The old role system was removed entirely, including markdown discovery, `--tools`, and `--append-system-prompt` wiring. This change brings back the useful customization pieces without reintroducing bundled roles or old role-specific behavior. + +## Goals + +- Remove `chain` mode from `subagent`. +- Keep foreground single-run and parallel-run delegation. +- Add named preset discovery from global and project markdown directories. +- Let presets define appended prompt text, built-in tool allowlist, and optional default model. +- Add `background_agent` as a process-only detached launcher. +- Add `background_agent_status` so the parent agent can poll one run or many runs. +- Track background-run counts and surface completion through UI notification plus a visible session message. +- Preserve existing runner behavior for foreground runs and keep runner-specific changes minimal. + +## Non-goals + +- No bundled built-in presets (`scout`, `planner`, `reviewer`, `worker`, etc.). +- No markdown discovery from `.pi/agents`. +- No inline prompt or tool overrides on tool calls. +- No background tmux mode. `background_agent` always uses the process runner. +- No automatic follow-up turn when a background run finishes. +- No attempt to restrict third-party extension tools beyond what Pi CLI already supports. + +## Chosen approach + +Add a small preset layer and a small background-run state layer on top of the current runner/wrapper core. + +- foreground `subagent` becomes preset-aware and loses `chain` +- wrapper/artifacts regain preset-owned prompt/tool support +- background runs reuse process-launch mechanics but skip `monitorRun()` in the calling tool +- extension-owned registry watches detached runs, persists state to session custom entries, updates footer status text, emits UI notifications, and injects visible completion messages into session history + +This keeps most existing code paths intact while restoring customization and adding detached orchestration. + +## Public API design + +### `subagent` + +Supported modes: + +#### Single mode + +Required: +- `preset` +- `task` + +Optional: +- `model` +- `cwd` + +#### Parallel mode + +Required: +- `tasks: Array<{ preset: string; task: string; model?: string; cwd?: string }>` + +Notes: +- each parallel item names its own preset +- there is no top-level default preset +- there is no top-level required model + +Removed: +- `chain` + +Model resolution order per run: +1. call-level `model` +2. preset `model` +3. error: no model resolved + +### `background_agent` + +Single-run only. + +Required: +- `preset` +- `task` + +Optional: +- `model` +- `cwd` + +Behavior: +- always launches with the process runner, ignoring tmux config +- returns immediately after spawn request +- returns run handle metadata plus counts snapshot +- does not wait for completion + +### `background_agent_status` + +Purpose: +- let the main agent poll background runs and inspect counts + +Parameters: +- `runId?` — inspect one run +- `includeCompleted?` — default `false`; when omitted, only active runs are listed unless `runId` is provided + +Returns: +- counts: `running`, `completed`, `failed`, `aborted`, `total` +- per-run rows with preset, task, cwd, model info, timestamps, artifact paths, and status +- final result fields when a run is terminal + +## Preset design + +### Discovery + +Load presets from two sources: +- global: `join(getAgentDir(), "subagents")` +- project: nearest ancestor directory containing `.pi/subagents` + +Merge order: +1. global presets +2. project presets override global presets with the same `name` + +No confirmation gate for project presets. + +### File format + +Each preset is one markdown file with frontmatter and body. + +Required frontmatter: +- `name` +- `description` + +Optional frontmatter: +- `model` +- `tools` + +Body: +- appended system prompt text + +Example: + +```md +--- +name: repo-scout +description: Fast repo exploration +model: github-copilot/gpt-4o +tools: read,grep,find,ls +--- +You are a scout. Explore quickly, summarize clearly, and avoid implementation. +``` + +### Preset semantics + +- `model` is the default model when the call does not provide one. +- `tools` is optional. +- omitted `tools` means normal child-tool behavior (no built-in tool restriction) +- when `tools` is present, pass it through Pi CLI `--tools`, which limits built-in tools only +- prompt text comes only from the markdown body; no inline prompt override + +## Runtime design + +### Foreground subagent execution + +`src/tool.ts` becomes preset-aware. + +For each run: +1. discover presets +2. resolve the named preset +3. normalize explicit `model` override against available models if present +4. normalize preset default model against available models if used +5. compute effective model from call override or preset default +6. pass runner metadata including preset, prompt text, built-in tool allowlist, and model selection + +Parallel behavior stays the same apart from: +- no `chain` +- each task resolving its own preset/model +- summary lines identifying tasks by index and/or preset, not old role names + +### Background execution + +`background_agent` launches the wrapper via process-runner primitives but returns immediately. + +Flow: +1. resolve preset + effective model +2. create run artifacts +3. spawn wrapper process +4. register run in extension background registry as `running` +5. append persistent session entry for the new run +6. start detached watcher on that run’s `result.json` / `events.jsonl` +7. return handle metadata and counts snapshot + +Background runs are process-only even when normal `subagent` foreground runs are configured for tmux. + +### Background registry + +The extension owns a session-scoped registry keyed by `runId`. + +Stored metadata per run: +- `runId` +- `preset` +- `task` +- `cwd` +- `requestedModel` +- `resolvedModel` +- artifact paths +- timestamps +- terminal result fields when available +- status: `running | completed | failed | aborted` + +The registry also computes counts: +- `running` +- `completed` +- `failed` +- `aborted` +- `total` + +### Persistence and reload behavior + +Persist background state with session custom entries. + +Custom entry types: +- `pi-subagents:bg-run` — initial launch metadata +- `pi-subagents:bg-update` — later status/result transitions + +On `session_start`, rebuild the in-memory registry by scanning `ctx.sessionManager.getEntries()`. + +For rebuilt runs that are still non-terminal: +- if `result.json` already exists, ingest it immediately +- otherwise reattach a watcher so completion still updates counts, notifications, and session messages after reload/resume + +## Notification and polling design + +### Completion notification + +When a detached run becomes terminal: +1. update registry and counts +2. append `pi-subagents:bg-update` +3. update footer status text, e.g. `bg: 2 running / 5 total` +4. emit UI notification if UI is available +5. inject a visible custom session message describing completion + +The completion message must **not** trigger a new agent turn automatically. + +### Polling + +The parent agent polls with `background_agent_status`. + +Typical use: +- ask for current running count +- list active background runs +- inspect one `runId` +- fetch terminal result summary after notification arrives + +## Wrapper and artifact design + +### Artifact layout + +Keep run directories under: +- `.pi/subagents/runs//` + +Keep existing files and restore the removed prompt artifact when needed: +- `meta.json` +- `events.jsonl` +- `result.json` +- `stdout.log` +- `stderr.log` +- `transcript.log` +- `child-session.jsonl` +- `system-prompt.md` + +### Metadata + +Keep existing generic bookkeeping and add preset-specific fields: +- `preset` +- `presetSource` +- `tools` +- `systemPrompt` +- `systemPromptPath` + +Do not reintroduce old bundled-role concepts or role-only behavior. + +### Child wrapper + +`src/wrapper/cli.mjs` should again support: +- `--append-system-prompt ` when preset prompt text exists +- `--tools ` when preset `tools` exists + +Keep: +- `PI_SUBAGENTS_CHILD=1` +- github-copilot initiator behavior based on effective model +- best-effort artifact appends that must never block writing `result.json` +- semantic-completion exit handling + +## Extension registration behavior + +Keep existing model-registration behavior for model-dependent tools: +- preserve current available-model order for schema enums +- do not mutate available-model arrays when deduping cache keys +- re-register when model set changes +- do not re-register when model set is the same in different order +- if the first observed set is empty, a later non-empty set must still register +- skip tool registration entirely when `PI_SUBAGENTS_CHILD=1` + +Register: +- `subagent` +- `background_agent` +- `background_agent_status` + +`background_agent_status` does not need model parameters, but registration still follows the package’s existing model-availability gate for consistency. + +## Prompt and documentation design + +Rewrite shipped prompts so they no longer mention `chain` mode. + +They should instead describe repeated `subagent` calls or `subagent.tasks` parallel calls, for example: +- inspect with a preset +- turn findings into a plan with another preset +- implement or review in separate follow-up calls + +README and docs should describe: +- preset directories and markdown format +- `background_agent` +- `background_agent_status` +- background completion notification behavior +- background count tracking +- process-only behavior for detached runs +- built-in-tool-only semantics of preset `tools` + +They should continue to avoid claiming bundled built-in roles. + +## File-level impact + +Expected new files: +- `src/presets.ts` +- `src/presets.test.ts` +- `src/background-registry.ts` +- `src/background-registry.test.ts` +- `src/background-schema.ts` +- `src/background-tool.ts` +- `src/background-tool.test.ts` +- `src/background-status-tool.ts` +- `src/background-status-tool.test.ts` + +Expected modified files: +- `index.ts` +- `src/schema.ts` +- `src/tool.ts` +- `src/models.ts` +- `src/runner.ts` and/or `src/process-runner.ts` +- `src/artifacts.ts` +- `src/process-runner.test.ts` +- `src/extension.test.ts` +- `src/artifacts.test.ts` +- `src/wrapper/cli.mjs` +- `src/wrapper/cli.test.ts` +- `src/wrapper/render.mjs` +- `src/wrapper/render.test.ts` +- `src/prompts.test.ts` +- `README.md` +- `prompts/*.md` +- `AGENTS.md` + +Expected removals: +- `src/tool-chain.test.ts` + +## Testing plan + +Implementation should follow TDD. + +### New coverage + +Add tests for: +- preset discovery from global and project directories +- project preset override by name +- required preset selection in single and parallel mode +- model resolution from call override vs preset default +- error when neither call nor preset supplies a valid model +- chain removal from schema and runtime +- detached background launch returning immediately +- background registry counts +- session-entry persistence and reload reconstruction +- completion notification emitting UI notify + visible session message without auto-turn +- polling one background run and many background runs + +### Preserved coverage + +Keep regression coverage for: +- child sessions skipping subagent tool registration when `PI_SUBAGENTS_CHILD=1` +- no tool registration when no models are available +- later registration when a non-empty model list appears +- no re-registration for the same model set in different order +- re-registration when the model set changes +- github-copilot initiator behavior +- best-effort artifact logging never preventing `result.json` writes +- effective model using the resolved model when requested/resolved differ + +## Risks and mitigations + +### Risk: preset `tools` sounds broader than Pi can enforce + +Mitigation: +- document that preset `tools` maps to Pi CLI `--tools` +- treat it as a built-in tool allowlist only +- keep `PI_SUBAGENTS_CHILD=1` so this package never re-registers its own subagent tools inside child runs + +### Risk: detached watcher state lost on reload + +Mitigation: +- persist launch/update events as session custom entries +- rebuild registry on `session_start` +- reattach watchers for non-terminal runs + +### Risk: background notifications spam the user + +Mitigation: +- emit one terminal notification per run +- keep footer count compact +- require explicit polling for detailed inspection + +### Risk: larger extension entrypoint + +Mitigation: +- keep preset discovery, registry/state, and tool definitions in separate focused modules +- keep runner-specific logic in existing runner files with minimal changes