dotfiles/docs/superpowers/specs/2026-04-09-question-tool-design.md
2026-04-09 09:45:00 +01:00

# Question Tool Design
**Date:** 2026-04-09
**Project:** `/home/alex/dotfiles`
**Target file:** `.pi/agent/extensions/question.ts`
## Goal
Add a tracked pi extension that gives the agent a single `question` tool for asking either one question or multiple questions in interactive mode, while always preserving a final user escape hatch: **"Something else…"** opens inline free-text entry when none of the listed options fit.
## Context
- Pi supports custom tools through TypeScript extensions placed in auto-discovered extension directories.
- This dotfiles repo already tracks pi configuration under `.pi/agent/`.
- The working extension directory `.pi/agent/extensions/` is currently empty.
- Pi's upstream examples already include:
  - a single-question `question.ts` example
  - a multi-question `questionnaire.ts` example
- The requested tool should combine those use cases into one obvious agent-facing tool.
## User-Approved Requirements
1. The tool must be tracked in this repo at:
   - `/home/alex/dotfiles/.pi/agent/extensions/question.ts`
2. The tool name should be:
   - `question`
3. The tool must support both:
   - a single question
   - multiple questions in one interaction
4. Every question is multiple-choice, but the UI must always append a final choice:
   - **"Something else…"**
5. Choosing **"Something else…"** must allow direct user text entry.
6. Question options should support machine-friendly values and user-facing labels:
   - `{ value, label, description? }`
7. This should be a unified tool, not separate `question` and `questionnaire` tools.
## Recommended Approach
Implement a single extension modeled after pi's upstream `question.ts` and `questionnaire.ts` examples:
- one registered tool: `question`
- one parameter shape: `questions: Question[]`
- one UI that adapts to question count:
  - single-question picker for `questions.length === 1`
  - multi-question review flow for `questions.length > 1`
This keeps the agent-facing API simple while still supporting richer user clarification flows.
## Tool Contract
The extension will register a tool with this conceptual input shape:
```ts
{
  questions: Array<{
    id: string;
    label?: string;
    prompt: string;
    options: Array<{
      value: string;
      label: string;
      description?: string;
    }>;
  }>;
}
```
### Field intent
- `id`: stable identifier for the answer
- `label`: short summary label for tabs/review UI; defaults to `Q1`, `Q2`, etc.
- `prompt`: the full question shown to the user
- `options`: predefined choices the model wants the user to pick from
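
To make the field intent concrete, here is a sketch of a two-question payload that fits the shape above. The interface names (`QuestionOption`, `Question`, `QuestionToolInput`) and every id, prompt, and option in `exampleInput` are illustrative assumptions, not part of the spec.

```typescript
// Types mirroring the conceptual input shape from this spec.
interface QuestionOption {
  value: string;
  label: string;
  description?: string;
}

interface Question {
  id: string;
  label?: string;
  prompt: string;
  options: QuestionOption[];
}

interface QuestionToolInput {
  questions: Question[];
}

// Hypothetical payload; all ids, prompts, and options are made up for illustration.
const exampleInput: QuestionToolInput = {
  questions: [
    {
      id: "package-manager",
      label: "Pkg mgr",
      prompt: "Which package manager should the setup script use?",
      options: [
        { value: "npm", label: "npm" },
        { value: "pnpm", label: "pnpm", description: "Content-addressed store" },
      ],
    },
    {
      id: "shell",
      // `label` is omitted here, so the UI would fall back to a default such as `Q2`.
      prompt: "Which shell should be configured first?",
      options: [
        { value: "zsh", label: "Zsh" },
        { value: "fish", label: "Fish" },
      ],
    },
  ],
};
```

Note that the payload does not list `Something else…`; the extension is expected to append it.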
### Normalization rules
Before rendering the UI:
1. Ensure at least one question exists.
2. Ensure each question has a usable short label.
3. Preserve the provided predefined options as-is.
4. Append a final synthetic option to every question:
   - label: `Something else…`
   - behavior: switch into inline text entry
5. Do not require the model to explicitly include the synthetic option.
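
The normalization rules above could be sketched as a pure helper. This is a minimal sketch, not the final implementation: `normalizeQuestions`, `Choice`, `NormalizedQuestion`, and `NormalizeResult` are hypothetical names, and treating a whitespace-only label as unusable is an assumption.

```typescript
interface QuestionOption {
  value: string;
  label: string;
  description?: string;
}

interface Question {
  id: string;
  label?: string;
  prompt: string;
  options: QuestionOption[];
}

// Hypothetical internal types; the spec does not prescribe these names.
type Choice =
  | { kind: "predefined"; option: QuestionOption }
  | { kind: "custom"; label: string };

interface NormalizedQuestion {
  id: string;
  label: string;
  prompt: string;
  choices: Choice[];
}

type NormalizeResult =
  | { ok: true; questions: NormalizedQuestion[] }
  | { ok: false; error: string };

const SOMETHING_ELSE = "Something else…";

function normalizeQuestions(questions: Question[]): NormalizeResult {
  // Rule 1: at least one question must exist.
  if (questions.length === 0) {
    return { ok: false, error: "No questions provided" };
  }
  const normalized = questions.map((q, i): NormalizedQuestion => {
    // Rule 3: predefined options are passed through unchanged.
    const predefined = q.options.map((option): Choice => ({ kind: "predefined", option }));
    // Rules 4 and 5: the synthetic choice is always appended last by the tool itself.
    const custom: Choice = { kind: "custom", label: SOMETHING_ELSE };
    return {
      id: q.id,
      // Rule 2: fall back to Q1, Q2, ... when no usable label is given (assumption: blank counts as missing).
      label: q.label?.trim() || `Q${i + 1}`,
      prompt: q.prompt,
      choices: [...predefined, custom],
    };
  });
  return { ok: true, questions: normalized };
}
```

Returning a result object rather than throwing keeps the helper in line with the spec's preference for structured results on expected failure states.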
## Interaction Design
### Single-question mode
When exactly one question is provided:
- display the prompt
- display numbered predefined options
- automatically display the final appended option:
  - `Something else…`
- selecting a predefined option completes the tool immediately
- selecting `Something else…` opens inline free-text entry
- `Esc` in the picker cancels the tool
- `Esc` in text entry exits text entry and returns to the option list
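
The single-question behavior above can be read as a small state machine. The following sketch models only the transitions, independent of any pi UI API; `PickerState`, `PickerEvent`, `Outcome`, and `step` are hypothetical names, and treating whitespace-only text as empty is an assumption.

```typescript
type Outcome =
  | { kind: "cancelled" }
  | { kind: "predefined"; index: number } // 1-based index of the chosen option
  | { kind: "custom"; text: string };

type PickerState =
  | { mode: "picking" }
  | { mode: "typing" }
  | { mode: "done"; outcome: Outcome };

type PickerEvent =
  | { type: "select"; index: number } // 0-based index into the displayed list
  | { type: "submitText"; text: string }
  | { type: "escape" };

// `predefinedCount` is the number of model-provided options;
// the synthetic "Something else…" entry sits at index `predefinedCount`.
function step(state: PickerState, event: PickerEvent, predefinedCount: number): PickerState {
  if (state.mode === "done") return state;

  if (state.mode === "picking") {
    switch (event.type) {
      case "escape":
        // Esc in the picker cancels the tool.
        return { mode: "done", outcome: { kind: "cancelled" } };
      case "select":
        if (event.index < 0 || event.index > predefinedCount) return state;
        if (event.index === predefinedCount) return { mode: "typing" };
        return { mode: "done", outcome: { kind: "predefined", index: event.index + 1 } };
      default:
        return state;
    }
  }

  // Text-entry mode.
  switch (event.type) {
    case "escape":
      // Esc in text entry returns to the option list instead of cancelling.
      return { mode: "picking" };
    case "submitText": {
      const text = event.text.trim();
      // Empty submissions are ignored so the user stays in text entry.
      return text === "" ? state : { mode: "done", outcome: { kind: "custom", text } };
    }
    default:
      return state;
  }
}
```

Keeping the transitions in a pure function would let the manual test cases later in this spec be checked without a terminal, should an automated harness be added.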
### Multi-question mode
When multiple questions are provided:
- show one question at a time
- allow tab or left/right navigation between questions
- append `Something else…` to every question
- after answering one question, move to the next question
- include a final review/submit step summarizing all current answers
- allow navigating back to change answers before final submission
- submit only from the review step
This provides a guided flow without requiring separate tools.
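
A minimal sketch of the navigation state behind this flow is shown below. It makes several assumptions the spec leaves open: the review step is modeled as one extra tab after the last question, navigation clamps at both ends rather than wrapping, and answering the last question advances to review. All names (`FlowState`, `createFlow`, `move`, `answerCurrent`, `canSubmit`) are hypothetical.

```typescript
interface FlowState {
  questionCount: number;
  // Tabs 0..questionCount-1 are questions; tab === questionCount is the review step.
  tab: number;
  // One slot per question; undefined means not answered yet.
  answers: (string | undefined)[];
}

function createFlow(questionCount: number): FlowState {
  const answers = new Array<string | undefined>(questionCount).fill(undefined);
  return { questionCount, tab: 0, answers };
}

// Tab or left/right navigation; clamped to the valid range (assumption: no wrap-around).
function move(state: FlowState, delta: -1 | 1): FlowState {
  const tab = Math.min(Math.max(state.tab + delta, 0), state.questionCount);
  return { ...state, tab };
}

// Records an answer for the current question, then advances to the next tab.
function answerCurrent(state: FlowState, value: string): FlowState {
  if (state.tab >= state.questionCount) return state; // nothing to answer on review
  const answers = [...state.answers];
  answers[state.tab] = value;
  return { questionCount: state.questionCount, tab: state.tab + 1, answers };
}

// Submission is only possible from the review step.
function canSubmit(state: FlowState): boolean {
  return state.tab === state.questionCount;
}
```

Whether submission should also require every question to be answered is not stated in this spec, so the sketch does not enforce it.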
## Answer Model
The tool result should always remain structured.
Conceptual result shape:
```ts
{
  questions: Question[];
  answers: Array<{
    id: string;
    value: string;
    label: string;
    wasCustom: boolean;
    index?: number;
  }>;
  cancelled: boolean;
}
```
### Predefined option answers
For a predefined choice:
- `value` = the provided option value
- `label` = the provided option label
- `wasCustom` = `false`
- `index` = 1-based index of the selected predefined option
### Custom answers via "Something else…"
For a typed answer:
- `value` = typed text
- `label` = typed text
- `wasCustom` = `true`
- `index` is omitted
This gives the agent consistent structured data while preserving user freedom.
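
The two answer variants can be built by a pair of small helpers. This is a sketch under the assumption that the UI tracks the selected option by 0-based position; `predefinedAnswer` and `customAnswer` are hypothetical names.

```typescript
interface QuestionOption {
  value: string;
  label: string;
  description?: string;
}

interface Answer {
  id: string;
  value: string;
  label: string;
  wasCustom: boolean;
  index?: number;
}

// Builds the answer for a predefined choice; `zeroBasedIndex` is the position in `options`.
function predefinedAnswer(id: string, options: QuestionOption[], zeroBasedIndex: number): Answer {
  const option = options[zeroBasedIndex];
  if (!option) {
    throw new RangeError(`No option at position ${zeroBasedIndex}`);
  }
  return {
    id,
    value: option.value,
    label: option.label,
    wasCustom: false,
    index: zeroBasedIndex + 1, // the result uses a 1-based index
  };
}

// Builds the answer for typed text; `index` is deliberately left out.
function customAnswer(id: string, text: string): Answer {
  return { id, value: text, label: text, wasCustom: true };
}
```

Omitting `index` for custom answers, rather than setting it to a sentinel value, means a consumer can distinguish the two cases either by `wasCustom` or by the absence of the field.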
## Rendering
The extension should provide readable tool renderers:
### `renderCall`
Show:
- tool name (`question`)
- question count
- short labels or summary where useful
### `renderResult`
Show:
- `Cancelled` when the user aborts
- one concise success line per answered question
- whether an answer was predefined or custom when helpful
The rendering should remain compact in normal use and not dump full raw JSON unless the default fallback is needed.
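
As an illustration of the compact `renderResult` output, the sketch below turns a result into plain summary lines. It covers only the text content; how pi's renderer hooks consume or style that text is not covered here. `summarizeResult` is a hypothetical helper, and the `label: answer (custom)` line format is an assumption.

```typescript
interface Question {
  id: string;
  label?: string;
  prompt: string;
}

interface Answer {
  id: string;
  value: string;
  label: string;
  wasCustom: boolean;
  index?: number;
}

interface QuestionResult {
  questions: Question[];
  answers: Answer[];
  cancelled: boolean;
}

// Produces one short line per answer, or a single "Cancelled" line.
function summarizeResult(result: QuestionResult): string[] {
  if (result.cancelled) {
    return ["Cancelled"];
  }
  return result.answers.map((answer) => {
    const position = result.questions.findIndex((q) => q.id === answer.id);
    // Prefer the short label, then the positional default (Q1, Q2, ...), then the raw id.
    const name =
      result.questions[position]?.label ?? (position >= 0 ? `Q${position + 1}` : answer.id);
    const suffix = answer.wasCustom ? " (custom)" : "";
    return `${name}: ${answer.label}${suffix}`;
  });
}
```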
## Error Handling
The tool should return structured results for expected user/runtime states instead of throwing.
### Non-interactive mode
If pi is running without interactive UI support:
- return a clear text result indicating UI is unavailable
- mark the interaction as `cancelled: true` in details
- do not crash the session
### Invalid input
If `questions` is empty:
- return a clear text result like `Error: No questions provided`
- include a structured details payload with `cancelled: true`
### User cancel
If the user cancels from the picker or review flow:
- return `cancelled: true`
- do not throw an exception
### Empty custom text
If the user enters free-text mode and submits an empty value:
- do not accept an empty answer
- keep the user in text-entry mode until they provide non-empty text or press `Esc`
- avoid returning meaningless blank answers to the model
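
The non-interactive and invalid-input cases can be handled by a guard that runs before any UI is shown. This is a sketch only: the `ToolResult` shape (`text` plus `details`) is inferred from the wording above, the `hasUI` flag is a stand-in for however pi reports interactive support, and the UI-unavailable message text is illustrative.

```typescript
interface Question {
  id: string;
  label?: string;
  prompt: string;
}

interface Answer {
  id: string;
  value: string;
  label: string;
  wasCustom: boolean;
  index?: number;
}

// Assumed result envelope: a human-readable text plus structured details.
interface ToolResult {
  text: string;
  details: { questions: Question[]; answers: Answer[]; cancelled: boolean };
}

function cancelledResult(text: string, questions: Question[]): ToolResult {
  return { text, details: { questions, answers: [], cancelled: true } };
}

// Returns an early result for expected failure states, or null when the UI flow may start.
// `hasUI` is a hypothetical flag supplied by the caller.
function precheck(questions: Question[], hasUI: boolean): ToolResult | null {
  if (!hasUI) {
    return cancelledResult("Error: UI not available (non-interactive mode)", questions);
  }
  if (questions.length === 0) {
    return cancelledResult("Error: No questions provided", questions);
  }
  return null;
}
```

Both failure paths return a value instead of throwing, so the session continues and the model still receives a `cancelled: true` payload it can act on.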
## File Structure
Implementation stays in one file unless complexity clearly justifies splitting later:
- Create: `/home/alex/dotfiles/.pi/agent/extensions/question.ts`
Internal sections inside the file should stay logically separated:
1. types and schemas
2. question normalization helpers
3. single-question UI flow
4. multi-question UI flow
5. tool registration
6. call/result rendering
## Loading and Usage
Because the file will live in an auto-discovered project extension directory, the expected activation flow is:
1. start pi from the dotfiles repo or a directory where the project extension is in scope
2. use `/reload` if pi is already running
3. allow the model to call `question` when clarification is needed
## Testing Strategy
No dedicated automated test harness is required for the first version.
Manual verification should cover:
1. **Single question, predefined answer**
   - tool returns selected option value/label
2. **Single question, custom answer**
   - selecting `Something else…` opens text entry and returns typed text
3. **Single question, cancel**
   - cancellation returns structured cancelled result
4. **Multi-question, all predefined**
   - step-through and final review work correctly
5. **Multi-question, mixed predefined/custom**
   - at least one typed answer and one predefined answer are preserved correctly
6. **Multi-question, edit before submit**
   - user can revisit and change answers before final submission
7. **Empty custom submission**
   - blank text is rejected and the user stays in text-entry mode until they enter text or press `Esc`
8. **Non-interactive mode**
   - tool returns a clear UI-unavailable result
## Non-Goals
The first version will not add:
- separate text-only question types
- nested conditional question trees
- validation rules beyond basic non-empty custom text handling
- persistence beyond normal pi session/tool result storage
- a separate `questionnaire` tool name
## Acceptance Criteria
The work is complete when:
1. `.pi/agent/extensions/question.ts` exists in this repo
2. pi discovers the extension via project auto-discovery
3. the agent has a single `question` tool
4. the tool supports both one-question and multi-question flows
5. every question automatically ends with `Something else…`
6. selecting `Something else…` allows direct typed input
7. results are structured and distinguish custom answers from predefined ones
8. cancel/error states return cleanly without crashing the session