Question Tool Design

Date: 2026-04-09
Project: /home/alex/dotfiles
Target file: .pi/agent/extensions/question.ts

Goal

Add a tracked pi extension that gives the agent a single question tool for asking one or more questions in interactive mode. Every question keeps a final user escape hatch: "Something else…" opens inline free-text entry when none of the listed options fits.

Context

  • Pi supports custom tools through TypeScript extensions placed in auto-discovered extension directories.
  • This dotfiles repo already tracks pi configuration under .pi/agent/.
  • The working extension directory .pi/agent/extensions/ is currently empty.
  • Pi's upstream examples already include:
    • a single-question question.ts example
    • a multi-question questionnaire.ts example
  • The requested tool should combine those use cases into one obvious agent-facing tool.

User-Approved Requirements

  1. The tool must be tracked in this repo at:
    • /home/alex/dotfiles/.pi/agent/extensions/question.ts
  2. The tool name should be:
    • question
  3. The tool must support both:
    • a single question
    • multiple questions in one interaction
  4. Every question is multiple-choice, but the UI must always append a final choice:
    • "Something else…"
  5. Choosing "Something else…" must allow direct user text entry.
  6. Question options should support machine-friendly values and user-facing labels:
    • { value, label, description? }
  7. This should be a unified tool, not separate question and questionnaire tools.

Implement a single extension modeled after pi's upstream question.ts and questionnaire.ts examples:

  • one registered tool: question
  • one parameter shape: questions: Question[]
  • one UI that adapts to question count:
    • single-question picker for questions.length === 1
    • multi-question review flow for questions.length > 1

This keeps the agent-facing API simple while still supporting richer user clarification flows.

Tool Contract

The extension will register a tool with this conceptual input shape:

{
  questions: Array<{
    id: string;
    label?: string;
    prompt: string;
    options: Array<{
      value: string;
      label: string;
      description?: string;
    }>;
  }>;
}

Field intent

  • id: stable identifier for the answer
  • label: short summary label for tabs/review UI; defaults to Q1, Q2, etc.
  • prompt: the full question shown to the user
  • options: predefined choices the model wants the user to pick from
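To make the contract concrete, the shape can be written as named TypeScript types with a sample payload. This is only a sketch: the type names (QuestionOption, Question, QuestionToolInput) and every id, value, and prompt in the example are invented for illustration, and the real extension may declare its schema differently.

```typescript
// Hypothetical type names mirroring the conceptual input shape.
interface QuestionOption {
  value: string; // machine-friendly value returned to the agent
  label: string; // user-facing text shown in the picker
  description?: string; // optional helper text
}

interface Question {
  id: string; // stable identifier for the answer
  label?: string; // short tab/review label; falls back to Q1, Q2, ...
  prompt: string; // full question text
  options: QuestionOption[];
}

interface QuestionToolInput {
  questions: Question[];
}

// Example payload; all ids, values, and wording are made up.
const exampleInput: QuestionToolInput = {
  questions: [
    {
      id: "shell",
      label: "Shell",
      prompt: "Which shell should the setup script target?",
      options: [
        { value: "zsh", label: "Zsh", description: "Interactive default" },
        { value: "bash", label: "Bash" },
      ],
    },
    {
      id: "editor", // no label given, so the UI would show "Q2"
      prompt: "Which editor config should be installed?",
      options: [
        { value: "nvim", label: "Neovim" },
        { value: "vscode", label: "VS Code" },
      ],
    },
  ],
};
```

Note that the payload never lists "Something else…" itself; the extension is expected to add it.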

Normalization rules

Before rendering the UI:

  1. Ensure at least one question exists.
  2. Ensure each question has a usable short label.
  3. Preserve the provided predefined options as-is.
  4. Append a final synthetic option to every question:
    • label: Something else…
    • behavior: switch into inline text entry
  5. Do not require the model to explicitly include the synthetic option.
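The five rules above could be implemented as one pure helper. The following is a minimal sketch, not the extension's actual code: normalizeQuestions, Choice, NormalizedQuestion, and NormalizeResult are hypothetical names, and returning an { ok: false } value for empty input is an assumption chosen to match the spec's preference for structured results over exceptions.

```typescript
interface QuestionOption { value: string; label: string; description?: string }
interface Question { id: string; label?: string; prompt: string; options: QuestionOption[] }

// A rendered choice is either a predefined option or the synthetic escape hatch.
type Choice =
  | { kind: "predefined"; option: QuestionOption; index: number } // index is 1-based
  | { kind: "custom"; label: string };

interface NormalizedQuestion { id: string; label: string; prompt: string; choices: Choice[] }

type NormalizeResult =
  | { ok: true; questions: NormalizedQuestion[] }
  | { ok: false; error: string };

const SOMETHING_ELSE_LABEL = "Something else…";

function normalizeQuestions(questions: Question[]): NormalizeResult {
  // Rule 1: at least one question must exist.
  if (questions.length === 0) {
    return { ok: false, error: "No questions provided" };
  }
  const normalized = questions.map((q, i): NormalizedQuestion => {
    // Rule 3: keep the predefined options untouched and in their original order.
    const predefined = q.options.map(
      (option, j): Choice => ({ kind: "predefined", option, index: j + 1 }),
    );
    return {
      id: q.id,
      // Rule 2: fall back to Q1, Q2, ... when the label is missing or blank.
      label: q.label?.trim() || `Q${i + 1}`,
      prompt: q.prompt,
      // Rules 4 and 5: the helper appends the synthetic option, not the model.
      choices: [...predefined, { kind: "custom", label: SOMETHING_ELSE_LABEL }],
    };
  });
  return { ok: true, questions: normalized };
}
```

Keeping normalization separate from the UI means both the single-question and multi-question flows can consume the same NormalizedQuestion list.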

Interaction Design

Single-question mode

When exactly one question is provided:

  • display the prompt
  • display numbered predefined options
  • automatically display the final appended option:
    • Something else…
  • selecting a predefined option completes the tool immediately
  • selecting Something else… opens inline free-text entry
  • Esc in the picker cancels the tool
  • Esc in text entry exits text entry and returns to the option list
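The key handling above amounts to a small state machine, sketched here independently of any real terminal UI. The state and event names are hypothetical, and how pi actually delivers key presses to an extension is not assumed. Rejecting blank text follows the spec's "Empty custom text" rule.

```typescript
type SingleState =
  | { mode: "picking" }
  | { mode: "typing" }
  | { mode: "answered"; optionIndex: number } // 1-based predefined option
  | { mode: "answeredCustom"; text: string }
  | { mode: "cancelled" };

type SingleEvent =
  | { type: "select"; position: number } // 1-based row in the rendered list
  | { type: "submitText"; text: string }
  | { type: "escape" };

// optionCount is the number of predefined options; "Something else…" sits at optionCount + 1.
function stepSingle(state: SingleState, event: SingleEvent, optionCount: number): SingleState {
  switch (state.mode) {
    case "picking":
      if (event.type === "escape") return { mode: "cancelled" }; // Esc in the picker cancels the tool
      if (event.type === "select") {
        if (event.position >= 1 && event.position <= optionCount) {
          return { mode: "answered", optionIndex: event.position }; // completes immediately
        }
        if (event.position === optionCount + 1) return { mode: "typing" }; // "Something else…"
      }
      return state;
    case "typing":
      if (event.type === "escape") return { mode: "picking" }; // back to the option list
      if (event.type === "submitText") {
        const text = event.text.trim();
        // Blank input is rejected; the user stays in text entry.
        return text.length > 0 ? { mode: "answeredCustom", text } : state;
      }
      return state;
    default:
      return state; // answered and cancelled states ignore further events
  }
}
```

Because the transition function is pure, the Esc and empty-text cases can be checked without driving a terminal.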

Multi-question mode

When multiple questions are provided:

  • show one question at a time
  • allow tab or left/right navigation between questions
  • append Something else… to every question
  • after answering one question, move to the next question
  • include a final review/submit step summarizing all current answers
  • allow navigating back to change answers before final submission
  • submit only from the review step

This provides a guided flow without requiring separate tools.
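The navigation rules can be sketched as a reducer in which the review step is one position past the last question. FlowState, FlowEvent, initialFlow, and stepFlow are hypothetical names. Whether submission should be blocked while questions remain unanswered is left open by the spec, so this sketch does not enforce it.

```typescript
interface FlowState {
  questionCount: number;
  step: number; // 0..questionCount-1 is a question; questionCount is the review step
  answers: (string | undefined)[];
  submitted: boolean;
}

type FlowEvent =
  | { type: "answer"; value: string }
  | { type: "next" } // tab or right arrow
  | { type: "prev" } // left arrow
  | { type: "submit" };

function initialFlow(questionCount: number): FlowState {
  return {
    questionCount,
    step: 0,
    answers: new Array<string | undefined>(questionCount).fill(undefined),
    submitted: false,
  };
}

function stepFlow(state: FlowState, event: FlowEvent): FlowState {
  if (state.submitted) return state;
  const onReview = state.step === state.questionCount;
  switch (event.type) {
    case "answer": {
      if (onReview) return state;
      const value = event.value;
      // Record the answer, then advance to the next question (or to review).
      const answers = state.answers.map((a, i) => (i === state.step ? value : a));
      return { ...state, answers, step: state.step + 1 };
    }
    case "next":
      return { ...state, step: Math.min(state.step + 1, state.questionCount) };
    case "prev":
      return { ...state, step: Math.max(state.step - 1, 0) };
    case "submit":
      // Submission is only honoured from the review step.
      return onReview ? { ...state, submitted: true } : state;
  }
}
```

Answering a question a second time simply overwrites the earlier value, which covers the "change answers before final submission" requirement.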

Answer Model

The tool result should always remain structured.

Conceptual result shape:

{
  questions: Question[];
  answers: Array<{
    id: string;
    value: string;
    label: string;
    wasCustom: boolean;
    index?: number;
  }>;
  cancelled: boolean;
}

Predefined option answers

For a predefined choice:

  • value = the provided option value
  • label = the provided option label
  • wasCustom = false
  • index = 1-based index of the selected predefined option

Custom answers via “Something else…”

For a typed answer:

  • value = typed text
  • label = typed text
  • wasCustom = true
  • index is omitted

This gives the agent consistent structured data while preserving user freedom.
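The two answer kinds can be built by a pair of small helpers, sketched below. The function names are hypothetical, and returning undefined for an out-of-range index is an assumption rather than something the spec prescribes.

```typescript
interface QuestionOption { value: string; label: string; description?: string }

interface Answer {
  id: string;
  value: string;
  label: string;
  wasCustom: boolean;
  index?: number; // 1-based; present only for predefined options
}

// Builds the answer for a predefined choice; index is the 1-based option position.
function answerFromOption(
  questionId: string,
  options: QuestionOption[],
  index: number,
): Answer | undefined {
  const option = options[index - 1];
  if (option === undefined) return undefined; // index out of range
  return { id: questionId, value: option.value, label: option.label, wasCustom: false, index };
}

// Builds the answer for text typed via "Something else…"; index is omitted.
function answerFromText(questionId: string, text: string): Answer {
  return { id: questionId, value: text, label: text, wasCustom: true };
}
```

For example, answerFromOption("shell", options, 2) yields the second option with index 2, while answerFromText("shell", "fish") yields value and label both set to "fish" and no index property.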

Rendering

The extension should provide readable tool renderers:

renderCall

Show:

  • tool name (question)
  • question count
  • short labels or summary where useful

renderResult

Show:

  • Cancelled when the user aborts
  • one concise success line per answered question
  • whether an answer was predefined or custom when helpful

The rendering should remain compact in normal use and not dump full raw JSON unless the default fallback is needed.
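As an illustration of a compact renderResult, the helper below turns a result into plain summary lines. It is a sketch only: summarizeResult is a hypothetical name, the "label: answer (custom)" line format is invented, and pi's actual renderer interface (which may return UI components rather than strings) is not assumed.

```typescript
interface Answer { id: string; value: string; label: string; wasCustom: boolean; index?: number }

interface QuestionResult {
  questions: { id: string; label?: string }[];
  answers: Answer[];
  cancelled: boolean;
}

// Returns one short line per answered question, or a single "Cancelled" line.
function summarizeResult(result: QuestionResult): string[] {
  if (result.cancelled) return ["Cancelled"];
  return result.answers.map((answer) => {
    const position = result.questions.findIndex((q) => q.id === answer.id);
    const question = position >= 0 ? result.questions[position] : undefined;
    // Prefer the question's short label, then Q1/Q2/..., then the raw id.
    const fallback = position >= 0 ? `Q${position + 1}` : answer.id;
    const shortLabel = question?.label?.trim() || fallback;
    const suffix = answer.wasCustom ? " (custom)" : "";
    return `${shortLabel}: ${answer.label}${suffix}`;
  });
}
```

With one labelled and one unlabelled question, this produces lines such as "Shell: Zsh" and "Q2: helix (custom)".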

Error Handling

The tool should return structured results for expected user/runtime states instead of throwing.

Non-interactive mode

If pi is running without interactive UI support:

  • return a clear text result indicating UI is unavailable
  • mark the interaction as cancelled: true in details
  • do not crash the session

Invalid input

If questions is empty:

  • return a clear text result like Error: No questions provided
  • include a structured details payload with cancelled: true

User cancel

If the user cancels from the picker or review flow:

  • return cancelled: true
  • do not throw an exception
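The non-interactive, invalid-input, and cancel cases all reduce to "return text plus cancelled: true in details", which the sketch below captures. ToolOutcome, precheck, and userCancelled are hypothetical names, the hasUI flag stands in for however pi reports interactive support, and the message strings other than "Error: No questions provided" are placeholders.

```typescript
interface Question { id: string; label?: string; prompt: string; options: { value: string; label: string }[] }
interface Answer { id: string; value: string; label: string; wasCustom: boolean; index?: number }

interface QuestionResult { questions: Question[]; answers: Answer[]; cancelled: boolean }

// Hypothetical outcome shape: human-readable text plus a structured details payload.
interface ToolOutcome { text: string; details: QuestionResult }

function cancelledOutcome(text: string, questions: Question[]): ToolOutcome {
  return { text, details: { questions, answers: [], cancelled: true } };
}

// Returns an early outcome for the expected failure states, or undefined when the UI flow can start.
function precheck(questions: Question[], hasUI: boolean): ToolOutcome | undefined {
  if (!hasUI) {
    return cancelledOutcome("Error: UI not available (non-interactive mode)", questions);
  }
  if (questions.length === 0) {
    return cancelledOutcome("Error: No questions provided", questions);
  }
  return undefined;
}

// Used when the user presses Esc in the picker or leaves the review flow.
function userCancelled(questions: Question[]): ToolOutcome {
  return cancelledOutcome("User cancelled", questions);
}
```

None of these paths throws, so the agent always receives a result it can inspect.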

Empty custom text

If the user enters free-text mode and submits an empty value:

  • do not accept an empty answer
  • keep the user in text-entry mode until they provide non-empty text or press Esc
  • avoid returning meaningless blank answers to the model

File Structure

Implementation stays in one file unless complexity clearly justifies splitting later:

  • Create: /home/alex/dotfiles/.pi/agent/extensions/question.ts

Internal sections inside the file should stay logically separated:

  1. types and schemas
  2. question normalization helpers
  3. single-question UI flow
  4. multi-question UI flow
  5. tool registration
  6. call/result rendering

Loading and Usage

Because the file will live in an auto-discovered project extension directory, the expected activation flow is:

  1. start pi from the dotfiles repo or a directory where the project extension is in scope
  2. use /reload if pi is already running
  3. allow the model to call question when clarification is needed

Testing Strategy

No dedicated automated test harness is required for the first version.

Manual verification should cover:

  1. Single question, predefined answer
    • tool returns selected option value/label
  2. Single question, custom answer
    • selecting Something else… opens text entry and returns typed text
  3. Single question, cancel
    • cancellation returns structured cancelled result
  4. Multi-question, all predefined
    • step-through and final review work correctly
  5. Multi-question, mixed predefined/custom
    • at least one typed answer and one predefined answer are preserved correctly
  6. Multi-question, edit before submit
    • user can revisit and change answers before final submission
  7. Empty custom submission
    • blank text is rejected and the user stays in text-entry mode until they type something or press Esc
  8. Non-interactive mode
    • tool returns a clear UI-unavailable result

Non-Goals

The first version will not add:

  • separate text-only question types
  • nested conditional question trees
  • validation rules beyond basic non-empty custom text handling
  • persistence beyond normal pi session/tool result storage
  • a separate questionnaire tool name

Acceptance Criteria

The work is complete when:

  1. .pi/agent/extensions/question.ts exists in this repo
  2. pi discovers the extension via project auto-discovery
  3. the agent has a single question tool
  4. the tool supports both one-question and multi-question flows
  5. every question automatically ends with Something else…
  6. selecting Something else… allows direct typed input
  7. results are structured and distinguish custom answers from predefined ones
  8. cancel/error states return cleanly without crashing the session