Web Search Tools Design
Date: 2026-04-09
Project: /home/alex/dotfiles
Target files:
- .pi/agent/extensions/web-search/package.json
- .pi/agent/extensions/web-search/index.ts
- .pi/agent/extensions/web-search/src/schema.ts
- .pi/agent/extensions/web-search/src/config.ts
- .pi/agent/extensions/web-search/src/providers/types.ts
- .pi/agent/extensions/web-search/src/providers/exa.ts
- .pi/agent/extensions/web-search/src/tools/web-search.ts
- .pi/agent/extensions/web-search/src/tools/web-fetch.ts
- .pi/agent/extensions/web-search/src/format.ts
- tests alongside the new modules
Goal
Add two generic pi tools, web_search and web_fetch, implemented as a modular extension package that uses Exa as the first provider while keeping the internal design extensible for future providers.
Context
- This dotfiles repo already tracks pi configuration under .pi/agent/.
- The current extension workspace contains a tracked question extension and small pure helper tests.
- Pi extensions can be packaged as directories with index.ts and their own package.json, which is the best fit when third-party dependencies are needed.
- The requested feature is explicitly about pi extensions and custom tools, not built-in model providers.
- The user wants:
  - generic tool names now
  - Exa as the first provider
  - configuration read from a separate global file, not settings.json
  - configuration stored only at the global scope
User-Approved Requirements
- Add two generic tools:
  - web_search
  - web_fetch
- Use Exa as the initial provider.
- Keep the implementation extensible so other providers can be added later.
- Do not read configuration from environment variables.
- Do not read configuration from settings.json.
- Read configuration from a dedicated global file:
~/.pi/agent/web-search.json
- Use a provider-list-based config shape, not a single-provider-only schema.
- Store credentials as literal values in that config file.
- web_search should return metadata only by default.
- web_fetch should accept one URL or multiple URLs.
- web_fetch should return text by default.
- The implementation direction should be the modular/package-style structure, not the minimal Exa-shaped shortcut.
Recommended Architecture
Implement the feature as a dedicated extension package at:
/home/alex/dotfiles/.pi/agent/extensions/web-search/
This package will register two generic tools and route both through a provider registry. At runtime, the extension loads ~/.pi/agent/web-search.json, validates it, normalizes the provider list into an internal lookup map, resolves the configured default provider, and then executes requests through a provider adapter.
For the first version, the only adapter is Exa. However, the tool-facing layer remains provider-agnostic, so future providers only need to implement the shared provider interface and be added to config validation/registry wiring.
This is intentionally more structured than a single-file Exa wrapper because the user explicitly wants future extensibility without changing tool names or reworking the public API later.
File Structure
Extension package
- Create: /home/alex/dotfiles/.pi/agent/extensions/web-search/package.json
  - declares the extension package
  - declares exa-js as a dependency
  - points pi at the extension entrypoint
- Create: /home/alex/dotfiles/.pi/agent/extensions/web-search/index.ts
  - extension entrypoint
  - registers web_search and web_fetch
  - wires together config loading, provider registry, tool handlers, and shared formatting
Shared schemas and config
- Create: /home/alex/dotfiles/.pi/agent/extensions/web-search/src/schema.ts
  - TypeBox schemas for tool parameters
  - TypeBox schemas for web-search.json
  - shared TypeScript types derived from the schemas where useful
- Create: /home/alex/dotfiles/.pi/agent/extensions/web-search/src/config.ts
  - reads ~/.pi/agent/web-search.json
  - validates config shape
  - normalizes provider list into an internal map keyed by provider name
  - resolves default provider
Provider abstraction
- Create: /home/alex/dotfiles/.pi/agent/extensions/web-search/src/providers/types.ts
  - generic request and response types for search/fetch
  - provider interface used by the tool layer
  - normalized internal result shapes independent of Exa SDK types
- Create: /home/alex/dotfiles/.pi/agent/extensions/web-search/src/providers/exa.ts
  - Exa-backed implementation of the provider interface
  - translates generic search requests into Exa search(...)
  - translates generic fetch requests into Exa getContents(...)
  - isolates all Exa-specific request/response details
Tool handlers and formatting
- Create: /home/alex/dotfiles/.pi/agent/extensions/web-search/src/tools/web-search.ts
  - web_search schema, execution logic, and tool rendering helpers
- Create: /home/alex/dotfiles/.pi/agent/extensions/web-search/src/tools/web-fetch.ts
  - web_fetch schema, execution logic, and tool rendering helpers
- Create: /home/alex/dotfiles/.pi/agent/extensions/web-search/src/format.ts
  - shared output shaping
  - compact text summaries for the LLM
  - truncation behavior for large results
  - per-result formatting for batch fetches and partial failures
Config File Design
The extension will read exactly one file:
~/.pi/agent/web-search.json
Initial conceptual shape:
{
  "defaultProvider": "exa-main",
  "providers": [
    {
      "name": "exa-main",
      "type": "exa",
      "apiKey": "exa_...",
      "options": {
        "defaultSearchLimit": 5,
        "defaultFetchTextMaxCharacters": 12000
      }
    }
  ]
}
Config rules
- defaultProvider must match one provider entry by name.
- providers must be a non-empty array.
- Each provider entry must include:
  - name
  - type
  - apiKey
- apiKey is a literal string in the first version.
- type is validated so the runtime can select the correct adapter.
- Exa-specific defaults may live under options, but they must remain optional.
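The rules above could be enforced by a small pure function that takes the parsed JSON and returns the lookup map plus the resolved default. This is a minimal sketch under assumptions: it uses hand-written checks in place of the planned TypeBox schemas so it stays self-contained, the names normalizeConfig, ProviderEntry, and NormalizedConfig are hypothetical, and rejecting duplicate provider names is an added assumption rather than a stated rule.

```typescript
// Hypothetical types mirroring the conceptual config shape.
interface ProviderEntry {
  name: string;
  type: string;
  apiKey: string;
  options?: Record<string, unknown>;
}

interface NormalizedConfig {
  defaultProvider: ProviderEntry;
  providersByName: Map<string, ProviderEntry>;
}

// Only Exa is supported in the first version.
const KNOWN_TYPES = new Set(["exa"]);

function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.trim().length > 0;
}

// Validates a parsed JSON value and builds a map keyed by provider name.
function normalizeConfig(raw: unknown): NormalizedConfig {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("web-search.json must contain a JSON object");
  }
  const { defaultProvider, providers } = raw as Record<string, unknown>;
  if (!Array.isArray(providers) || providers.length === 0) {
    throw new Error("providers must be a non-empty array");
  }

  const providersByName = new Map<string, ProviderEntry>();
  for (const entry of providers) {
    const { name, type, apiKey, options } = (entry ?? {}) as Record<string, unknown>;
    if (!isNonEmptyString(name)) {
      throw new Error("each provider entry needs a name");
    }
    if (!isNonEmptyString(type) || !KNOWN_TYPES.has(type)) {
      throw new Error(`provider "${name}" has an unknown type`);
    }
    if (!isNonEmptyString(apiKey)) {
      throw new Error(`provider "${name}" is missing a literal apiKey`);
    }
    // Assumption: names must be unique so defaultProvider matches exactly one entry.
    if (providersByName.has(name)) {
      throw new Error(`duplicate provider name "${name}"`);
    }
    const normalizedOptions =
      typeof options === "object" && options !== null
        ? (options as Record<string, unknown>)
        : undefined;
    providersByName.set(name, { name, type, apiKey, options: normalizedOptions });
  }

  if (!isNonEmptyString(defaultProvider)) {
    throw new Error("defaultProvider must be a provider name");
  }
  const resolved = providersByName.get(defaultProvider);
  if (!resolved) {
    throw new Error(`defaultProvider "${defaultProvider}" does not match any provider entry`);
  }
  return { defaultProvider: resolved, providersByName };
}
```

Keeping this function free of file I/O means the file read in config.ts can stay a thin wrapper, and the validation itself can be tested with plain objects.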
Config non-goals
The first version will not:
- read provider config from project-local files
- merge config from multiple files
- read credentials from env vars
- support shell-command-based credential resolution
- write or edit web-search.json automatically
If the file is missing or invalid, the tools should return a clear error telling the user where the file belongs and showing a minimal valid example.
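One way to produce such an error is a single helper that every config failure path calls. The sketch below is illustrative only: formatConfigError, CONFIG_PATH, and MINIMAL_EXAMPLE are hypothetical names, and the exact wording of the message is an assumption.

```typescript
const CONFIG_PATH = "~/.pi/agent/web-search.json";

// Minimal valid example, reduced from the conceptual config shape.
const MINIMAL_EXAMPLE = {
  defaultProvider: "exa-main",
  providers: [{ name: "exa-main", type: "exa", apiKey: "exa_..." }],
};

// Builds a message that names the problem, the expected location, and a starting point.
function formatConfigError(reason: string): string {
  return [
    `web-search config error: ${reason}`,
    `Expected config file: ${CONFIG_PATH}`,
    "Minimal valid example:",
    JSON.stringify(MINIMAL_EXAMPLE, null, 2),
  ].join("\n");
}
```

A caller might pass a reason such as "file not found" or "invalid JSON" and return the resulting string as the tool's error content.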
Tool Contract
web_search
Purpose: search the web and return result metadata with a generic surface that can outlive Exa.
Conceptual input shape:
{
  query: string;
  limit?: number;
  includeDomains?: string[];
  excludeDomains?: string[];
  startPublishedDate?: string;
  endPublishedDate?: string;
  category?: string;
  provider?: string;
}
Default behavior
- returns metadata only
- does not fetch page text by default
- uses the default configured provider unless provider explicitly selects another configured provider
Result shape intent
Each search result should preserve a normalized subset of provider output such as:
- title
- url
- publishedDate
- author
- score
- provider-specific stable identifiers only if useful for follow-up operations
The tool’s text output should stay compact and easy for the model to scan.
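A compact, metadata-only rendering of the normalized results might look like the sketch below. The SearchResult type and formatSearchResults function are hypothetical, and the exact layout (numbered title, indented URL, optional metadata line) is an assumption about what "compact" could mean here.

```typescript
// Hypothetical normalized result, limited to the fields listed in the result shape.
interface SearchResult {
  title?: string;
  url: string;
  publishedDate?: string;
  author?: string;
  score?: number;
}

// Renders one numbered entry per result: title, URL, then any available metadata.
function formatSearchResults(results: SearchResult[]): string {
  if (results.length === 0) return "No results.";
  return results
    .map((result, index) => {
      const scoreText =
        result.score !== undefined ? `score ${result.score.toFixed(2)}` : undefined;
      const meta = [result.publishedDate, result.author, scoreText].filter(
        (part): part is string => typeof part === "string" && part.length > 0,
      );
      const lines = [`${index + 1}. ${result.title ?? "(untitled)"}`, `   ${result.url}`];
      if (meta.length > 0) lines.push(`   ${meta.join(" | ")}`);
      return lines.join("\n");
    })
    .join("\n");
}
```

No page text appears in this output, which matches the metadata-only default.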
web_fetch
Purpose: fetch contents for one or more URLs with a generic interface.
Conceptual input shape:
{
  urls: string[];
  text?: boolean;
  highlights?: boolean;
  summary?: boolean;
  textMaxCharacters?: number;
  provider?: string;
}
Input normalization
The canonical tool shape is urls: string[], where a single URL is represented as a one-element array. For robustness, the implementation may also accept a top-level url string through argument normalization and fold it into urls, but the stable contract exposed in schemas and docs should remain urls: string[].
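The folding step described above could be sketched as follows. normalizeFetchUrls is a hypothetical name, and trimming whitespace and dropping empty strings are assumptions about what "normalization" should cover.

```typescript
// Accepts the canonical `urls` array and, for robustness, a top-level `url` string.
// Always returns the canonical form: a string array.
function normalizeFetchUrls(raw: { url?: unknown; urls?: unknown }): string[] {
  const collected: string[] = [];
  if (typeof raw.url === "string") {
    collected.push(raw.url);
  }
  if (Array.isArray(raw.urls)) {
    for (const item of raw.urls) {
      if (typeof item === "string") collected.push(item);
    }
  }
  // Assumption: surrounding whitespace is not significant and blank entries are dropped.
  return collected.map((value) => value.trim()).filter((value) => value.length > 0);
}
```

An empty result from this function corresponds to the "empty URL list after normalization" input error.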
Default behavior
- when no content mode is specified, fetch text
- batch requests are allowed
- the default configured provider is used unless overridden
Result shape intent
Each fetched item should preserve normalized per-URL results, including:
- url
- title where available
- text by default
- optional highlights
- optional summary
- per-item failure details for partial batch failures
Provider Abstraction
The provider interface should express the minimum shared behaviors needed by the tools:
interface WebSearchProvider {
  type: string;
  search(request: NormalizedSearchRequest): Promise<NormalizedSearchResponse>;
  fetch(request: NormalizedFetchRequest): Promise<NormalizedFetchResponse>;
}
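To show how the tool layer could stay provider-agnostic, here is a sketch of a registry keyed by the config entry's type, with a stand-in provider that makes no network calls. The request and response shapes are deliberately minimal assumptions, and registerProvider, createProvider, and the "fake" type are hypothetical.

```typescript
// Minimal assumed shapes; the real normalized types would carry more fields.
interface NormalizedSearchRequest { query: string; limit: number; }
interface NormalizedSearchResponse { results: Array<{ title?: string; url: string }>; }
interface NormalizedFetchRequest { urls: string[]; }
interface NormalizedFetchResponse { items: Array<{ url: string; text?: string; error?: string }>; }

interface WebSearchProvider {
  type: string;
  search(request: NormalizedSearchRequest): Promise<NormalizedSearchResponse>;
  fetch(request: NormalizedFetchRequest): Promise<NormalizedFetchResponse>;
}

type ProviderFactory = (apiKey: string) => WebSearchProvider;

// Registry keyed by provider type; adding a provider means registering one factory.
const factories = new Map<string, ProviderFactory>();

function registerProvider(type: string, factory: ProviderFactory): void {
  factories.set(type, factory);
}

function createProvider(type: string, apiKey: string): WebSearchProvider {
  const factory = factories.get(type);
  if (!factory) {
    throw new Error(`no adapter registered for provider type "${type}"`);
  }
  return factory(apiKey);
}

// Stand-in provider for illustration only; it echoes its input.
registerProvider("fake", () => ({
  type: "fake",
  async search(request) {
    return { results: [{ title: request.query, url: "https://example.com/" }] };
  },
  async fetch(request) {
    return { items: request.urls.map((url) => ({ url, text: "" })) };
  },
}));
```

With this shape, the tool handlers only ever call createProvider and the two interface methods, so nothing Exa-specific leaks into them.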
Exa adapter responsibilities
The Exa adapter will:
- instantiate an Exa client from the configured literal API key
- use Exa search without contents for web_search default behavior
- use Exa getContents(...) for web_fetch
- map Exa response fields into normalized provider-agnostic result types
- keep Exa-only fields contained inside the adapter unless they are intentionally promoted into the shared result model later
This keeps future provider additions focused: implement the same interface, extend config validation, and register the adapter.
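The field-mapping responsibility can be illustrated with a pure function. The ExaLikeResult type below is an assumed subset of what an Exa search result contains, not the SDK's own type, so the field names should be checked against the installed exa-js version; mapExaResult and providerId are hypothetical names.

```typescript
// Assumed subset of an Exa search result (not imported from exa-js).
interface ExaLikeResult {
  id: string;
  url: string;
  title?: string | null;
  publishedDate?: string;
  author?: string | null;
  score?: number;
}

// Provider-agnostic result used by the tool layer.
interface NormalizedSearchResult {
  url: string;
  title?: string;
  publishedDate?: string;
  author?: string;
  score?: number;
  providerId?: string;
}

// Copies the shared fields and converts nulls to undefined so callers see one "absent" value.
function mapExaResult(result: ExaLikeResult): NormalizedSearchResult {
  return {
    url: result.url,
    title: result.title ?? undefined,
    publishedDate: result.publishedDate,
    author: result.author ?? undefined,
    score: result.score,
    providerId: result.id,
  };
}
```

Because the function takes plain data, the Exa response to normalized response mapping can be tested without a client or network access.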
Rendering and Output Design
The extension should provide compact tool rendering so calls and results are readable inside pi.
renderCall
- web_search: show tool name and the query
- web_fetch: show tool name and URL count (or the single URL)
renderResult
- web_search: show result count and a short numbered list of titles/URLs
- web_fetch: show fetched count, failed count if any, and a concise per-URL summary
LLM-facing text output
The text returned to the model should be concise and predictable:
- search: compact metadata list only by default
- fetch: truncated text payloads with enough context to be useful
- batch fetch: clearly separated per-URL sections
Large outputs must be truncated, following the shared truncation-utility pattern used by the pi tool examples.
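As a stand-in for that pattern, the following sketch shows character-based truncation with an explicit marker. It is not pi's own utility: truncateText and its return shape are hypothetical, and cutting on a character count (rather than lines or bytes) is an assumption chosen to line up with textMaxCharacters.

```typescript
interface TruncationResult {
  text: string;
  truncated: boolean;
  omittedCharacters: number;
}

// Keeps at most maxCharacters of the input and appends a marker when anything was cut.
function truncateText(text: string, maxCharacters: number): TruncationResult {
  if (!Number.isInteger(maxCharacters) || maxCharacters < 0) {
    throw new RangeError("maxCharacters must be a non-negative integer");
  }
  if (text.length <= maxCharacters) {
    return { text, truncated: false, omittedCharacters: 0 };
  }
  const omittedCharacters = text.length - maxCharacters;
  return {
    text: `${text.slice(0, maxCharacters)}\n[truncated ${omittedCharacters} characters]`,
    truncated: true,
    omittedCharacters,
  };
}
```

Returning the truncated flag and omitted count alongside the text lets the renderer mention truncation without re-measuring the payload.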
Error Handling
Expected runtime failures should be handled cleanly and descriptively.
Config errors
- missing ~/.pi/agent/web-search.json
- invalid JSON
- schema mismatch
- empty provider list
- unknown defaultProvider
- unknown explicitly requested provider
- missing literal API key
These should return actionable errors naming the exact issue.
Input errors
- empty search query
- malformed URL(s)
- empty URL list after normalization
These should be rejected before any provider request is made.
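A pre-flight validation step along these lines could run before the provider is touched. validateQuery and validateUrls are hypothetical names, and restricting URLs to http and https is an assumption the design does not state.

```typescript
// Rejects blank queries and returns the trimmed query.
function validateQuery(query: string): string {
  const trimmed = query.trim();
  if (trimmed.length === 0) {
    throw new Error("query must not be empty");
  }
  return trimmed;
}

// Rejects empty lists and malformed URLs; returns the URLs in serialized form.
function validateUrls(urls: string[]): string[] {
  if (urls.length === 0) {
    throw new Error("at least one URL is required");
  }
  return urls.map((candidate) => {
    let parsed: URL;
    try {
      parsed = new URL(candidate);
    } catch {
      throw new Error(`malformed URL: ${candidate}`);
    }
    // Assumption: only web URLs make sense for a fetch tool.
    if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
      throw new Error(`unsupported URL scheme: ${candidate}`);
    }
    return parsed.toString();
  });
}
```

Throwing here, before any request is built, keeps input errors distinct from provider and runtime errors.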
Provider/runtime errors
- Exa authentication failures
- network failures
- rate limits
- unexpected response shapes
These should return a concise summary in tool content while preserving richer diagnostics in details.
Partial failures
For batch web_fetch, mixed outcomes should not fail the entire request unless every target fails. Successful pages should still be returned together with per-URL failure entries.
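The "fail only if every target fails" rule can be expressed as a small aggregation over per-URL outcomes. The FetchOutcome and BatchSummary types and the summarizeBatch function are hypothetical names used only for this sketch.

```typescript
// Per-URL outcome: either fetched text or a failure description.
type FetchOutcome =
  | { url: string; ok: true; text: string }
  | { url: string; ok: false; error: string };

interface BatchSummary {
  items: FetchOutcome[];
  fetched: number;
  failed: number;
  isError: boolean;
}

// Counts successes and failures; the batch as a whole is an error only when nothing succeeded.
function summarizeBatch(items: FetchOutcome[]): BatchSummary {
  const fetched = items.filter((item) => item.ok).length;
  const failed = items.length - fetched;
  return {
    items,
    fetched,
    failed,
    isError: items.length > 0 && fetched === 0,
  };
}
```

Successful pages and failure entries stay in the same ordered list, so the formatter can emit one section per URL regardless of outcome.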
Testing Strategy
The design intentionally separates pure logic from pi wiring so most behavior can be tested without loading pi itself.
Automated tests
Cover:
- config parsing and normalization
- provider-list validation
- default-provider resolution
- generic request → Exa request mapping
- Exa response → normalized response mapping
- compact formatting for metadata-only search
- truncation for long fetch results
- batch fetch formatting with partial failures
- helpful error messages when config is absent or invalid
Test style
- prefer pure module tests for config, normalization, and formatting
- inject a fake Exa-like client into the Exa adapter instead of making live network calls
- keep extension entrypoint tests to smoke coverage only
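Injecting a fake client could look like the sketch below. The ExaLikeClient interface is an assumption about the narrow slice of the Exa client the adapter needs (a search method taking a query and a numResults option); it should be checked against the installed exa-js version. createExaAdapter and the recording array are hypothetical.

```typescript
// Assumed minimal client surface; the real Exa client would satisfy this structurally.
interface ExaLikeClient {
  search(
    query: string,
    options: { numResults: number },
  ): Promise<{ results: Array<{ url: string; title?: string | null }> }>;
}

// The adapter depends on the interface, not on a concrete SDK class.
function createExaAdapter(client: ExaLikeClient) {
  return {
    type: "exa",
    async search(request: { query: string; limit: number }) {
      const response = await client.search(request.query, { numResults: request.limit });
      return {
        results: response.results.map((r) => ({ url: r.url, title: r.title ?? undefined })),
      };
    },
  };
}

// Fake client: records each call and returns canned data, with no network access.
const recordedCalls: Array<{ query: string; numResults: number }> = [];
const fakeClient: ExaLikeClient = {
  async search(query, options) {
    recordedCalls.push({ query, numResults: options.numResults });
    return { results: [{ url: "https://example.com/", title: "Example" }] };
  },
};

const adapterUnderTest = createExaAdapter(fakeClient);
```

A test can then call adapterUnderTest.search(...) and assert both on recordedCalls (request mapping) and on the returned results (response mapping).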
Manual verification
After implementation:
- create ~/.pi/agent/web-search.json
- reload pi
- run one web_search call
- run one single-URL web_fetch call
- run one multi-URL web_fetch call
- confirm missing/invalid config errors are readable
Non-Goals
The first version will not add:
- other providers besides Exa
- project-local web-search config
- automatic setup commands or interactive config editing
- provider-specific passthrough options in the public tool API
- rich snippet/highlight defaults for search
- live network integration tests in the normal automated suite
Acceptance Criteria
The work is complete when:
- pi discovers a new extension package at .pi/agent/extensions/web-search/
- the agent has two generic tools:
  - web_search
  - web_fetch
- the implementation uses an internal provider abstraction
- Exa is the first working provider implementation
- the runtime reads global config from ~/.pi/agent/web-search.json
- config uses a provider-list shape with a default provider selector
- credentials are read as literal values from that file
- web_search returns metadata only by default
- web_fetch accepts one or multiple URLs and returns text by default
- missing config, invalid config, and provider failures return clean, actionable tool errors
- core mapping/formatting/config logic is covered by automated tests