BREAKING CHANGE: remove Tavily, Firecrawl, provider fallback, and web-search-config. web_search and web_fetch now use Exa-shaped inputs and return raw Exa-style details.
# Exa-only rewrite for `pi-web-search`

- Status: approved design
- Date: 2026-04-12
- Project: `pi-web-search`
- Supersedes: `2026-04-12-firecrawl-design.md`

## Summary
Rewrite `pi-web-search` as an Exa-only package. Remove Tavily, Firecrawl, provider failover, and the interactive config command. Keep the two public tools, but make them Exa-shaped instead of provider-generic.

## Approved product decisions
- Keep only `web_search` and `web_fetch`.
- Support Exa’s non-streaming `search` and `getContents` functionality.
- Use a single Exa config instead of a provider list.
- Remove `web-search-config`.
- Return tool `details` close to raw Exa responses.
- Delete Tavily and Firecrawl code, tests, docs, and config paths completely.

## Goals
1. Make the package Exa-only.
2. Expose Exa-native request shapes for both tools.
3. Keep human-readable output compact while preserving raw Exa details.
4. Support config through `~/.pi/agent/web-search.json` and `EXA_API_KEY`.
5. Remove stale multi-provider abstractions and tests.

## Non-goals
- Expose Exa streaming APIs in this change.
- Expose Exa `answer`, `findSimilar`, research, monitors, websets, imports, or webhook APIs.
- Preserve the old provider-generic request contract.
- Preserve the interactive config command.

## Public tool contract
### `web_search`
Map directly to `exa.search(query, options)`.

Supported top-level fields include:
- `query`
- `type`
- `numResults`
- `includeDomains`
- `excludeDomains`
- `startCrawlDate`
- `endCrawlDate`
- `startPublishedDate`
- `endPublishedDate`
- `category`
- `includeText`
- `excludeText`
- `flags`
- `userLocation`
- `moderation`
- `useAutoprompt`
- `systemPrompt`
- `outputSchema`
- `additionalQueries`
- `contents`

Behavior notes:
- Exa search returns text contents by default when `contents` is omitted.
- `contents: false` is the metadata-only mode.
- `additionalQueries` is allowed only for deep search types.
- `includeText` and `excludeText` accept at most one phrase of up to 5 words.

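
The last two notes can be checked before a request is sent. The following is a minimal sketch, not the package's actual validation. `validateSearchInput` and its helpers are hypothetical names. It also assumes that a search type counts as "deep" when its name starts with `deep`, and that words are whitespace-separated tokens. Neither assumption is confirmed by this design.

```typescript
// Hypothetical subset of the `web_search` input; field names follow the list above.
interface SearchInput {
  query: string;
  type?: string;
  additionalQueries?: string[];
  includeText?: string[];
  excludeText?: string[];
}

// Assumption: a search type is "deep" when its name starts with "deep".
function isDeepType(type: string | undefined): boolean {
  return type !== undefined && type.indexOf("deep") === 0;
}

// Assumption: words are whitespace-separated tokens.
function countWords(phrase: string): number {
  return phrase.trim().split(/\s+/).filter((w) => w.length > 0).length;
}

// Records a problem when a phrase filter has more than one phrase or a phrase over 5 words.
function checkPhraseFilter(name: string, value: string[] | undefined, errors: string[]): void {
  if (value === undefined) return;
  if (value.length > 1) errors.push(`${name} accepts at most one phrase`);
  for (const phrase of value) {
    if (countWords(phrase) > 5) errors.push(`${name} phrases are limited to 5 words`);
  }
}

// Returns a list of problems; an empty list means the input passes these checks.
function validateSearchInput(input: SearchInput): string[] {
  const errors: string[] = [];
  if (input.additionalQueries !== undefined && !isDeepType(input.type)) {
    errors.push("additionalQueries is allowed only for deep search types");
  }
  checkPhraseFilter("includeText", input.includeText, errors);
  checkPhraseFilter("excludeText", input.excludeText, errors);
  return errors;
}
```

Returning a list of messages rather than throwing on the first problem lets the tool report every issue in one response.
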
### `web_fetch`
Map directly to `exa.getContents(urls, options)`.

Supported fields include:
- `urls`
- `text`
- `highlights`
- `summary`
- `context`
- `livecrawl`
- `livecrawlTimeout`
- `maxAgeHours`
- `filterEmptyResults`
- `subpages`
- `subpageTarget`
- `extras`

Behavior notes:
- No provider selection.
- No generic fallback behavior.
- No package-invented `textMaxCharacters`; use Exa `text.maxCharacters`.

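
To illustrate the mapping onto `getContents(urls, options)`, here is a minimal sketch of how a tool input could be split into the two positional arguments while rejecting the removed `textMaxCharacters` field. `toGetContentsArgs` and `FetchInput` are hypothetical names, and the error wording is an assumption.

```typescript
// Hypothetical subset of the `web_fetch` input; other Exa options pass through untouched.
interface FetchInput {
  urls: string[];
  text?: boolean | { maxCharacters?: number };
  [option: string]: unknown;
}

// Splits the tool input into the two positional arguments of getContents(urls, options).
function toGetContentsArgs(input: FetchInput): [string[], Record<string, unknown>] {
  // The package-invented field is rejected rather than silently translated.
  if ("textMaxCharacters" in input) {
    throw new Error("textMaxCharacters is not supported; use text.maxCharacters");
  }
  const { urls, ...options } = input;
  return [urls, options];
}
```

A caller that previously sent `textMaxCharacters: 2000` would now send `text: { maxCharacters: 2000 }`.
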
## Config model
Use a single config object:

```json
{
  "apiKey": "exa_...",
  "baseUrl": "https://api.exa.ai"
}
```

Rules:
- `apiKey` is required unless `EXA_API_KEY` is set.
- `baseUrl` is optional.
- Legacy multi-provider configs should fail with a migration hint.
- Missing config file is allowed when `EXA_API_KEY` is present.

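
The rules above could be sketched as a single resolution function. This is an illustration under stated assumptions, not the package's `src/config.ts`. `resolveExaConfig` is a hypothetical name. The sketch assumes that legacy configs are recognizable by a `providers` key, that the file's `apiKey` takes precedence over `EXA_API_KEY`, and that a valid `baseUrl` starts with `http://` or `https://`.

```typescript
interface ExaConfig {
  apiKey: string;
  baseUrl?: string;
}

// `fileConfig` is the parsed JSON from the config file, or undefined when the file is missing.
// `env` is passed in rather than read from the process so the sketch stays testable.
function resolveExaConfig(
  fileConfig: Record<string, unknown> | undefined,
  env: Record<string, string | undefined>,
): ExaConfig {
  const raw = fileConfig ?? {};

  // Assumption: legacy multi-provider configs carry a `providers` key.
  if ("providers" in raw) {
    throw new Error(
      'Legacy multi-provider config detected; replace it with { "apiKey": "..." } or set EXA_API_KEY.',
    );
  }

  // Assumption: the file value wins over the environment variable.
  const fileKey = raw.apiKey;
  const apiKey = typeof fileKey === "string" && fileKey !== "" ? fileKey : env.EXA_API_KEY;
  if (!apiKey) {
    throw new Error("Missing Exa API key: set apiKey in the config file or EXA_API_KEY.");
  }

  const baseUrl = raw.baseUrl;
  if (baseUrl === undefined) return { apiKey };
  if (typeof baseUrl !== "string" || !/^https?:\/\//.test(baseUrl)) {
    throw new Error("baseUrl must be an http(s) URL");
  }
  return { apiKey, baseUrl };
}
```

Passing `undefined` for `fileConfig` models the missing-file case, which succeeds only when `env.EXA_API_KEY` is set.
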
## Runtime design
Keep runtime small:
1. load Exa config
2. create Exa client
3. delegate to `search` or `getContents`
4. return raw Exa response

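
The four steps above could look roughly like the sketch below. It is a sketch only: `createRuntime`, `RuntimeDeps`, and `ExaLikeClient` are hypothetical names, and the client is injected so the example does not depend on the real Exa SDK. The type parameter `R` stands for whatever the client returns, which in practice would be a promise of the raw Exa response.

```typescript
interface ExaConfig {
  apiKey: string;
  baseUrl?: string;
}

// Hypothetical minimal client surface; only the two delegated methods are modeled.
interface ExaLikeClient<R> {
  search(query: string, options?: object): R;
  getContents(urls: string[], options?: object): R;
}

interface RuntimeDeps<R> {
  loadConfig(): ExaConfig;
  createClient(config: ExaConfig): ExaLikeClient<R>;
}

function createRuntime<R>(deps: RuntimeDeps<R>) {
  // Steps 1 and 2: load the config and build a client for each call.
  const client = (): ExaLikeClient<R> => deps.createClient(deps.loadConfig());
  return {
    // Steps 3 and 4: delegate and hand back the client's response untouched.
    search: (query: string, options?: object): R => client().search(query, options),
    fetch: (urls: string[], options?: object): R => client().getContents(urls, options),
  };
}
```

Injecting `loadConfig` and `createClient` keeps the runtime free of provider registries and makes raw delegation easy to assert in tests.
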
Remove:
- provider registry
- provider capabilities
- fallback graph execution
- execution attempt metadata

## Formatting
- Human-readable output should say `via Exa`.
- Tool `details` should stay close to raw Exa responses.
- Search output should show `output.content` when present.
- Fetch/search text should still be truncated in package formatting for readability.

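
As an illustration of these formatting rules, the sketch below renders a compact search summary. `formatSearch`, the response types, the exact header wording, and the 300-character limit are all assumptions made for the example. Only `via Exa`, `output.content`, and truncation come from this design.

```typescript
// Hypothetical subset of an Exa search response; only the fields used below are typed.
interface SearchResultLike {
  title?: string | null;
  url: string;
  text?: string;
}

interface SearchResponseLike {
  results: SearchResultLike[];
  output?: { content?: unknown };
}

// Assumption: 300 characters is an arbitrary readability limit, not a package constant.
const MAX_TEXT_CHARS = 300;

function truncate(text: string, max: number): string {
  return text.length <= max ? text : text.slice(0, max) + "…";
}

// Builds the human-readable text only; the raw response is left untouched for `details`.
function formatSearch(response: SearchResponseLike): string {
  const lines: string[] = [`Found ${response.results.length} result(s) via Exa`];
  const content = response.output?.content;
  if (content !== undefined && content !== null) {
    lines.push(typeof content === "string" ? content : JSON.stringify(content));
  }
  response.results.forEach((result, index) => {
    lines.push(`${index + 1}. ${result.title ?? "(untitled)"} (${result.url})`);
    if (result.text) lines.push(`   ${truncate(result.text, MAX_TEXT_CHARS)}`);
  });
  return lines.join("\n");
}
```

Truncation applies only to the rendered string, so the tool can still return the full response object as `details`.
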
## Files expected to change
- `index.ts`
- `src/config.ts`
- `src/schema.ts`
- `src/runtime.ts`
- `src/providers/exa.ts`
- `src/tools/web-search.ts`
- `src/tools/web-fetch.ts`
- `src/format.ts`
- `README.md`
- tests under `src/`
- package metadata and agent docs

## Testing strategy
1. Config tests for single Exa config, env fallback, invalid `baseUrl`, and legacy-config rejection.
2. Exa adapter tests for option pass-through and client construction.
3. Runtime tests for raw Exa delegation.
4. Tool tests for Exa-shaped normalization and validation.
5. Formatting tests for compact Exa output.
6. Manifest/README tests for Exa-only packaging.

## Acceptance criteria
- No Tavily or Firecrawl runtime/config/tool paths remain.
- `web_search` and `web_fetch` are Exa-shaped.
- `web-search-config` is removed.
- Config supports file or `EXA_API_KEY`.
- Tests pass.