feat!: make web search Exa-only

BREAKING CHANGE: remove Tavily, Firecrawl, provider fallback,
and web-search-config. web_search and web_fetch now use
Exa-shaped inputs and return raw Exa-style details.
pi
2026-04-12 11:11:36 +01:00
parent cfd1558522
commit 37b24694a0
31 changed files with 1101 additions and 3436 deletions

139
README.md

@@ -1,6 +1,6 @@
# pi-web-search
`pi-web-search` is a Pi extension package that adds `web_search` and `web_fetch` tools backed by pluggable providers such as Exa, Tavily, and Firecrawl.
`pi-web-search` is a Pi extension package that adds Exa-backed `web_search` and `web_fetch` tools.
## Install
@@ -22,76 +22,89 @@ pi install https://gitea.rwiesner.com/pi/pi-web-search
## Configuration
Provider configuration is managed by the extension's own commands and config files.
Example `~/.pi/agent/web-search.json`:
Set `EXA_API_KEY`, or create `~/.pi/agent/web-search.json`:
```json
{
"defaultProvider": "firecrawl-main",
"providers": [
{
"name": "firecrawl-main",
"type": "firecrawl",
"apiKey": "fc-...",
"fallbackProviders": ["exa-fallback"]
},
{
"name": "exa-fallback",
"type": "exa",
"apiKey": "exa_..."
}
]
}
```
Self-hosted Firecrawl:
```json
{
"defaultProvider": "firecrawl-selfhosted",
"providers": [
{
"name": "firecrawl-selfhosted",
"type": "firecrawl",
"baseUrl": "https://firecrawl.internal.example/v2"
}
]
}
```
Tool examples:
```json
{
"query": "pi docs",
"provider": "firecrawl-main",
"firecrawl": {
"country": "DE",
"categories": ["github"],
"scrapeOptions": {
"formats": ["markdown"]
}
}
}
```
```json
{
"urls": ["https://pi.dev"],
"provider": "firecrawl-main",
"summary": true,
"firecrawl": {
"formats": ["markdown", "summary", "images"]
}
"apiKey": "exa_...",
"baseUrl": "https://api.exa.ai"
}
```
Notes:
- Firecrawl self-hosted providers may omit `apiKey` when `baseUrl` is set.
- Firecrawl does not support generic `highlights`; use Firecrawl `formats` such as `markdown`, `summary`, and `images` instead.
- `apiKey` is required unless `EXA_API_KEY` is set.
- `baseUrl` is optional.
- Older multi-provider configs are no longer supported.
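The rules in these notes can be sketched in TypeScript. This is a minimal illustration, not the extension's actual code: `resolveExaConfig`, `WebSearchFileConfig`, and the error messages are hypothetical names. It assumes `EXA_API_KEY` wins over the file when both are present (the notes do not state an order) and that an old config can be recognised by its `providers` or `defaultProvider` keys.

```typescript
// Hypothetical shape of ~/.pi/agent/web-search.json after this change.
interface WebSearchFileConfig {
  apiKey?: string;
  baseUrl?: string;
  // Keys from the removed multi-provider format; used only to detect old files.
  providers?: unknown;
  defaultProvider?: unknown;
}

interface ResolvedExaConfig {
  apiKey: string;
  baseUrl?: string;
}

// Assumption: the environment variable takes precedence over the file.
function resolveExaConfig(
  env: Record<string, string | undefined>,
  file: WebSearchFileConfig | undefined,
): ResolvedExaConfig {
  // Assumption: an old multi-provider file is rejected even if a key is available.
  if (file && (file.providers !== undefined || file.defaultProvider !== undefined)) {
    throw new Error("Multi-provider web-search.json is no longer supported");
  }
  const apiKey = env["EXA_API_KEY"] || file?.apiKey;
  if (!apiKey) {
    throw new Error("Set EXA_API_KEY or add apiKey to web-search.json");
  }
  // baseUrl is optional, so it is only included when the file provides one.
  return file?.baseUrl ? { apiKey, baseUrl: file.baseUrl } : { apiKey };
}
```

For instance, `resolveExaConfig({ EXA_API_KEY: "exa_..." }, undefined)` would yield a config with only `apiKey` set, leaving the base URL to whatever default the Exa client uses.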
Run `web-search-config` inside Pi to add or edit Tavily, Exa, and Firecrawl providers interactively.
## Tool behavior
### `web_search`
Maps directly to Exa `search(query, options)`.
Notes:
- Exa search returns text contents by default.
- Pass `contents: false` for metadata-only search results.
- `additionalQueries` is only valid for deep search types: `deep-lite`, `deep`, `deep-reasoning`.
- `includeText` and `excludeText` currently support at most one phrase of up to 5 words.
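The two constraints in these notes could be checked before a request is sent. The sketch below is illustrative only: `checkSearchInput` and `SearchInput` are hypothetical names, `includeText` and `excludeText` are assumed to be string arrays, and words are counted by splitting on whitespace, which may differ from how Exa counts them.

```typescript
// Deep search types named in the notes above.
const DEEP_TYPES = ["deep-lite", "deep", "deep-reasoning"];

// Hypothetical subset of the web_search input; the real tool accepts more fields.
interface SearchInput {
  query: string;
  type?: string;
  additionalQueries?: string[];
  includeText?: string[];
  excludeText?: string[];
}

// Appends a problem for each violated phrase rule (one phrase, up to 5 words).
function checkPhrases(name: string, phrases: string[] | undefined, problems: string[]): void {
  if (phrases === undefined) return;
  if (phrases.length > 1) problems.push(`${name} supports at most one phrase`);
  for (const phrase of phrases) {
    // Assumption: a "word" is any whitespace-separated token.
    const words = phrase.trim().split(/\s+/).filter((w) => w.length > 0);
    if (words.length > 5) problems.push(`${name} phrase exceeds 5 words`);
  }
}

// Returns a list of problems; an empty list means the input passes these checks.
function checkSearchInput(input: SearchInput): string[] {
  const problems: string[] = [];
  const isDeep = input.type !== undefined && DEEP_TYPES.indexOf(input.type) >= 0;
  if (input.additionalQueries !== undefined && !isDeep) {
    problems.push("additionalQueries requires a deep search type");
  }
  checkPhrases("includeText", input.includeText, problems);
  checkPhrases("excludeText", input.excludeText, problems);
  return problems;
}
```

Returning a list rather than throwing on the first problem lets a caller report every issue in a single tool error.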
Example:
```json
{
"query": "Who leads OpenAI's safety team?",
"type": "deep",
"numResults": 5,
"systemPrompt": "Prefer official docs",
"outputSchema": {
"type": "text",
"description": "Answer in short bullets"
},
"contents": {
"highlights": {
"query": "OpenAI safety lead",
"maxCharacters": 300
},
"summary": true
}
}
```
Metadata-only search:
```json
{
"query": "pi docs",
"contents": false,
"includeDomains": ["pi.dev"]
}
```
### `web_fetch`
Maps directly to Exa `getContents(urls, options)`.
Example:
```json
{
"urls": ["https://pi.dev"],
"text": {
"maxCharacters": 4000,
"verbosity": "standard"
},
"highlights": {
"query": "tooling",
"maxCharacters": 300
},
"summary": true,
"livecrawl": "preferred",
"extras": {
"links": 20,
"imageLinks": 10
}
}
```
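Because the tool maps onto `getContents(urls, options)`, an input like the one above has to be split into the positional `urls` argument and an options object. A minimal sketch of that split follows, assuming every field other than `urls` is passed through unchanged; `toGetContentsArgs` and `FetchInput` are hypothetical names, not part of the extension or of the Exa SDK.

```typescript
// Hypothetical web_fetch input: `urls` plus any Exa content options.
interface FetchInput {
  urls: string[];
  [option: string]: unknown;
}

// Separates the positional `urls` argument from the remaining options object.
function toGetContentsArgs(input: FetchInput): [string[], Record<string, unknown>] {
  const { urls, ...options } = input;
  return [urls, options];
}

// Usage: the options object carries everything except `urls`.
const [fetchUrls, fetchOptions] = toGetContentsArgs({
  urls: ["https://pi.dev"],
  summary: true,
  livecrawl: "preferred",
});
```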
## Development