# Chat & LLM Patterns

## Default AI Settings & Model Override

### DB-backed defaults (authoritative)

- **Model**: `UserAISettings` (OneToOneField, `integrations/models.py`) stores `preferred_provider` and `preferred_model` per user.
- **Endpoint**: `GET/POST /api/integrations/ai-settings/` uses an upsert pattern (OneToOneField + `perform_create` update-or-create).
- **Settings UI**: `settings/+page.svelte` loads and saves the default provider and model. The provider dropdown is filtered to configured providers; the model dropdown is populated from `GET /api/chat/providers/{provider}/models/`.
- **Chat initialization**: `AITravelChat.svelte` `loadUserAISettings()` fetches saved defaults on mount and applies them as the authoritative initial provider/model. The direction is DB → localStorage, never the reverse.
- **Backend fallback precedence** in `send_message()`:
    1. Explicit request payload (`provider`, `model`)
    2. `UserAISettings.preferred_provider` / `preferred_model` (`preferred_model` only when the provider matches)
    3. Instance defaults (`VOYAGE_AI_PROVIDER`, `VOYAGE_AI_MODEL`)
    4. `"openai"` hardcoded fallback
- **Cross-provider guard**: `preferred_model` is applied only when the resolved provider equals `preferred_provider` (prevents, e.g., `gpt-5-nano` leaking to Anthropic).
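
The precedence chain and cross-provider guard can be sketched roughly as follows. This is a minimal illustration, not the actual `send_message()` code: `SavedDefaults` and `resolve_provider_and_model` are hypothetical names, and applying the instance model only when the provider matches the instance provider is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SavedDefaults:
    """Hypothetical stand-in for a user's UserAISettings row."""
    preferred_provider: Optional[str] = None
    preferred_model: Optional[str] = None


def resolve_provider_and_model(
    payload: dict,
    saved: Optional[SavedDefaults],
    instance_provider: Optional[str] = None,  # e.g. the VOYAGE_AI_PROVIDER value
    instance_model: Optional[str] = None,     # e.g. the VOYAGE_AI_MODEL value
) -> Tuple[str, Optional[str]]:
    # Provider: request payload -> saved default -> instance default -> "openai".
    provider = (
        payload.get("provider")
        or (saved.preferred_provider if saved else None)
        or instance_provider
        or "openai"
    )

    # Model: an explicit request value always wins.
    model = payload.get("model")

    # Cross-provider guard: the saved model is reused only for the saved provider.
    if not model and saved and saved.preferred_model:
        if provider == saved.preferred_provider:
            model = saved.preferred_model

    # Assumption: the same guard is applied to the instance-level model.
    if not model and instance_model and provider == instance_provider:
        model = instance_model

    # None means "let the provider configuration pick its own default".
    return provider, model
```

With saved defaults of `openai` / `gpt-5-nano`, a request that names only `provider: "anthropic"` resolves to `("anthropic", None)`, so the OpenAI model name is never sent to Anthropic.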

### Per-session model override (browser-only)

- **Frontend**: a model dropdown next to the provider selector, populated by `GET /api/chat/providers/{provider}/models/`.
- **Persistence**: `localStorage` key `voyage_chat_model_prefs` is written on selection but never overrides DB defaults on initialization (DB wins).
- **Compatibility guard**: `_is_model_override_compatible()` validates the model prefix for standard providers and skips the check for `api_base` gateways (e.g. `opencode_zen`).
- **i18n keys**: `chat.model_label`, `chat.model_placeholder`, `default_ai_settings_title`, `default_ai_settings_desc`, `default_ai_save`, `default_ai_settings_saved`, `default_ai_settings_error`, `default_ai_provider_required`, `default_ai_no_providers`.

## Sanitized LLM Error Mapping

- `_safe_error_payload()` in `backend/server/chat/llm_client.py` maps LiteLLM exception classes to hardcoded user-safe strings with an `error_category` field.
- Exception classes mapped: `NotFoundError` -> "model not found", `AuthenticationError` -> "authentication", `RateLimitError` -> "rate limit", `BadRequestError` -> "bad request", `Timeout` -> "timeout", `APIConnectionError` -> "connection".
- Raw `exc.message`, `str(exc)`, and `exc.args` are **never** forwarded to the client. Full details are logged server-side via `logger.exception()`.
- Uses `getattr(litellm.exceptions, "ClassName", tuple())` for resilient class lookup: a class missing from the installed LiteLLM version resolves to an empty tuple, which an `isinstance` check never matches.
- Security guardrail from critic gate: [decisions.md](../../decisions.md#critic-gate-opencode-zen-connection-error-fix).
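
A rough sketch of this mapping is shown below. It is an illustration under assumptions rather than the real `_safe_error_payload()`: the category identifiers and message wording are invented, and the exceptions namespace is passed in as a parameter (in real code it would be `litellm.exceptions`) so the example runs without LiteLLM installed.

```python
import logging

logger = logging.getLogger(__name__)

# (class name, error_category, user-safe message).
# Category identifiers and message wording are assumptions for illustration.
_ERROR_TABLE = [
    ("NotFoundError", "model_not_found", "The selected model was not found."),
    ("AuthenticationError", "authentication", "Authentication with the provider failed."),
    ("RateLimitError", "rate_limit", "The provider rate limit was reached."),
    ("BadRequestError", "bad_request", "The provider rejected the request."),
    ("Timeout", "timeout", "The provider took too long to respond."),
    ("APIConnectionError", "connection", "Could not connect to the provider."),
]


def safe_error_payload(exc: BaseException, exceptions_ns) -> dict:
    """Map an exception to a fixed, user-safe payload; never echo exc text."""
    for class_name, category, message in _ERROR_TABLE:
        # A missing class resolves to an empty tuple, which isinstance never matches.
        exc_class = getattr(exceptions_ns, class_name, tuple())
        if isinstance(exc, exc_class):
            return {"error": message, "error_category": category}
    return {"error": "An unexpected error occurred.", "error_category": "unknown"}


def call_with_sanitized_errors(func, exceptions_ns) -> dict:
    """Run func(); on failure log full details server-side, return the safe payload."""
    try:
        return {"result": func()}
    except Exception as exc:
        logger.exception("LLM call failed")  # full traceback stays in server logs
        return safe_error_payload(exc, exceptions_ns)
```

Because every returned string is a constant, provider messages that might contain keys, URLs, or request bodies cannot reach the client through this path.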

## Tool Call Error Handling (Chat Loop Hardening)

- **Required-arg detection**: `_is_required_param_tool_error()` uses a regex to match tool results of the form `"... is required"` / `"... are required"`. It detects errors like `"location is required"`, `"query is required"`, and `"collection_id, name, latitude, and longitude are required"`.
- **Short-circuit on invalid tool calls**: When a tool call returns a required-param error, `send_message()` yields an SSE error event with `error_category: "tool_validation_error"` and immediately terminates the stream with `[DONE]`. No further LLM turns are attempted.
- **Persistence skip**: Invalid tool call results (and the `tool_call` entry itself) are NOT persisted to the database, which prevents replay into future conversation turns.
- **Historical cleanup**: `_build_llm_messages()` filters persisted tool-role messages containing required-param errors AND trims the corresponding assistant `tool_calls` array to only the IDs that have non-filtered tool messages. Empty `tool_calls` arrays are omitted entirely.
- **Multi-tool partial success**: When the model returns N tool calls and call K fails, calls 1..K-1 (the successful prefix) are persisted normally. Only the failed call and subsequent calls are dropped.
- **Tool iteration guard**: `MAX_TOOL_ITERATIONS = 10`, with a correctly incremented counter, prevents unbounded loops from error classes that are not required-arg errors and so don't match the regex.
- **Resolved gap**: The `get_weather` error was changed from `"dates must be a non-empty list"` to `"dates is required"`, so it now matches the regex and triggers the short-circuit. Resolved 2026-03-09.
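
The detection and historical-cleanup steps above might look roughly like this. The regex, the function names, and the assumption that persisted tool content is a JSON string are all illustrative; the real helpers may differ.

```python
import json
import re

# Assumed pattern: the whole error string must read "<params> is/are required".
_REQUIRED_RE = re.compile(r"[\w\s,]+ (?:is|are) required", re.IGNORECASE)


def is_required_param_error(result) -> bool:
    """True when a tool result is a pure missing-parameter error."""
    if not isinstance(result, dict):
        return False
    error = result.get("error")
    return isinstance(error, str) and _REQUIRED_RE.fullmatch(error.strip()) is not None


def _parse(content):
    """Decode stored tool content (assumed to be a JSON string); None if invalid."""
    if isinstance(content, dict):
        return content
    try:
        return json.loads(content)
    except (TypeError, ValueError):
        return None


def build_llm_messages(stored: list[dict]) -> list[dict]:
    """Drop required-param tool errors and trim the matching tool_calls entries."""
    # IDs of tool messages that survive the filter.
    kept_ids = {
        m.get("tool_call_id")
        for m in stored
        if m.get("role") == "tool" and not is_required_param_error(_parse(m.get("content")))
    }
    cleaned = []
    for m in stored:
        if m.get("role") == "tool" and m.get("tool_call_id") not in kept_ids:
            continue
        if m.get("role") == "assistant" and m.get("tool_calls"):
            kept_calls = [c for c in m["tool_calls"] if c.get("id") in kept_ids]
            m = {k: v for k, v in m.items() if k != "tool_calls"}
            if kept_calls:  # an empty tool_calls array is omitted entirely
                m["tool_calls"] = kept_calls
        cleaned.append(m)
    return cleaned
```

Using `fullmatch` rather than `search` keeps invalid-value errors that merely contain the words "is required" out of the short-circuit path.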

## Trip Context UUID Grounding

- `send_message()` injects the active collection UUID into the system prompt `## Trip Context` section with an explicit instruction: `"use this exact collection_id for get_trip_details and add_to_itinerary"`.
- UUID injection only occurs when the collection lookup succeeds AND the user is the owner or a `shared_with` member (authorization gate).
- The system prompt includes two-phase confirmation guidance: confirm only before the **first** `add_to_itinerary` action; after explicit user approval phrases ("yes", "go ahead", "add them"), proceed directly without re-confirming.
- On `DoesNotExist`, `get_trip_details` returns `"collection_id is required and must reference a trip you can access"`. This does NOT match the short-circuit regex because of `fullmatch`, which is correct: it is an invalid-value error, not a missing-param error.
- Known pre-existing issue: `get_trip_details` filters on `user=user` only, so shared-collection members get UUID context but the tool returns `DoesNotExist`. Low severity.

## Tool Output Rendering

- Frontend `AITravelChat.svelte` hides raw `role=tool` messages via the `visibleMessages` filter (`messages.filter(msg => msg.role !== 'tool')`).
- Tool results render as concise user-facing summaries via `getToolSummary()`:
    - `get_trip_details` → "Loaded details for {name} ({N} itinerary items)."
    - `list_trips` → "Found {N} trip(s)."
    - `add_to_itinerary` → "Added {name} to itinerary."
    - `get_weather` → "Retrieved weather data."
    - `search_places` / `web_search` → existing rich cards (place cards, linked cards).
    - Error payloads → "{name} could not be completed." (no raw JSON).
    - Unknown tools → generic fallback.
- **Reload reconstruction**: `rebuildConversationMessages()` scans persisted messages after conversation load, parses `role=tool` rows via `parseStoredToolResult()`, and attaches them as `tool_results` on the preceding assistant message (matched by `tool_call_id`). Both streaming and reload paths produce identical `tool_results` data.
- Text is rendered via Svelte text interpolation (not `{@html}`), so LLM-sourced names are auto-escaped (no XSS vector).

## Embedded Chat UX

- Provider/model selectors moved into a compact `<details>` gear-icon dropdown in the header, so the header contains only the hamburger toggle, title, and settings gear.
- Embedded mode uses a bounded height: `h-[65vh]` with `min-h-[30rem]` / `max-h-[46rem]`, plus a softened card treatment (`bg-base-100` + border).
- The sidebar defaults to closed in embedded mode (`let sidebarOpen = !embedded;`); `lg:flex` keeps it always visible on desktop.
- Quick-action chips use `btn-xs` + `overflow-x-auto` for a compact embedded fit.
- The streaming indicator stays visible inside the last assistant bubble throughout the entire generation (conditioned on `isStreaming && msg.id === lastVisibleMessageId`).
- Known low-priority issues: `aria-label` values on the sidebar toggle and settings button are hardcoded English (should use `$t()`), and the `<details>` dropdown does not auto-close on outside click.

## OpenCode Zen Provider

- Provider ID: `opencode_zen`
- `api_base`: `https://opencode.ai/zen/v1`
- Default model: `openai/gpt-5-nano` (changed from `openai/gpt-4o-mini`, which was invalid on Zen)
- GPT models on Zen use the `/chat/completions` endpoint (OpenAI-compatible)
- LiteLLM `openai/` prefix routes through the OpenAI client to the custom `api_base`
- Model dropdown exposes 5 curated options (reasoning models excluded). See [decisions.md](../../decisions.md#critic-gate-travel-agent-context--models-follow-up).

## Multi-Stop Context Derivation

Chat context derives from the **full collection itinerary**, not just the first location.

### Frontend - `deriveCollectionDestination()`

- Located in `frontend/src/routes/collections/[id]/+page.svelte`.
- Extracts unique city/country pairs from `collection.locations`.
- Capped at 4 stops, semicolon-joined, with `+N more` overflow suffix.
- Passed to `AITravelChat` as `destination` prop.

### Backend - `send_message()` itinerary enrichment

- `backend/server/chat/views/__init__.py` `send_message()` reads `collection.locations` and injects `Itinerary stops:` into the system prompt `## Trip Context` section.
- Up to 8 unique stops; deduplication and blank-entry filtering applied.
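
As a minimal sketch of the backend enrichment, the stop list could be derived like this. The `city` / `country` field names, the case-insensitive deduplication, and the semicolon separator in the prompt line are assumptions; only the cap of 8, the deduplication, and the blank filtering come from the notes above.

```python
def itinerary_stops(locations: list[dict], limit: int = 8) -> list[str]:
    """Return up to `limit` unique, non-blank stop labels in itinerary order."""
    seen = set()
    stops = []
    for loc in locations:
        # Hypothetical field names; blank or missing parts are skipped.
        parts = [str(loc.get(key) or "").strip() for key in ("city", "country")]
        label = ", ".join(part for part in parts if part)
        key = label.casefold()  # assumption: duplicates are compared case-insensitively
        if not label or key in seen:
            continue
        seen.add(key)
        stops.append(label)
        if len(stops) == limit:
            break
    return stops


def trip_context_line(locations: list[dict]) -> str:
    """Format the line injected under the '## Trip Context' prompt section."""
    stops = itinerary_stops(locations)
    return "Itinerary stops: " + "; ".join(stops) if stops else ""
```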

### System prompt - trip-level reasoning

- `get_system_prompt()` includes guidance to treat collection chats as itinerary-wide and to call `get_trip_details` before `search_places`.

## Itinerary-Centric Quick Prompts

- Quick-action buttons use `promptTripContext` (reactive: `collectionName || destination || ''`) instead of raw `destination`.
- Guard changed from `{#if destination}` to `{#if promptTripContext}`.
- Prompt wording uses `across my ${promptTripContext} itinerary?`.

## search_places Tool Output Key Convention

- Backend `agent_tools.py` `search_places()` returns `{"location": ..., "category": ..., "results": [...]}`.
- Frontend must use the `.results` key (not `.places`).
- **Historical bug**: Prior code used `.places`, causing place cards to never render. Fixed 2026-03-09.

## Agent Tools Architecture

### Registered Tools

| Tool name | Purpose | Required params |
|---|---|---|
| `search_places` | Nominatim geocode -> Overpass PoI search | `location` |
| `web_search` | DuckDuckGo web search for current travel info | `query` |
| `list_trips` | List user's collections | (none) |
| `get_trip_details` | Full collection detail with itinerary | `collection_id` |
| `add_to_itinerary` | Create Location + CollectionItineraryItem | `collection_id`, `name`, `latitude`, `longitude` |
| `get_weather` | Open-Meteo archive + forecast | `latitude`, `longitude`, `dates` |

### Registry pattern

- `@agent_tool(name, description, parameters)` decorator registers function references and generates OpenAI/LiteLLM-compatible tool schemas.
- `execute_tool(tool_name, user, **kwargs)` resolves from registry and filters kwargs via `inspect.signature(...)`.
- Extensibility: adding a new tool only requires defining a decorated function.

### Function signature convention

All tool functions: `def tool_name(user, **kwargs) -> dict`. Return `{"error": "..."}` on failure; never raise.

### Web Search Tool

- Uses `duckduckgo_search.DDGS().text(..., max_results=5)`.
- Error handling includes import fallback, rate-limit guard, and generic failure logging.
- Dependency: `duckduckgo-search>=4.0.0` in `backend/server/requirements.txt`.

## Backend Chat Endpoint Architecture

### URL Routing

- `backend/server/main/urls.py`: `path("api/chat/", include("chat.urls"))`
- `backend/server/chat/urls.py`: DRF `DefaultRouter` registers `conversations/` -> `ChatViewSet`, `providers/` -> `ChatProviderCatalogViewSet`
- Manual paths: `POST /api/chat/suggestions/day/` -> `DaySuggestionsView`, `GET /api/chat/capabilities/` -> `CapabilitiesView`

### ChatViewSet Pattern

- All actions: `permission_classes = [IsAuthenticated]`
- Streaming response uses `StreamingHttpResponse(content_type="text/event-stream")`
- SSE chunk format: `data: {json}\n\n`; terminal `data: [DONE]\n\n`
- Tool loop: up to `MAX_TOOL_ITERATIONS = 10` rounds
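
The SSE framing can be sketched as a plain generator. The event field names (`content`, `error`) are assumptions; the `data: {json}\n\n` chunk shape and the `[DONE]` terminator are taken from the list above. In the Django view, a generator like this would be passed to `StreamingHttpResponse(..., content_type="text/event-stream")`.

```python
import json
from typing import Iterable, Iterator


def sse_stream(events: Iterable[dict]) -> Iterator[str]:
    """Frame event dicts as SSE chunks and always finish with [DONE]."""
    for event in events:
        yield f"data: {json.dumps(event)}\n\n"
        if "error" in event:
            # Mirror the short-circuit: after an error event, stop the stream.
            break
    yield "data: [DONE]\n\n"
```

A client can therefore treat `[DONE]` as the single end-of-stream signal, whether the turn finished normally or was cut short by an error event.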

### Day Suggestions Endpoint

- `POST /api/chat/suggestions/day/` via `chat/views/day_suggestions.py`
- Non-streaming JSON response
- Inputs: `collection_id`, `date`, `category`, `filters`, `location_context`
- Provider/model resolution via `_resolve_provider_and_model()`: request payload → `UserAISettings` defaults → instance defaults (`VOYAGE_AI_PROVIDER`/`VOYAGE_AI_MODEL`) → provider config default. Unlike `send_message()`, there is no hardcoded OpenAI fallback.
- Cross-provider model guard: `preferred_model` is applied only when the provider matches `preferred_provider`.
- LLM call via `litellm.completion` with a regex JSON extraction fallback
- Suggestion normalization: frontend `normalizeSuggestionItem()` handles LLM response variants (title/place_name/venue, summary/details, address/neighborhood). Items without a resolvable name are dropped.
- Add-to-itinerary: `buildLocationPayload()` constructs a `LocationSerializer`-compatible payload (name/location/description/rating/collections/is_public) from the normalized suggestion.
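
One plausible shape for the regex JSON extraction fallback is sketched below. The function name and the exact pattern are assumptions; the idea is simply to try a strict parse first and then fall back to the outermost JSON-looking span when the model wraps its answer in extra prose.

```python
import json
import re

# Greedy match from the first opening brace/bracket to the last closing one.
_JSON_SPAN_RE = re.compile(r"\{.*\}|\[.*\]", re.DOTALL)


def extract_json(text: str):
    """Parse an LLM reply as JSON, tolerating surrounding prose; None on failure."""
    try:
        return json.loads(text)
    except ValueError:
        pass  # not pure JSON, fall through to the regex fallback
    match = _JSON_SPAN_RE.search(text)
    if match is None:
        return None
    try:
        return json.loads(match.group(0))
    except ValueError:
        return None
```

Returning `None` instead of raising lets the view respond with a normal sanitized error when the reply contains no usable JSON.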

### Capabilities Endpoint

- `GET /api/chat/capabilities/` returns `{ "tools": [{ "name", "description" }, ...] }` from the registry

## WS4-F4 Chat UI Rendering

- Travel-themed header (icon: airplane, title: `Travel Assistant` with optional collection name suffix)
- `ChatMessage` type supports `tool_results?: Array<{ name, result }>` for inline tool output
- SSE handling appends to the current assistant message's `tool_results` array
- Renderer: `search_places` -> place cards, `web_search` -> linked cards, fallback -> JSON `<pre>`

## WS4-F3 Add-to-itinerary from Chat

- `search_places` card results can be added directly to the itinerary when collection context exists
- Flow: date selector modal -> `POST /api/locations/` -> `POST /api/itineraries/` -> `itemAdded` event
- Coordinate guard (`hasPlaceCoordinates`) required