fix(chat): add saved AI defaults and harden suggestions

2026-03-09 20:32:13 +00:00
parent 21954df3ee
commit bb54503235
38 changed files with 3949 additions and 105 deletions

.memory/plans/.gitkeep Normal file
@@ -0,0 +1,108 @@
# Plan: AI travel agent in Collections Recommendations
## Clarified requirements
- Move AI travel agent UX from standalone `/chat` tab/page into Collections → Recommendations.
- Remove the existing `/chat` route (not keep/redirect).
- Provider list should be dynamic and display all providers LiteLLM supports.
- Ensure OpenCode Zen is supported as a provider.
## Execution prerequisites
- In each worktree, run `cd frontend && npm install` before implementation to ensure node modules (including `@mdi/js`) are present and baseline build can run.
## Decomposition (approved by user)
### Workstream 1 — Collections recommendations chat integration (Frontend + route cleanup)
- **Worktree**: `.worktrees/collections-ai-agent`
- **Branch**: `feat/collections-ai-agent`
- **Risk**: Medium
- **Quality tier**: Tier 2
- **Task WS1-F1**: Embed AI chat experience inside Collections Recommendations UI.
- **Acceptance criteria**:
- Chat UI is available from Collections Recommendations section.
- Existing recommendations functionality remains usable.
- Chat interactions continue to work with existing backend chat APIs.
- **Task WS1-F2**: Remove standalone `/chat` route/page.
- **Acceptance criteria**:
- `/chat` page is removed from app routes/navigation.
- No broken imports/navigation links remain.
### Workstream 2 — Provider catalog + Zen provider support (Backend + frontend settings/chat)
- **Worktree**: `.worktrees/litellm-provider-catalog`
- **Branch**: `feat/litellm-provider-catalog`
- **Risk**: Medium
- **Quality tier**: Tier 2 (promote to Tier 1 if auth/secret handling changes)
- **Task WS2-F1**: Implement dynamic provider listing based on LiteLLM-supported providers.
- **Acceptance criteria**:
- Backend exposes `GET /api/chat/providers/` using LiteLLM runtime provider list as source data.
- Frontend provider selectors consume backend provider catalog rather than hardcoded arrays.
- UI displays all LiteLLM provider IDs and metadata; non-chat-compatible providers are labeled unavailable.
- Existing saved provider/API-key flows still function.
- **Task WS2-F2**: Add/confirm OpenCode Zen provider support end-to-end.
- **Acceptance criteria**:
- OpenCode Zen appears as provider id `opencode_zen`.
- Backend model resolution and API-key lookup work for `opencode_zen`.
- Zen calls use LiteLLM OpenAI-compatible routing with `api_base=https://opencode.ai/zen/v1`.
- Chat requests using Zen provider are accepted without fallback/validation failures.
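The OpenAI-compatible routing in the criteria above can be sketched as a small kwargs builder. This is an illustrative shape, not the shipped `llm_client.py` code; the function name and the `openai/` model prefix convention mirror the plan's description of LiteLLM's OpenAI-compatible routing:

```python
# Hedged sketch: routing an OpenAI-compatible alias such as opencode_zen
# through LiteLLM. The helper name and exact kwargs are assumptions.
ZEN_API_BASE = "https://opencode.ai/zen/v1"

def build_completion_kwargs(provider, model, api_key):
    """Build the kwargs a litellm.acompletion() call would receive."""
    if provider == "opencode_zen":
        # Bare model ids get LiteLLM's openai/ prefix so the request uses
        # the OpenAI-compatible client against the Zen gateway.
        if not model.startswith("openai/"):
            model = f"openai/{model}"
        return {"model": model, "api_key": api_key, "api_base": ZEN_API_BASE}
    return {"model": model, "api_key": api_key}
```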
## Provider architecture decision
- Backend provider catalog endpoint `GET /api/chat/providers/` is the single source of truth for UI provider options.
- Endpoint response fields: `id`, `label`, `available_for_chat`, `needs_api_key`, `default_model`, `api_base`.
- All LiteLLM runtime providers are returned; entries without model mapping are `available_for_chat=false`.
- Chat send path only accepts providers where `available_for_chat=true`.
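A minimal sketch of how catalog entries could be derived from these rules, assuming a runtime provider list (e.g. `litellm.provider_list`) and a hand-maintained config map; the stand-in `CHAT_PROVIDER_CONFIG` below is illustrative, not the real mapping in `llm_client.py`:

```python
# Stand-in for the real CHAT_PROVIDER_CONFIG; only mapped providers are
# chat-available, everything else is returned but flagged unavailable.
CHAT_PROVIDER_CONFIG = {
    "openai": {"label": "OpenAI", "needs_api_key": True,
               "default_model": "gpt-4o-mini", "api_base": None},
}

def build_catalog(provider_ids):
    """One catalog entry per runtime provider, with the endpoint's fields."""
    catalog = []
    for pid in provider_ids:
        cfg = CHAT_PROVIDER_CONFIG.get(pid)
        catalog.append({
            "id": pid,
            "label": cfg["label"] if cfg else pid,
            "available_for_chat": cfg is not None,
            "needs_api_key": cfg["needs_api_key"] if cfg else True,
            "default_model": cfg["default_model"] if cfg else None,
            "api_base": cfg["api_base"] if cfg else None,
        })
    return catalog
```

The chat send path would then reject any provider whose entry has `available_for_chat=False`.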
## Research findings (2026-03-08)
- LiteLLM provider enumeration is available at runtime (`litellm.provider_list`), currently 128 providers in this environment.
- OpenCode Zen is not a native LiteLLM provider alias; support should be implemented via OpenAI-compatible provider config and explicit `api_base`.
- Existing hardcoded provider duplication (backend + chat page + settings page) will be replaced by backend catalog consumption.
- Reference: [LiteLLM + Zen provider research](../research/litellm-zen-provider-catalog.md)
## Dependencies
- WS1 depends on existing chat API endpoint behavior and event streaming contract.
- WS2 depends on LiteLLM provider metadata/query capabilities and provider-catalog endpoint design.
- WS1-F1 depends on WS2 completion for dynamic provider selector integration.
- WS1-F2 depends on WS1-F1 completion.
## Human checkpoints
- No checkpoint required: Zen support path uses existing LiteLLM dependency via OpenAI-compatible API (no new SDK/service).
## Findings tracker
- WS1-F1 implemented in worktree `.worktrees/collections-ai-agent`:
- Extracted chat route UI into reusable component `frontend/src/lib/components/AITravelChat.svelte`, preserving conversation list, message stream rendering, provider selector, conversation CRUD, and SSE send-message flow via `/api/chat/conversations/*`.
- Updated `frontend/src/routes/chat/+page.svelte` to render the reusable component so existing `/chat` behavior remains intact for WS1-F1 scope (WS1-F2 route removal deferred).
- Embedded `AITravelChat` into Collections Recommendations view in `frontend/src/routes/collections/[id]/+page.svelte` above `CollectionRecommendationView`, keeping existing recommendation search/map/create flows unchanged.
- Reviewer warning resolved: removed redundant outer card wrapper around `AITravelChat` in Collections Recommendations embedding, eliminating nested card-in-card styling while preserving spacing and recommendations placement.
- WS1-F2 implemented in worktree `.worktrees/collections-ai-agent`:
- Removed standalone chat route page by deleting `frontend/src/routes/chat/+page.svelte`.
- Removed `/chat` navigation item from `frontend/src/lib/components/Navbar.svelte`, including the now-unused `mdiRobotOutline` icon import.
- Verified embedded chat remains in Collections Recommendations via `AITravelChat` usage in `frontend/src/routes/collections/[id]/+page.svelte`; no remaining `/chat` route links/imports in `frontend/src`.
- WS2-F1 implemented in worktree `.worktrees/litellm-provider-catalog`:
- Added backend provider catalog endpoint `GET /api/chat/providers/` from `litellm.provider_list` with response fields `id`, `label`, `available_for_chat`, `needs_api_key`, `default_model`, `api_base`.
- Refactored chat provider model map into `CHAT_PROVIDER_CONFIG` in `backend/server/chat/llm_client.py` and reused it for both send-message routing and provider catalog metadata.
- Updated chat/settings frontend provider consumers to fetch provider catalog dynamically and removed hardcoded provider arrays.
- Chat UI now restricts provider selection/sending to `available_for_chat=true`; settings API key UI now lists full provider catalog (including unavailable-for-chat entries).
- WS2-F1 reviewer carry-forward fixes applied:
- Fixed chat provider selection fallback timing in `frontend/src/routes/chat/+page.svelte` by computing `availableProviders` from local `catalog` response data instead of relying on reactive `chatProviders` immediately after assignment.
- Applied low-risk settings improvement in `frontend/src/routes/settings/+page.svelte` by changing `await loadProviderCatalog()` to `void loadProviderCatalog()` in the second `onMount`, preventing provider fetch from delaying success toast logic.
- WS2-F2 implemented in worktree `.worktrees/litellm-provider-catalog`:
- Added `opencode_zen` to `CHAT_PROVIDER_CONFIG` in `backend/server/chat/llm_client.py` with label `OpenCode Zen`, `needs_api_key=true`, `default_model=openai/gpt-4o-mini`, and `api_base=https://opencode.ai/zen/v1`.
- Updated `get_provider_catalog()` to append configured chat providers not present in `litellm.provider_list`, ensuring OpenCode Zen appears in `GET /api/chat/providers/` even though it is an OpenAI-compatible alias rather than a native LiteLLM provider id.
- Normalized provider IDs in `get_llm_api_key()` and `stream_chat_completion()` via `_normalize_provider_id()` to keep API-key lookup and LLM request routing consistent for `opencode_zen`.
- Consolidation completed in worktree `.worktrees/collections-ai-agent`:
- Ported WS2 provider-catalog backend to `backend/server/chat` in the collections branch, including `GET /api/chat/providers/`, `CHAT_PROVIDER_CONFIG` metadata fields (`label`, `needs_api_key`, `default_model`, `api_base`), and chat-send validation to allow only `available_for_chat` providers.
- Confirmed `opencode_zen` support in consolidated branch with `label=OpenCode Zen`, `default_model=openai/gpt-4o-mini`, `api_base=https://opencode.ai/zen/v1`, and API-key-required behavior.
- Replaced hardcoded providers in `frontend/src/lib/components/AITravelChat.svelte` with dynamic `/api/chat/providers/` loading, preserving send guard to chat-available providers only.
- Updated settings API-key provider dropdown in `frontend/src/routes/settings/+page.svelte` to load full provider catalog dynamically and added `ChatProviderCatalogEntry` type in `frontend/src/lib/types.ts`.
- Preserved existing collections chat embedding and kept standalone `/chat` route removed (no route reintroduction in consolidation changes).
## Retry tracker
- WS1-F1: 0
- WS1-F2: 0
- WS2-F1: 0
- WS2-F2: 0
## Execution checklist
- [x] WS2-F1 Dynamic provider listing from LiteLLM (Tier 2)
- [x] WS2-F2 OpenCode Zen provider support (Tier 2)
- [x] WS1-F1 Embed AI chat into Collections Recommendations (Tier 2)
- [x] WS1-F2 Remove standalone `/chat` route (Tier 2)
- [x] Documentation coverage + knowledge sync (Librarian)

@@ -0,0 +1,338 @@
# AI Travel Agent Redesign Plan
## Vision Summary
Redesign the AI travel agent with two context-aware entry points, user preference learning, flexible provider configuration, extensibility for future integrations, web search capability, and multi-user collection support.
---
## Architecture Overview
```
┌─────────────────────────────────────────────────────────────────┐
│ ENTRY POINTS │
├─────────────────────────────┬───────────────────────────────────┤
│ Day-Level Suggestions │ Collection-Level Chat │
│ (new modal) │ (improved Recommendations tab) │
│ - Category filters │ - Context-aware │
│ - Sub-filters │ - Add to itinerary actions │
│ - Add to day action │ │
└─────────────────────────────┴───────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ AGENT CORE │
├─────────────────────────────────────────────────────────────────┤
│ - LiteLLM backend (streaming SSE) │
│ - Tool calling (place search, web search, itinerary actions) │
│ - Multi-user preference aggregation │
│ - Context injection (collection, dates, location) │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ CONFIGURATION LAYERS │
├─────────────────────────────────────────────────────────────────┤
│ Instance (.env) → VOYAGE_AI_PROVIDER │
│ → VOYAGE_AI_MODEL │
│ → VOYAGE_AI_API_KEY │
│ User (DB) → UserAPIKey.per-provider keys │
│ → UserAISettings.model preference │
│ Fallback: User key → Instance key → Error │
└─────────────────────────────────────────────────────────────────┘
```
---
## Workstreams
### WS1: Configuration Infrastructure
**Goal**: Support both instance-level and user-level provider/model configuration with proper fallback.
#### WS1-F1: Instance-level configuration
- Add env vars to `settings.py`:
- `VOYAGE_AI_PROVIDER` (default: `openai`)
- `VOYAGE_AI_MODEL` (default: `gpt-4o-mini`)
- `VOYAGE_AI_API_KEY` (optional global key)
- Update `llm_client.py` to read instance defaults
- Add fallback chain: user key → instance key → error
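The fallback chain can be sketched as a small resolver; the function and parameter names are assumptions (the real lookup reads `UserAPIKey` and `settings` inside `llm_client.py`):

```python
# Hedged sketch of the key-fallback chain: user key -> instance key -> error.
def resolve_api_key(user_key, instance_provider, instance_key, provider):
    """User key wins; the instance key applies only when the requested
    provider matches the instance default; otherwise raise."""
    if user_key:
        return user_key
    if instance_key and provider == instance_provider:
        return instance_key
    raise ValueError(f"No API key configured for provider {provider!r}")
```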
#### WS1-F2: User-level model preferences
- Add `UserAISettings` model (OneToOne → CustomUser):
- `preferred_provider` (CharField)
- `preferred_model` (CharField)
- Create API endpoint: `POST /api/ai/settings/`
- Add UI in Settings → AI section for model selection
#### WS1-F3: Provider catalog enhancement
- Extend provider catalog response to include:
- `instance_configured`: bool (has instance key)
- `user_configured`: bool (has user key)
- Update frontend to show configuration status per provider
**Files**: `settings.py`, `llm_client.py`, `integrations/models.py`, `integrations/views/`, `frontend/src/routes/settings/`
---
### WS2: User Preference Learning
**Goal**: Capture and use user preferences in AI recommendations.
#### WS2-F1: Preference UI
- Add "AI Preferences" tab to Settings page
- Form fields: cuisines, interests, trip_style, notes
- Use tag input for cuisines/interests (better UX than free text)
- Connect to existing `/api/integrations/recommendation-preferences/`
Implementation notes (2026-03-08):
- Implemented in `frontend/src/routes/settings/+page.svelte` as `travel_preferences` section in the existing settings sidebar, with `savePreferences(event)` posting to `/api/integrations/recommendation-preferences/`.
- `interests` conversion is string↔array at UI boundary: load via `(profile.interests || []).join(', ')`; save via `.split(',').map((s) => s.trim()).filter(Boolean)`.
- SSR preload added in `frontend/src/routes/settings/+page.server.ts` using parallel fetch with API keys; returns `props.recommendationProfile` as first list element or `null`.
- Frontend typing added in `frontend/src/lib/types.ts` (`UserRecommendationPreferenceProfile`) and i18n strings added under `settings` in `frontend/src/locales/en.json`.
- See backend capability reference in [Project Knowledge — User Recommendation Preference Profile](../knowledge.md#user-recommendation-preference-profile).
#### WS2-F2: Preference injection
- Enhance `get_system_prompt()` to format preferences better
- Add preference summary in system prompt (structured, not just appended)
#### WS2-F3: Multi-user aggregation
- New function: `get_aggregated_preferences(collection)`
- Returns combined preferences from all `collection.shared_with` users + owner
- Format: "Party preferences: User A likes X, User B prefers Y..."
- Inject into system prompt for shared collections
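The aggregation described above could look roughly like this; all names and data shapes are illustrative (the real function would read preference profiles off `collection.shared_with` and the owner):

```python
def get_aggregated_preferences(owner, shared_with, prefs_by_user):
    """Combine owner + shared-user preferences into the 'Party preferences'
    summary injected into the system prompt. prefs_by_user maps a username
    to a short preference string; this is a sketch, not the shipped model."""
    lines = []
    for user in [owner, *shared_with]:
        pref = prefs_by_user.get(user)
        if pref:
            lines.append(f"{user}: {pref}")
    if not lines:
        return ""
    return "Party preferences: " + "; ".join(lines)
```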
**Files**: `frontend/src/routes/settings/`, `chat/llm_client.py`, `integrations/models.py`
---
### WS3: Day-Level Suggestions Modal
**Goal**: Add "Suggest" option to itinerary day "Add" dropdown with category filters.
#### WS3-F1: Suggestion modal component
- Create `ItinerarySuggestionModal.svelte`
- Two-step flow:
1. **Category selection**: Restaurant, Activity, Event, Lodging
2. **Filter refinement**:
- Restaurant: cuisine type, price range, dietary restrictions
- Activity: type (outdoor, cultural, etc.), duration
- Event: type, date/time preference
- Lodging: type, amenities
- "Any/Surprise me" option for each filter
#### WS3-F2: Add button integration
- Add "Get AI suggestions" option to `CollectionItineraryPlanner.svelte` Add dropdown
- Opens suggestion modal with target date pre-set
- Modal receives: `collectionId`, `targetDate`, `collectionLocation` (for context)
#### WS3-F3: Suggestion results display
- Show 3-5 suggestions as cards with:
- Name, description, why it fits preferences
- "Add to this day" button
- "Add to different day" option
- On add: **direct REST API call** to `/api/itineraries/` (not agent tool)
- User must approve each item individually - no bulk/auto-add
- Close modal and refresh itinerary on success
#### WS3-F4: Backend suggestion endpoint
- New endpoint: `POST /api/ai/suggestions/day/`
- Params: `collection_id`, `date`, `category`, `filters`, `location_context`
- Returns structured suggestions (not chat, direct JSON)
- Uses agent internally but returns parsed results
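Since the endpoint returns parsed results rather than a chat stream, it needs a defensive parse step for the agent's raw output. A sketch under assumed field names (`name`, `description`, `why_it_fits` are illustrative, taken from the result-card description in WS3-F3):

```python
import json

def parse_day_suggestions(raw_llm_text):
    """Parse the agent's JSON reply into a structured suggestion list.
    Returns [] on malformed output so the caller can show an empty state
    instead of a 500. Field names are assumptions for illustration."""
    try:
        data = json.loads(raw_llm_text)
    except (TypeError, ValueError):
        return []
    if isinstance(data, dict):
        data = data.get("suggestions", [])
    if not isinstance(data, list):
        return []
    out = []
    for item in data:
        if isinstance(item, dict) and item.get("name"):
            out.append({
                "name": item["name"],
                "description": item.get("description", ""),
                "why_it_fits": item.get("why_it_fits", ""),
            })
    return out[:5]  # WS3-F3 shows 3-5 suggestion cards
```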
**Files**: `CollectionItineraryPlanner.svelte`, `ItinerarySuggestionModal.svelte` (new), `chat/views.py`, `chat/agent_tools.py`
---
### WS3.5: Insertion Flow Clarification
**Two insertion paths exist:**
| Path | Entry Point | Mechanism | Use Case |
|------|-------------|-----------|----------|
| **User-approved** | Suggestions modal | Direct REST API call to `/api/itineraries/` | Day-level suggestions, user reviews and clicks Add |
| **Agent-initiated** | Chat (Recommendations tab) | `add_to_itinerary` tool via SSE streaming | Conversational adds when user says "add that place" |
**Why two paths:**
- Modal: Faster, simpler UX - no round-trip through agent, user stays in control
- Chat: Natural conversation flow - agent can add as part of dialogue
**No changes needed to agent tools** - `add_to_itinerary` already exists in `agent_tools.py` and works for chat-initiated adds.
---
### WS4: Collection-Level Chat Improvements
**Goal**: Make Recommendations tab chat context-aware and action-capable.
#### WS4-F1: Context injection
- Pass collection context to `AITravelChat.svelte`:
- `collectionId`, `collectionName`, `startDate`, `endDate`
- `destination` (from collection locations or user input)
- Inject into system prompt: "You are helping plan a trip to X from Y to Z"
Implementation notes (2026-03-08):
- `frontend/src/lib/components/AITravelChat.svelte` now exposes optional context props (`collectionId`, `collectionName`, `startDate`, `endDate`, `destination`) and includes them in `POST /api/chat/conversations/{id}/send_message/` payload.
- `frontend/src/routes/collections/[id]/+page.svelte` now passes collection context into `AITravelChat`; destination is derived via `deriveCollectionDestination(...)` from `city/country/location/name` on the first usable location.
- `backend/server/chat/views/__init__.py::ChatViewSet.send_message()` now accepts the same optional fields, resolves `collection_id` (owner/shared access only), and appends a `## Trip Context` block to the system prompt before streaming.
- Related architecture note: [Project Knowledge — AI Chat](../knowledge.md#ai-chat-collections--recommendations).
#### WS4-F2: Quick action buttons
- Add preset prompts above chat input:
- "Suggest restaurants for this trip"
- "Find activities near [destination]"
- "What should I pack for [dates]?"
- Pre-fill input on click
#### WS4-F3: Add-to-itinerary from chat
- When agent suggests a place, show "Add to itinerary" button
- User selects date → calls `add_to_itinerary` tool
- Visual feedback on success
Implementation notes (2026-03-08):
- Implemented in `frontend/src/lib/components/AITravelChat.svelte` as an MVP direct frontend flow (no agent round-trip):
- Adds `Add to Itinerary` button to `search_places` result cards when `collectionId` exists.
- Opens a date picker modal (`showDateSelector`, `selectedPlaceToAdd`, `selectedDate`) constrained by trip date range (`min={startDate}`, `max={endDate}`).
- On confirm, creates a location via `POST /api/locations/` then creates itinerary entry via `POST /api/itineraries/`.
- Dispatches `itemAdded { locationId, date }` and shows success toast (`added_successfully`).
- Guards against missing/invalid coordinates by disabling add action unless lat/lon parse successfully.
- i18n keys added in `frontend/src/locales/en.json`: `add_to_itinerary`, `add_to_which_day`, `added_successfully`.
#### WS4-F4: Improved UI
- Remove generic "robot" branding, use travel-themed design
- Show collection name in header
- Better tool result display (cards instead of raw JSON)
Implementation notes (2026-03-08):
- `frontend/src/lib/components/AITravelChat.svelte` header now uses travel branding with `✈️` and renders `Travel Assistant · {collectionName}` when collection context is present; destination is shown as a subtitle when provided.
- Robot icon usage in chat UI was replaced with travel-themed emoji (`✈️`, `🌍`, `🗺️`) while keeping existing layout structure.
- SSE `tool_result` chunks are now attached to the in-flight assistant message via `tool_results` and rendered inline as structured cards for `search_places` and `web_search`, with JSON `<pre>` fallback for unknown tools.
- Legacy persisted `role: 'tool'` messages are still supported via JSON parsing fallback and use the same card rendering logic.
- i18n root keys added in `frontend/src/locales/en.json`: `travel_assistant`, `quick_actions`.
See [Project Knowledge — WS4-F4 Chat UI Rendering](../knowledge.md#ws4-f4-chat-ui-rendering).
**Files**: `AITravelChat.svelte`, `chat/views.py`, `chat/llm_client.py`
---
### WS5: Web Search Capability
**Goal**: Enable agent to search the web for current information.
#### WS5-F1: Web search tool
- Add `web_search` tool to `agent_tools.py`:
- Uses DuckDuckGo (free, no API key) or Brave Search API (env var)
- Returns top 5 results with titles, snippets, URLs
- Tool schema:
```python
{
"name": "web_search",
"description": "Search the web for current information about destinations, events, prices, etc.",
"parameters": {
"query": "string - search query",
"location_context": "string - optional location to bias results"
}
}
```
#### WS5-F2: Tool integration
- Register in `AGENT_TOOLS` list
- Add to `execute_tool()` dispatcher
- Handle rate limiting gracefully
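The "top 5 results" shaping can be sketched independently of the search backend. The `title`/`href`/`body` keys are an assumption about the raw hits (they match the dicts the `duckduckgo-search` library is commonly described as returning, but verify against the installed version):

```python
def format_web_results(raw_results, limit=5):
    """Shape raw search hits into the top-N payload the web_search tool
    returns. Keys on the incoming dicts are assumptions for illustration."""
    out = []
    for hit in raw_results[:limit]:
        out.append({
            "title": hit.get("title", ""),
            "url": hit.get("href", ""),
            "snippet": hit.get("body", ""),
        })
    return out
```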
**Files**: `chat/agent_tools.py`, `requirements.txt` (add `duckduckgo-search`)
---
### WS6: Extensibility Architecture
**Goal**: Design for easy addition of future integrations.
#### WS6-F1: Plugin tool registry
- Refactor `agent_tools.py` to use decorator-based registration:
```python
@agent_tool(name="web_search", description="...")
def web_search(query: str, location_context: str = None):
...
```
- Tools auto-register on import
- Easy to add new tools in separate files
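The decorator-plus-registry pattern above can be sketched end-to-end; the registry shape and `execute_tool()` signature are assumptions consistent with the dispatcher described elsewhere in this plan:

```python
# Sketch of decorator-based tool registration; shapes are illustrative.
AGENT_TOOLS = {}

def agent_tool(name, description):
    """Register the decorated function in AGENT_TOOLS at import time so
    execute_tool() can dispatch by name."""
    def wrap(fn):
        AGENT_TOOLS[name] = {"fn": fn, "description": description}
        return fn
    return wrap

@agent_tool(name="web_search", description="Search the web")
def web_search(query, location_context=None):
    return f"results for {query}"

def execute_tool(name, **kwargs):
    if name not in AGENT_TOOLS:
        raise KeyError(f"Unknown tool: {name}")
    return AGENT_TOOLS[name]["fn"](**kwargs)
```

Because registration happens at import, dropping a new module into `chat/tools/` and importing it is enough to expose its tools.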
#### WS6-F2: Integration hooks
- Create `chat/integrations/` directory for future:
- `tripadvisor.py` - TripAdvisor API integration
- `flights.py` - Flight search (Skyscanner, etc.)
- `weather.py` - Enhanced weather data
- Each integration exports tools via decorator
#### WS6-F3: Capability discovery
- Endpoint: `GET /api/ai/capabilities/`
- Returns list of available tools/integrations
- Frontend can show "Powered by X, Y, Z" dynamically
**Files**: `chat/tools/` (new directory), `chat/agent_tools.py` (refactor)
---
## File Changes Summary
### New Files
- `frontend/src/lib/components/collections/ItinerarySuggestionModal.svelte`
- `backend/server/chat/tools/__init__.py`
- `backend/server/chat/tools/web_search.py`
- `backend/server/integrations/models.py` (add UserAISettings)
- `backend/server/integrations/views/ai_settings_view.py`
### Modified Files
- `backend/server/main/settings.py` - Add AI env vars
- `backend/server/chat/llm_client.py` - Config fallback, preference aggregation
- `backend/server/chat/views.py` - New suggestion endpoint, context injection
- `backend/server/chat/agent_tools.py` - Web search tool, refactor
- `frontend/src/lib/components/AITravelChat.svelte` - Context awareness, actions
- `frontend/src/lib/components/collections/CollectionItineraryPlanner.svelte` - Add button
- `frontend/src/routes/settings/+page.svelte` - AI preferences UI, model selection
- `frontend/src/routes/collections/[id]/+page.svelte` - Pass collection context
---
## Migration Path
1. **Phase 1 - Foundation** (WS1, WS2)
- Configuration infrastructure
- Preference UI
- No user-facing changes to chat yet
2. **Phase 2 - Day Suggestions** (WS3)
- New modal, new entry point
- Backend suggestion endpoint
- Can ship independently
3. **Phase 3 - Chat Improvements** (WS4, WS5)
- Context-aware chat
- Web search capability
- Better UX
4. **Phase 4 - Extensibility** (WS6)
- Plugin architecture
- Future integration prep
---
## Decisions (Confirmed)
| Decision | Choice | Rationale |
|----------|--------|-----------|
| Web search provider | **DuckDuckGo** | Free, no API key, good enough for travel info |
| Suggestion API | **Dedicated REST endpoint** | Simpler, faster, returns JSON directly |
| Multi-user conflicts | **List all preferences** | Transparency - AI navigates differing preferences |
---
## Out of Scope
- WSGI→ASGI migration (keep current async-in-sync pattern)
- Role-based permissions (all shared users have same access)
- Real-time collaboration (WebSocket sync)
- Mobile-specific optimizations

@@ -0,0 +1,248 @@
# Chat Provider Fixes
## Problem Statement
The AI chat feature is broken with multiple issues:
1. Rate limit errors from providers
2. "location is required" errors (tool calling issue)
3. "An unexpected error occurred while fetching trip details" errors
4. Models not being fetched properly for all providers
5. Potential authentication issues
## Root Cause Analysis
### Issue 1: Tool Calling Errors
The errors "location is required" and "An unexpected error occurred while fetching trip details" come from the agent tools (`search_places`, `get_trip_details`) being called with missing/invalid parameters. This suggests:
- The LLM is not properly understanding the tool schemas
- Or the model doesn't support function calling well
- Or there's a mismatch between how LiteLLM formats tools and what the model expects
### Issue 2: Models Not Fetched
The `models` endpoint in `ChatProviderCatalogViewSet` only handles:
- `openai` - uses OpenAI SDK to fetch live
- `anthropic/claude` - hardcoded list
- `gemini/google` - hardcoded list
- `groq` - hardcoded list
- `ollama` - calls local API
- `opencode_zen` - hardcoded list
All other providers return `{"models": []}`.
### Issue 3: Authentication Flow
1. Frontend sends request with `credentials: 'include'`
2. Backend gets user from session
3. `get_llm_api_key()` checks `UserAPIKey` model for user's key
4. Falls back to `settings.VOYAGE_AI_API_KEY` if user has no key and provider matches instance default
5. Key is passed to LiteLLM's `acompletion()`
Potential issues:
- Encryption key not configured correctly
- Key not being passed correctly to LiteLLM
- Provider-specific auth headers not being set
### Issue 4: LiteLLM vs Alternatives
Current approach (LiteLLM):
- Single library handles all providers
- Normalizes API calls across providers
- Built-in error handling and retries (if configured)
Alternative (Vercel AI SDK):
- Provider registry pattern with individual packages
- More explicit provider configuration
- Better TypeScript support
- But would require significant refactoring (backend is Python)
## Investigation Tasks
- [ ] Test the actual API calls to verify authentication
- [x] Check if models endpoint returns correct data
- [x] Verify tool schemas are being passed correctly
- [ ] Test with a known-working model (e.g., GPT-4o)
## Options
### Option A: Fix LiteLLM Integration (Recommended)
1. Add proper retry logic with `num_retries=2`
2. Add `supports_function_calling()` check before using tools
3. Expand models endpoint to handle more providers
4. Add better logging for debugging
### Option B: Replace LiteLLM with Custom Implementation
1. Use direct API calls per provider
2. More control but more maintenance
3. Significant development effort
### Option C: Hybrid Approach
1. Keep LiteLLM for providers it handles well
2. Add custom handlers for problematic providers
3. Medium effort, best of both worlds
## Status
### Completed (2026-03-09)
- [x] Implemented backend fixes for Option A:
1. `ChatProviderCatalogViewSet.models()` now fetches OpenCode Zen models dynamically from `{api_base}/models` using the configured provider API base and user API key; returns deduplicated model ids and logs fetch failures.
2. `stream_chat_completion()` now checks `litellm.supports_function_calling(model=resolved_model)` before sending tools and disables tools with a warning if unsupported.
3. Added LiteLLM transient retry configuration via `num_retries=2` on streaming completions.
4. Added request/error logging for provider/model/tool usage and API base/message count diagnostics.
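Items 2 and 3 above amount to gating tool usage on a capability check and adding retries. A minimal sketch, with the capability check injected as a callable so it stays testable offline (in the real code this would be `litellm.supports_function_calling`):

```python
def completion_kwargs_with_tools(model, tools, supports_fn_calling):
    """Build streaming-completion kwargs: always request num_retries=2,
    and include tools only when the capability check passes. The injected
    callable stands in for litellm.supports_function_calling."""
    kwargs = {"model": model, "num_retries": 2}
    if tools and supports_fn_calling(model):
        kwargs["tools"] = tools
    # When tools are dropped, the real code logs a warning here.
    return kwargs
```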
### Verification Results
- Models endpoint: Returns 36 models from OpenCode Zen API (was 5 hardcoded)
- Function calling check: gpt-5-nano=True, claude-sonnet-4-6=True, big-pickle=False, minimax-m2.5=False
- Syntax check: Passed for both modified files
- Frontend check: 0 errors, 6 warnings (pre-existing)
### Remaining Issues (User Action Required)
- Rate limits: Free tier has limits, user may need to upgrade or wait
- Tool calling: Some models (big-pickle, minimax-m2.5) don't support function calling - tools will be disabled for these models
## Follow-up Fixes (2026-03-09)
### Clarified Behavior
- Approved preference precedence: database-saved default provider/model beats any per-device `localStorage` override.
- Requirement: user AI preferences must be persisted through the existing `UserAISettings` backend API and applied by both the settings UI and chat send-message fallback logic.
### Planned Workstreams
- [x] `chat-loop-hardening`
- Acceptance: invalid required-argument tool calls do not loop repeatedly, tool-error messages are not replayed back into the model history, and SSE streams terminate cleanly with a user-visible error or `[DONE]`.
- Files: `backend/server/chat/views/__init__.py`, `backend/server/chat/agent_tools.py`, optional `backend/server/chat/llm_client.py`
- Notes: preserve successful tool flows; stop feeding `{"error": "location is required"}` / `{"error": "query is required"}` back into the next model turn.
- Completion (2026-03-09): Added required-argument tool-error detection in `send_message()` streaming loop, short-circuited those tool failures with a user-visible SSE error + terminal `[DONE]`, skipped persistence/replay of those invalid tool payloads (including historical cleanup at `_build_llm_messages()`), and tightened `search_places`/`web_search` tool descriptions to explicitly call out required non-empty args.
- Follow-up (2026-03-09): Fixed multi-tool-call consistency by persisting/replaying only the successful prefix of `tool_calls` when a later call fails required-arg validation; `_build_llm_messages()` now trims assistant `tool_calls` to only IDs that have kept (non-filtered) persisted tool messages.
  - Review verdict (2026-03-09): **APPROVED** (score 6).
    - WARNING 1 (multi-tool-call orphan): when the model returns N tool calls and call K fails required-param validation, calls 1..K-1 are already persisted but call K's result is not, leaving an orphaned `tool_calls` reference in the assistant message that may cause LLM API errors on the next conversation turn.
    - WARNING 2: `_build_llm_messages` filters tool-role error messages but does not filter/trim the corresponding assistant-message `tool_calls` array, creating the same orphan on historical replay.
    - Both warnings are low-likelihood (multi-tool required-param failures are rare) and gracefully degraded (next-turn errors are caught by `_safe_error_payload`).
    - SUGGESTION: the `get_weather` error `"dates must be a non-empty list"` does not match the `is/are required` regex and would not trigger the short-circuit (mitigated by the `MAX_TOOL_ITERATIONS` guard).
    - Also confirms the prior pre-existing bug (`tool_iterations` never incremented) is now fixed in this changeset.
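The required-argument short-circuit in this workstream can be sketched as a predicate over tool results; the regex below is an assumption inferred from the `"location is required"` / `"query is required"` examples and the reviewer's note that `"dates must be a non-empty list"` would not match:

```python
import re

# Assumed pattern for the short-circuit: tool errors of the form
# '<arg> is required' / '<args> are required' stop the loop instead of
# being replayed into the next model turn.
REQUIRED_ARG_RE = re.compile(r"\b\w+ (?:is|are) required\b")

def is_required_arg_error(tool_result):
    """True when a tool payload carries a required-argument error."""
    err = tool_result.get("error") if isinstance(tool_result, dict) else None
    return bool(err and REQUIRED_ARG_RE.search(err))
```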
- [x] `default-ai-settings`
- Acceptance: settings page shows default AI provider/model controls, saving persists via `UserAISettings`, chat UI initializes from saved preferences, and backend chat fallback uses saved defaults when request payload omits provider/model.
- Files: `frontend/src/routes/settings/+page.server.ts`, `frontend/src/routes/settings/+page.svelte`, `frontend/src/lib/types.ts`, `frontend/src/lib/components/AITravelChat.svelte`, `backend/server/chat/views/__init__.py`
- Notes: DB-saved defaults override browser-local model prefs.
### Completion Note (2026-03-09)
- Implemented DB-backed default AI settings end-to-end: settings page now loads/saves `UserAISettings` via `/api/integrations/ai-settings/`, with provider/model selectors powered by provider catalog + per-provider models endpoint.
- Chat initialization now treats saved DB defaults as authoritative initial provider/model; stale `voyage_chat_model_prefs` localStorage values no longer override defaults and are synchronized to the saved defaults.
- Backend `send_message` now uses saved `UserAISettings` only when request payload omits provider/model, preserving explicit request values and existing provider validation behavior.
- Follow-up fix: backend model fallback now only applies `preferred_model` when the resolved provider matches `preferred_provider`, preventing cross-provider default model mismatches when users explicitly choose another provider.
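The resolution precedence described in this note can be sketched as a pure function; `settings` here is an illustrative dict standing in for the `UserAISettings` row, and the `"openai"` fallback mirrors the precedence documented later in the tester validation:

```python
def resolve_provider_model(requested_provider, requested_model, settings):
    """Precedence sketch: request payload -> saved UserAISettings ->
    'openai' fallback. preferred_model applies only when the resolved
    provider matches preferred_provider (the cross-provider fix above)."""
    provider = requested_provider or settings.get("preferred_provider") or "openai"
    model = requested_model
    if not model and provider == settings.get("preferred_provider"):
        model = settings.get("preferred_model")
    return provider, model
```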
- [x] `suggestion-add-flow`
- Acceptance: day suggestions use the user-configured/default provider/model instead of hardcoded OpenAI values, and adding a suggested place creates a location plus itinerary entry successfully.
- Files: `backend/server/chat/views/day_suggestions.py`, `frontend/src/lib/components/collections/ItinerarySuggestionModal.svelte`
- Notes: normalize suggestion payloads needed by `/api/locations/` and preserve existing add-item event wiring.
- Completion (2026-03-09): Day suggestions now resolve provider/model in precedence order (request payload → `UserAISettings` defaults → instance/provider defaults) without OpenAI hardcoding; modal now normalizes suggestion objects and builds stable `/api/locations/` payloads (name/location/description/rating) before dispatching existing `addItem` flow.
- Follow-up (2026-03-09): Removed remaining OpenAI-specific `gpt-4o-mini` fallback from day suggestions LLM call; endpoint now uses provider-resolved/default model only and fails safely when no model is configured.
- Follow-up (2026-03-09): Removed unsupported `temperature` from day suggestions requests, normalized bare `opencode_zen` model ids through the gateway (`openai/<model>`), and switched day suggestions error responses to the same sanitized categories used by chat. Browser result: the suggestion modal now completes normally (empty-state or rate-limit message) instead of crashing with a generic 500.
## Tester Validation — `default-ai-settings` (2026-03-09)
### STATUS: PASS
**Evidence from lead:** Authenticated POST `/api/integrations/ai-settings/` returned 200 and persisted; subsequent GET returned same values; POST `/api/chat/conversations/{id}/send_message/` with no provider/model in body used `preferred_provider='opencode_zen'` and `preferred_model='gpt-5-nano'` from DB, producing valid SSE stream.
**Standard pass findings:**
- `UserAISettings` model, serializer, and `UserAISettingsViewSet` are correct. Upsert logic in `perform_create` handles first-write and update-in-place correctly (single row per user via OneToOneField).
- `list()` returns `[serializer.data]` (wrapped array), which the frontend expects as `settings[0]` — contract matches.
- Backend `send_message` precedence: `requested_provider` → `preferred_provider` (if available) → `"openai"` fallback. `model` only inherits `preferred_model` when `provider == preferred_provider` — cross-provider default mismatch is correctly prevented (follow-up fix confirmed).
- Settings page initializes `defaultAiProvider`/`defaultAiModel` from SSR-loaded `aiSettings` and validates against provider catalog on `onMount`. If saved provider is no longer configured, it falls back to first configured provider.
- `AITravelChat.svelte` fetches AI settings on mount, applies as authoritative default, and writes to `localStorage` (sync direction is DB → localStorage, not the reverse).
- The `send_message` handler in the frontend always sends the current UI `selectedProvider`/`selectedModel`, not localStorage values directly — these are only used for UI state initialization, not bypassing DB defaults.
- All i18n keys present in `en.json`: `default_ai_settings_title`, `default_ai_settings_desc`, `default_ai_no_providers`, `default_ai_save`, `default_ai_settings_saved`, `default_ai_settings_error`, `default_ai_provider_required`.
- Django integration tests (5/5) pass; no tests exist for `UserAISettings` specifically — residual regression risk noted.
**Adversarial pass findings (no hypothesis uncovered a bug):**
1. **Hypothesis: model saved for provider A silently applied when user explicitly sends provider B (cross-provider model leak).** Checked `send_message` lines 218–220: `model = requested_model; if model is None and preferred_model and provider == preferred_provider: model = preferred_model`. When `requested_provider=B` and `preferred_provider=A`, `provider == preferred_provider` is false → `model` stays `None`. **Not vulnerable.**
2. **Hypothesis: null/empty preferred_model or preferred_provider in DB triggers error.** Serializer allows `null` on both fields (CharField with `blank=True, null=True`). Backend normalizes with `.strip().lower()` inside `(ai_settings.preferred_provider or "").strip().lower()` guard. Frontend uses `?? ''` coercion. **Handled safely.**
3. **Hypothesis: second POST to `/api/integrations/ai-settings/` creates a second row instead of updating.** `UserAISettings` uses `OneToOneField(user, ...)` + `perform_create` explicitly fetches and updates existing row. A second POST cannot produce a duplicate. **Not vulnerable.**
4. **Hypothesis: `initializeDefaultAiSettings` silently overwrites the saved DB provider with the first catalog provider if the saved provider is temporarily unavailable (e.g., API key deleted).** Confirmed: lines 119–121 silently auto-select the first available provider and blank the model if the saved provider is gone. This affects display only (not DB); the save action is still explicit. **Acceptable behavior; low risk.**
5. **Hypothesis: frontend sends `model: undefined` (vs `model: null`) when no model selected, causing backend to ignore it.** `requested_model = (request.data.get("model") or "").strip() or None` — if `undefined`/absent from JSON body, `get("model")` returns `None`, which becomes `None` after the guard. `model` variable falls through to default logic. **Works correctly.**
**MUTATION_ESCAPES: 1/8** — the regex `(is|are) required` in `_is_required_param_tool_error` (chat-loop-hardening code) would escape if a future required-arg error used a different pattern, but this is unrelated to `default-ai-settings` scope.
**Zero automated test coverage for `UserAISettings` CRUD + precedence logic.** Backend logic is covered only by the lead's live-run evidence. Recommended follow-up: add Django TestCase covering (a) upsert idempotency, (b) provider/model precedence in `send_message`, (c) cross-provider model guard.
## Tester Validation — `chat-loop-hardening` (2026-03-09)
### STATUS: PASS
**Evidence from lead (runtime):** Authenticated POST to `send_message` with patched upstream stream emitting `search_places {}` (missing required `location`) returned status 200, SSE body `data: {"tool_calls": [...]}` → `data: {"error": "...", "error_category": "tool_validation_error"}` → `data: [DONE]`. Persisted DB state after that turn: only `('user', None, 'restaurants please')` + `('assistant', None, '')` — no invalid `role=tool` error row.
**Standard pass findings:**
- `_is_required_param_tool_error`: correctly matches `location is required`, `query is required`, `collection_id is required`, `collection_id, name, latitude, and longitude are required`, `latitude and longitude are required`. Does NOT match non-required-arg errors (`dates must be a non-empty list`, `Trip not found`, `Unknown tool: foo`, etc.). All 18 test cases pass.
- `_is_required_param_tool_error_message_content`: correctly parses JSON-wrapped content from persisted DB rows and delegates to above. Handles non-JSON, non-dict JSON, and `error: null` safely. All 7 test cases pass.
- Orphan trimming in `_build_llm_messages`: when assistant has `tool_calls=[A, B]` and B's persisted tool row contains a required-param error, the rebuilt `assistant.tool_calls` retains only `[A]` and tool B's row is filtered. Verified for both the multi-tool case and the single-tool (lead's runtime) scenario.
- SSE stream terminates with `data: [DONE]` immediately after the `tool_validation_error` event — confirmed by the code path at lines 425–426, which `return`s the generator.
- `MAX_TOOL_ITERATIONS = 10` correctly set; `tool_iterations` counter is incremented on each tool iteration (pre-existing bug confirmed fixed).
- `_merge_tool_call_delta` handles `None`, `[]`, missing `index`, and malformed argument JSON without crash.
- Full Django test suite: 24/30 pass; 6/30 fail (all pre-existing: 2 user email key errors + 4 geocoding API mock errors). Zero regressions introduced by this changeset.
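As a sketch, the short-circuit classifier behaves like the following (the exact pattern in `llm_client.py` may differ; this mirrors the documented match/non-match cases):

```python
import re

# Sketch of the required-param short-circuit classifier: matches plain
# "<params> is/are required" sentences and nothing else.
_REQUIRED_PARAM_RE = re.compile(r"^[\w\s,]+ (is|are) required\.?$")


def is_required_param_tool_error(error) -> bool:
    """True only for required-argument validation errors from tools."""
    if not isinstance(error, str):
        return False
    return _REQUIRED_PARAM_RE.fullmatch(error.strip()) is not None
```

Punctuation outside the word/space/comma class (semicolons, slashes) breaks `fullmatch`, which is what defeats the injection attempts listed in the adversarial pass below.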
**Adversarial pass findings:**
1. **Hypothesis: `get_weather` with empty `dates=[]` bypasses short-circuit and loops.** `get_weather` returns `{"error": "dates must be a non-empty list"}` which does NOT match the `is/are required` regex → not short-circuited. Falls through to `MAX_TOOL_ITERATIONS` guard (10 iterations max). **Known gap, mitigated by guard — confirmed matches reviewer WARNING.**
2. **Hypothesis: regex injection via crafted error text creates false-positive short-circuit.** Tested `'x is required; rm -rf /'` (semicolon breaks `fullmatch`), newline injection, Cyrillic lookalike. All return `False` correctly. **Not vulnerable.**
3. **Hypothesis: `assistant.tool_calls=[]` (empty list) pollutes rebuilt messages.** `filtered_tool_calls` is `[]` → the `if filtered_tool_calls:` guard prevents empty `tool_calls` key from being added to the payload. **Not vulnerable.**
4. **Hypothesis: `tool message content = None` is incorrectly classified as required-param error.** `_is_required_param_tool_error_message_content(None)` returns `False` (not a string → returns early). **Not vulnerable.**
5. **Hypothesis: `_build_required_param_error_event` crashes on None/missing `result`.** `result.get("error")` is guarded by `if isinstance(result, dict)` in caller; the static method itself handles `None` result via `isinstance` check and produces `error=""`. **No crash.**
6. **Hypothesis: multi-tool scenario — only partial `tool_calls` prefix trimmed correctly.** Tested assistant with `[A, B]` where A succeeds and B fails: rebuilt messages contain `tool_calls=[A]` only. Tested assistant with only `[X]` failing: rebuilt messages contain `tool_calls=None` (key absent). **Both correct.**
**MUTATION_ESCAPES: 1/7** — `get_weather` returning `"dates must be a non-empty list"` not triggering the short-circuit. This is a known, reviewed, accepted gap (mitigated by `MAX_TOOL_ITERATIONS`). No other mutation checks escaped detection.
**FLAKY: 0**
**COVERAGE: N/A** — no automated test suite exists for the `chat` app; all validation is via unit-level method tests + lead's live-run evidence. Recommended follow-up: add Django `TestCase` for `send_message` streaming loop covering (a) single required-arg tool failure → short-circuit, (b) multi-tool partial success, (c) `MAX_TOOL_ITERATIONS` exhaustion, (d) `_build_llm_messages` orphan-trimming round-trip.
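The orphan-trimming round-trip recommended above reduces to this invariant (illustrative helper name; the real `_build_llm_messages` also filters roles and parses JSON-wrapped content):

```python
def trim_orphaned_tool_calls(assistant_msg: dict, failed_tool_ids: set[str]) -> dict:
    """Drop tool_calls whose tool-result rows were filtered out, so the
    rebuilt history never references a result that is not present."""
    calls = assistant_msg.get("tool_calls") or []
    kept = [c for c in calls if c.get("id") not in failed_tool_ids]
    out = {k: v for k, v in assistant_msg.items() if k != "tool_calls"}
    if kept:  # omit the key entirely rather than emit an empty list
        out["tool_calls"] = kept
    return out
```

This covers both verified scenarios: a partial prefix survives (`[A, B]` with B failed keeps `[A]`), and a lone failed call drops the key entirely.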
## Tester Validation — `suggestion-add-flow` (2026-03-09)
### STATUS: PASS
**Test run:** 30 Django tests (24 pass, 6 fail — all 6 pre-existing: 2 user email key errors + 4 geocoding mock failures). Zero new regressions. 44 targeted unit-level checks (42 pass, 2 fail — both failures confirmed as test-script defects, not code bugs).
**Standard pass findings:**
- `_resolve_provider_and_model` precedence verified end-to-end: explicit request payload → `UserAISettings.preferred_provider/model` → `settings.VOYAGE_AI_PROVIDER/MODEL` → provider-config default. All 4 precedence levels tested and confirmed correct.
- Cross-provider model guard confirmed: when request provider ≠ `preferred_provider`, the `preferred_model` is NOT applied (prevents `gpt-5-nano` from leaking to anthropic, etc.).
- Null/empty `preferred_provider`/`preferred_model` in `UserAISettings` handled safely (`or ""` coercion guards throughout).
- JSON parsing in `_get_suggestions_from_llm` is robust: handles clean JSON array, embedded JSON in prose, markdown-wrapped JSON, plain text (no JSON), empty string, `None` content — all return correct results (empty list or parsed list). Response capped at 5 items. Single-dict LLM response wrapped in list correctly.
- `normalizeSuggestionItem` normalization verified: non-dict returns `null`, missing name+location returns `null`, field aliases (`title` → `name`, `address` → `location`, `summary` → `description`, `score` → `rating`, `whyFits` → `why_fits`) all work. Whitespace-only name falls back to location.
- `rating=0` correctly preserved in TypeScript via `??` (nullish coalescing at line 171), not dropped. The Python port used `or` which drops `0`, but that's a test-script defect only.
- `buildLocationPayload` constructs a valid `LocationSerializer`-compatible payload: `name`, `location`, `description`, `rating`, `collections`, `is_public`. Falls back to collection location when suggestion has none.
- `handleAddSuggestion` → POST `/api/locations/` → `dispatch('addItem', {type:'location', itemId, updateDate:false})` wiring confirmed by code inspection (lines 274–294). Parent `CollectionItineraryPlanner` handler at line 2626 calls `addItineraryItemForObject`.
**Adversarial pass findings:**
1. **Hypothesis: cross-provider model leak (gpt-5-nano applied to anthropic).** Tested `request.provider=anthropic` + `UserAISettings.preferred_provider=opencode_zen`, `preferred_model=gpt-5-nano`. Result: `model_from_user_defaults=None` (because `provider != preferred_provider`). **Not vulnerable.**
2. **Hypothesis: null/empty DB prefs cause exceptions.** `preferred_provider=None`, `preferred_model=None` — all guards use `(value or "").strip()` pattern. Falls through to `settings.VOYAGE_AI_PROVIDER` safely. **Not vulnerable.**
3. **Hypothesis: all-None provider/model/settings causes exception in `_resolve_provider_and_model`.** Tested with `is_chat_provider_available=False` everywhere, all settings None. Returns `(None, None)` without exception; caller checks `is_chat_provider_available(provider)` and returns 503. **Not vulnerable.**
4. **Hypothesis: missing API key causes silent empty result instead of error.** `get_llm_api_key` returns `None` → raises `ValueError("No API key available")` → caught by `post()` try/except → returns 500. **Explicit error path confirmed.**
5. **Hypothesis: no model configured causes silent failure.** `model=None` + empty `provider_config` → raises `ValueError("No model configured for provider")` → 500. **Explicit error path confirmed.**
6. **Hypothesis: `normalizeSuggestionItem` with mixed array (nulls, strings, invalid dicts).** `[None, {name:'A'}, 'string', {description:'only'}, {name:'B'}]` → after normalize+filter: 2 valid items. **Correct.**
7. **Hypothesis: rating=0 dropped by falsy check.** Actual TS uses `item.rating ?? item.score` (nullish coalescing, not `||`). `normalizeRating(0)` returns `0` (finite number check). **Not vulnerable in actual code.**
8. **Hypothesis: XSS in name field.** `<script>alert(1)</script>` passes through as a string; Django serializer stores as text, template rendering escapes it. **Not vulnerable.**
9. **Hypothesis: double-click `handleAddSuggestion` creates duplicate location.** `isAdding` guard at line 266 exits early if `isAdding` is truthy — prevents re-entrancy. **Protected by UI-state guard.**
**Known low-severity defect (pre-existing, not introduced by this workstream):** LLM-generated `name`/`location` fields are not truncated before passing to `LocationSerializer` (max_length=200). If LLM returns a name > 200 chars, the POST to `/api/locations/` returns 400 and the frontend shows a generic error. Risk is very low in practice (LLM names are short). Recommended fix: add `.slice(0, 200)` in `buildLocationPayload` for `name` and `location` fields.
**MUTATION_ESCAPES: 1/9** — `rating=0` would escape mutation detection in naive Python tests (but is correctly handled in the actual TS `??` code). No logic mutations escape in the backend Python code.
**FLAKY: 0**
**COVERAGE: N/A** — no automated suite for `chat` or `suggestions` app. All validation via unit-level method tests + provider/model resolution checks. Recommended follow-up: add Django `TestCase` for `DaySuggestionsView.post()` covering (a) missing required fields → 400, (b) invalid category → 400, (c) unauthorized collection → 403, (d) provider unavailable → 503, (e) LLM exception → 500, (f) happy path → 200 with `suggestions` array.
**Cleanup required:** Two test artifact files left on host (not git-tracked, safe to delete):
- `/home/alex/projects/voyage/test_suggestion_flow.py`
- `/home/alex/projects/voyage/suggestion-modal-error-state.png`

---
# Plan: Fix OpenCode Zen connection errors in AI travel chat
## Clarified requirements
- User configured provider `opencode_zen` in Settings with API key.
- Chat attempts return a generic connection error.
- Goal: identify root cause and implement a reliable fix for OpenCode Zen chat connectivity.
- Follow-up: add model selection in chat composer (instead of forced default model) and persist chosen model per user.
## Acceptance criteria
- Sending a chat message with provider `opencode_zen` no longer fails with a connection error due to Voyage integration/configuration.
- Backend provider routing for `opencode_zen` uses a validated OpenAI-compatible request shape and model format.
- Frontend surfaces backend/provider errors with actionable detail (not only generic connection failure) when available.
- Validation commands run successfully (or known project-expected failures only) and results recorded.
## Tasks
- [ ] Discovery: inspect current OpenCode Zen provider configuration and chat request pipeline (Agent: explorer)
- [ ] Discovery: verify OpenCode Zen API compatibility requirements vs current implementation (Agent: researcher)
- [ ] Discovery: map model-selection edit points and persistence path (Agent: explorer)
- [x] Implement fix for root cause + model selection/persistence (Agent: coder)
- [x] Correctness review of targeted changes (Agent: reviewer) — APPROVED (score 0)
- [x] Standard validation run and targeted chat-path checks (Agent: tester)
- [x] Documentation and knowledge sync for provider troubleshooting notes (Agent: librarian)
## Researcher findings
**Root cause**: Two mismatches in `backend/server/chat/llm_client.py` lines 59-64:
1. **Invalid model ID** — `default_model: "openai/gpt-4o-mini"` does not exist on OpenCode Zen. Zen has its own model catalog (gpt-5-nano, glm-5, kimi-k2.5, etc.). Sending `gpt-4o-mini` to the Zen API results in a model-not-found error.
2. **Endpoint routing** — GPT models on Zen use `/responses` endpoint, but LiteLLM's `openai/` prefix routes through the OpenAI Python client which appends `/chat/completions`. The `/chat/completions` endpoint only works for OpenAI-compatible models (GLM, Kimi, MiniMax, Qwen, Big Pickle).
**Error flow**: LiteLLM exception → caught by generic handler at line 274 → yields `"An error occurred while processing your request"` SSE → frontend shows either this message or falls back to `$t('chat.connection_error')`.
**Recommended fix** (primary — `llm_client.py:62`):
- Change `"default_model": "openai/gpt-4o-mini"` → `"openai/gpt-5-nano"` (free model, confirmed to work via `/chat/completions` by real-world usage in multiple repos)
**Secondary fix** (error surfacing — `llm_client.py:274-276`):
- Extract meaningful error info from LiteLLM exceptions (status_code, message) instead of swallowing all details into a generic message
Full analysis: [research/opencode-zen-connection-debug.md](../research/opencode-zen-connection-debug.md)
## Retry tracker
- OpenCode Zen connection fix task: 0
## Implementation checkpoint (coder)
- Added composer-level model selection + per-provider browser persistence in `frontend/src/lib/components/AITravelChat.svelte` using localStorage key `voyage_chat_model_prefs`.
- Added `chat.model_label` and `chat.model_placeholder` i18n keys in `frontend/src/locales/en.json`.
- Extended `send_message` backend intake in `backend/server/chat/views.py` to read optional `model` (`empty -> None`) and pass it to streaming.
- Updated `backend/server/chat/llm_client.py` to:
- switch `opencode_zen` default model to `openai/gpt-5-nano`,
- accept optional `model` override in `stream_chat_completion(...)`,
- apply safe provider/model compatibility guard (skip strict prefix check for custom `api_base` gateways),
- map known LiteLLM exception classes to sanitized user-safe error categories/messages,
- include `tools` / `tool_choice` kwargs only when tools are present.
See related analysis in [research notes](../research/opencode-zen-connection-debug.md#model-selection-implementation-map).
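The sanitized error mapping works along these lines (the exception class names mirror LiteLLM's hierarchy but are redefined locally so the sketch stays dependency-free; category strings and messages are illustrative):

```python
# Hypothetical stand-ins for LiteLLM exception classes, so the sketch
# runs without the litellm dependency.
class AuthenticationError(Exception): ...
class NotFoundError(Exception): ...
class RateLimitError(Exception): ...
class BadRequestError(Exception): ...

_SAFE_ERRORS = {
    AuthenticationError: ("auth_error", "Authentication with the AI provider failed."),
    NotFoundError: ("model_not_found", "The requested model was not found."),
    RateLimitError: ("rate_limited", "The AI provider rate limit was exceeded."),
    BadRequestError: ("bad_request", "The AI provider rejected the request."),
}


def safe_error_payload(exc: Exception) -> dict:
    """Never forward str(exc): map known classes to hardcoded strings."""
    for cls, (category, message) in _SAFE_ERRORS.items():
        if isinstance(exc, cls):
            return {"error": message, "error_category": category}
    return {"error": "An error occurred while processing your request.",
            "error_category": "unknown_error"}
```

The key property, per the critic's guardrail, is that nothing from `exc.message` or `exc.args` ever reaches the payload, so `api_base` URLs or request fragments cannot leak.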
---
## Explorer findings (model selection)
**Date**: 2026-03-08
**Full detail**: [research/opencode-zen-connection-debug.md — Model selection section](../research/opencode-zen-connection-debug.md#model-selection-implementation-map)
### Persistence decision: `localStorage` (no migration)
**Recommended**: store `{ [provider_id]: model_string }` in `localStorage` key `voyage_chat_model_prefs`.
Rationale:
- No existing per-user model preference field anywhere in DB/API
- Adding a DB column to `CustomUser` requires a migration + serializer + API change → 4+ files
- `UserAPIKey` stores only encrypted API keys (not preferences)
- Model preference is UI-volatile (the model catalog changes; stale DB entries require cleanup)
- `localStorage` is already used elsewhere in the frontend for similar ephemeral UI state
- Model preference is not sensitive; persisting client-side is consistent with how the provider selector already works (no backend persistence either)
- **No migration required** for localStorage approach
### File-by-file edit plan (exact symbols)
#### Backend: `backend/server/chat/llm_client.py`
- `stream_chat_completion(user, messages, provider, tools=None)` → add `model: str | None = None` parameter
- Line 226: `"model": provider_config["default_model"]` → `"model": model or provider_config["default_model"]`
- Add validation: if `model` is not `None`, check it starts with a valid LiteLLM provider prefix (or matches a known-safe pattern); reject bare model strings that don't include provider prefix
#### Backend: `backend/server/chat/views.py`
- `send_message()` (line 104): extract `model = (request.data.get("model") or "").strip() or None`
- Pass `model=model` to `stream_chat_completion()` call (line 144)
- Add validation: if `model` is provided, confirm it belongs to the same provider family (prefix check); return 400 if mismatch
#### Frontend: `frontend/src/lib/types.ts`
- No change needed — `ChatProviderCatalogEntry.default_model` already exists
#### Frontend: `frontend/src/lib/components/AITravelChat.svelte`
- Add `let selectedModel: string = ''` (reset when provider changes)
- Add reactive: `$: selectedProviderEntry = chatProviders.find(p => p.id === selectedProvider) ?? null`
- Add reactive: `$: { if (selectedProviderEntry) { selectedModel = loadModelPref(selectedProvider) || selectedProviderEntry.default_model || ''; } }`
- `sendMessage()` line 121: body `{ message: msgText, provider: selectedProvider }` → `{ message: msgText, provider: selectedProvider, model: selectedModel }`
- Add model input field in the composer toolbar (near the provider `<select>`, lines 290–299): `<input type="text" class="input input-bordered input-sm" bind:value={selectedModel} placeholder={selectedProviderEntry?.default_model ?? ''} />`
- Add `loadModelPref(provider)` / `saveModelPref(provider, model)` functions using `localStorage` key `voyage_chat_model_prefs`
- Add `$: saveModelPref(selectedProvider, selectedModel)` reactive to persist on change
#### Frontend: `frontend/src/locales/en.json`
- Add `"chat.model_label"`: `"Model"` (label for model input)
- Add `"chat.model_placeholder"`: `"Default model"` (placeholder when empty)
### Validation constraints / risks
1. **Model-provider prefix mismatch**: `stream_chat_completion` uses `provider_config["default_model"]` prefix to route via LiteLLM. If user passes `openai/gpt-5-nano` for the `anthropic` provider, LiteLLM will try to call OpenAI with Anthropic credentials. Backend must validate that the supplied model string starts with the expected provider prefix or reject it.
2. **Free-text model field**: No enumeration from backend; user types any string. Validation (prefix check) is the only guard.
3. **localStorage staleness**: If a provider removes a model, the stored preference produces a LiteLLM error — the error surfacing fix (Fix #2 in existing plan) makes this diagnosable.
4. **Empty string vs null**: Frontend should send `model: selectedModel || undefined` (omit key if empty) to preserve backend default behavior.
### No migration required
All backend changes are parameter additions to existing function signatures + optional request field parsing. No DB schema changes.
---
## Explorer findings
**Date**: 2026-03-08
**Detail**: Full trace in [research/opencode-zen-connection-debug.md](../research/opencode-zen-connection-debug.md)
### End-to-end path (summary)
```
AITravelChat.svelte:sendMessage()
POST /api/chat/conversations/<id>/send_message/ { message, provider:"opencode_zen" }
→ +server.ts:handleRequest() [CSRF refresh + proxy, SSE passthrough lines 94-98]
→ views.py:ChatViewSet.send_message() [validates provider, saves user msg]
→ llm_client.py:stream_chat_completion() [builds kwargs, calls litellm.acompletion]
→ litellm.acompletion(model="openai/gpt-4o-mini", api_base="https://opencode.ai/zen/v1")
→ POST https://opencode.ai/zen/v1/chat/completions ← FAILS: model not on Zen
→ except Exception at line 274 → data:{"error":"An error occurred..."}
← frontend shows error string inline (or "Connection error." on network failure)
```
### Ranked root causes confirmed by code trace
1. **[CRITICAL] Wrong default model** (`openai/gpt-4o-mini` is not a Zen model)
- `backend/server/chat/llm_client.py:62`
- Fix: change to `"openai/gpt-5-nano"` (free, confirmed OpenAI-compat via `/chat/completions`)
2. **[SIGNIFICANT] Generic exception handler masks provider errors**
- `backend/server/chat/llm_client.py:274-276`
- Bare `except Exception:` swallows LiteLLM structured exceptions (NotFoundError, AuthenticationError, etc.)
- Fix: extract `exc.status_code` / `exc.message` and forward to SSE error payload
3. **[SIGNIFICANT] WSGI + per-request event loop for async LiteLLM**
- Backend runs **Gunicorn WSGI** (`supervisord.conf:11`); no ASGI entry point exists
- `views.py:66-76` `_async_to_sync_generator` creates `asyncio.new_event_loop()` per request
- LiteLLM httpx sessions may not be compatible with per-call new loops → potential connection errors on the second+ tool iteration
- Fix: wrap via `asyncio.run()` or migrate to ASGI (uvicorn)
4. **[MINOR] `tool_choice: None` / `tools: None` passed as kwargs when unused**
- `backend/server/chat/llm_client.py:227-229`
- Fix: conditionally include keys only when tools are present
5. **[MINOR] Synchronous ORM call inside async generator**
- `backend/server/chat/llm_client.py:217``get_llm_api_key()` calls `UserAPIKey.objects.get()` synchronously
- Fine under WSGI+new-event-loop but technically incorrect for async context
- Fix: wrap with `sync_to_async` or move key lookup before entering async boundary
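Both remediation options can be sketched with the stdlib alone (`asyncio.to_thread` plays the role of asgiref's `sync_to_async` here; the key-lookup function is a stand-in for the real ORM call):

```python
import asyncio


def get_llm_api_key_sync(user_id: int) -> str:
    # Stand-in for the synchronous UserAPIKey.objects.get(...) lookup.
    return f"key-for-{user_id}"


async def stream_chat(user_id: int):
    # Option A: push the blocking ORM call onto a worker thread so it
    # never runs directly on the event loop (asgiref's sync_to_async
    # serves the same role in Django code).
    api_key = await asyncio.to_thread(get_llm_api_key_sync, user_id)
    yield f"using {api_key}"


def run_stream(user_id: int) -> list[str]:
    # Option B: one asyncio.run() per request, instead of a hand-built
    # new_event_loop() that may strand httpx sessions between calls.
    async def collect():
        return [chunk async for chunk in stream_chat(user_id)]
    return asyncio.run(collect())
```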
### Minimal edit points for a fix
| Priority | File | Location | Change |
|---|---|---|---|
| 1 (required) | `backend/server/chat/llm_client.py` | line 62 | `"default_model": "openai/gpt-5-nano"` |
| 2 (recommended) | `backend/server/chat/llm_client.py` | lines 274-276 | Extract `exc.status_code`/`exc.message` for user-facing error |
| 3 (recommended) | `backend/server/chat/llm_client.py` | lines 225-234 | Only include `tools`/`tool_choice` keys when tools are provided |
---
## Critic gate
**VERDICT**: APPROVED
**Date**: 2026-03-08
**Reviewer**: critic agent
### Rationale
The plan is well-scoped, targets a verified root cause with clear code references, and all three changes are in a single file (`llm_client.py`) within the same request path. This is a single coherent bug fix, not a multi-feature plan — no decomposition required.
### Assumption challenges
1. **`gpt-5-nano` validity on Zen** — The researcher claims this model is confirmed via GitHub usage patterns, but there is no live API verification. The risk is mitigated by Fix #2 (error surfacing), which would make any remaining model mismatch immediately diagnosable. **Accepted with guardrail**: coder must add a code comment noting the model was chosen based on research, and tester must verify the error path produces a meaningful message if the model is still wrong.
2. **`@mdi/js` build failure is NOT a baseline issue** — `@mdi/js` is a declared dependency in `package.json:44` but `node_modules/` is absent in this worktree. Running `bun install` will resolve this. **Guardrail**: Coder must run `bun install` before the validation pipeline; do not treat this as a known/accepted failure.
3. **Error surfacing may leak sensitive info** — Forwarding raw `exc.message` from LiteLLM exceptions could expose `api_base` URLs, internal config, or partial request data. Prior security review (decisions.md:103) already flagged `api_base` leakage as unnecessary. **Guardrail**: The error surfacing fix must sanitize exception messages — use only `exc.status_code` and a generic category (e.g., "authentication error", "model not found", "rate limit exceeded"), NOT raw `exc.message`. Map known LiteLLM exception types to safe user-facing descriptions.
### Scope guardrails for implementation
1. **In scope**: Fixes #1, #2, #3 from the plan table (model name, error surfacing, tool_choice cleanup) — all in `backend/server/chat/llm_client.py`.
2. **Out of scope**: Fix #3 from Explorer findings (WSGI→ASGI migration), Fix #5 (sync_to_async ORM). These are structural improvements, not root cause fixes.
3. **No frontend changes** unless the error message format changes require corresponding updates to `AITravelChat.svelte` parsing — verify and include only if needed.
4. **Error surfacing must sanitize**: Map LiteLLM exception classes (`NotFoundError`, `AuthenticationError`, `RateLimitError`, `BadRequestError`) to safe user-facing categories. Do NOT forward raw `exc.message` or `str(exc)`.
5. **Validation**: Run `bun install` first, then full pre-commit checklist (`format`, `lint`, `check`, `build`). Backend `manage.py check` must pass. If possible, manually test the chat SSE error path with a deliberately bad model name to confirm error surfacing works.
6. **No new dependencies, no migrations, no schema changes** — none expected and none permitted for this fix.
---
## Reviewer security verdict
**VERDICT**: APPROVED
**LENS**: Security
**REVIEW_SCORE**: 3
**Date**: 2026-03-08
### Security goals evaluated
| Goal | Status | Evidence |
|---|---|---|
| 1. Error handling doesn't leak secrets/api_base/raw internals | ✅ PASS | `_safe_error_payload()` maps exception classes to hardcoded user-safe strings; no `str(exc)`, `exc.message`, or `exc.args` forwarded. Logger.exception at line 366 is server-side only. Critic guardrail (decisions.md:189) fully satisfied. |
| 2. Model override input can't bypass provider constraints dangerously | ✅ PASS | Model string used only as JSON field in `litellm.acompletion()` kwargs. No SQL, no shell, no eval, no path traversal. `_is_model_override_compatible()` validates prefix for standard providers. Gateway providers (`api_base` set) skip prefix check — correct by design, worst case is provider returns an error caught by sanitized handler. |
| 3. No auth/permission regressions in send_message | ✅ PASS | `IsAuthenticated` + `get_queryset(user=self.request.user)` unchanged. New `model` param is additive-only, doesn't bypass existing validation. Tool execution scopes all DB queries to `user=user`. |
| 4. localStorage stores no sensitive values | ✅ PASS | Key `voyage_chat_model_prefs` stores `{provider_id: model_string}` only. SSR-safe guards present. Try/catch on JSON parse/write. |
### Findings
**CRITICAL**: (none)
**WARNINGS**:
- `[llm_client.py:194,225]` `api_base` field exposed in provider catalog response to frontend — pre-existing from prior consolidated review (decisions.md:103), not newly introduced. Server-defined constants only (not user-controllable), no SSRF. Frontend type includes field but never renders or uses it. (confidence: MEDIUM)
**SUGGESTIONS**:
1. Consider adding a `max_length` check on the `model` parameter in `views.py:114` (e.g., reject if >200 chars) as defense-in-depth against pathological inputs, though Django's request size limits provide a baseline guard.
2. Consider omitting `api_base` from the provider catalog response to frontend since the frontend never uses this value (pre-existing — tracked since prior security review).
### Prior findings cross-check
- **Critic guardrail** (decisions.md:119-123 — "Error surfacing must NOT forward raw exc.message"): **CONFIRMED** — implementation uses class-based dispatch to hardcoded strings.
- **Prior security review** (decisions.md:98-115 — api_base exposure, provider validation, IDOR checks): **CONFIRMED** — all findings still valid, no regressions.
- **Explorer model-provider prefix mismatch warning** (plan lines 108-109): **CONFIRMED** — `_is_model_override_compatible()` implements the recommended validation.
### Tracker states
- [x] Security goal 1: sanitized error handling (PASS)
- [x] Security goal 2: model override safety (PASS)
- [x] Security goal 3: auth/permission integrity (PASS)
- [x] Security goal 4: localStorage safety (PASS)
---
## Reviewer correctness verdict
**VERDICT**: APPROVED
**LENS**: Correctness
**REVIEW_SCORE**: 0
**Date**: 2026-03-08
### Requirements verification
| Requirement | Status | Evidence |
|---|---|---|
| Chat composer model selection | ✅ PASS | `AITravelChat.svelte:346-353` — text input bound to `selectedModel`, placed in composer header next to provider selector. Disabled when no providers available. |
| Per-provider browser persistence | ✅ PASS | `loadModelPref`/`saveModelPref` (lines 60-92) use `localStorage` key `voyage_chat_model_prefs`. Provider change loads saved preference via `initializedModelProvider` sentinel (lines 94-98). User edits auto-save via reactive block (lines 100-102). JSON parse errors caught. SSR guards present. |
| Optional model passed to backend | ✅ PASS | Frontend sends `model: selectedModel.trim() || undefined` (line 173). Backend extracts `model = (request.data.get("model") or "").strip() or None` (views.py:114). Passed as `model=model` to `stream_chat_completion` (views.py:150). |
| Model used as override in backend | ✅ PASS | `completion_kwargs["model"] = model or provider_config["default_model"]` (llm_client.py:316). Null/empty correctly falls back to provider default. |
| No regressions in provider selection/send flow | ✅ PASS | Provider selection, validation, SSE streaming all unchanged except additive `model` param. Error field format compatible with existing frontend parsing (`parsed.error` at line 210). |
| Error category mapping coherent with frontend | ✅ PASS | Backend `_safe_error_payload` returns `{"error": "...", "error_category": "..."}`. Frontend checks `parsed.error` (human-readable string) and displays it. `error_category` available for future programmatic use. HTTP 400 errors also use `err.error` pattern (lines 177-183). |
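The error-category row above relies on class-based dispatch to hardcoded strings. A minimal sketch of that pattern follows; the exception classes, categories, and message strings here are illustrative assumptions, not the project's actual mapping:

```python
class AuthenticationError(Exception):
    """Stand-in for a provider auth failure (assumed class, not the real LiteLLM type)."""


class RateLimitError(Exception):
    """Stand-in for a provider rate-limit failure (assumed class)."""


# Ordered dispatch list: first matching class wins; payloads are hardcoded, user-safe strings.
_ERROR_MESSAGES = [
    (AuthenticationError, ("auth", "Invalid or missing API key for this provider.")),
    (RateLimitError, ("rate_limit", "The provider is rate-limiting requests. Try again shortly.")),
]


def safe_error_payload(exc: Exception) -> dict:
    """Map an exception to a sanitized payload; never forwards exc's own message text."""
    for exc_class, (category, message) in _ERROR_MESSAGES:
        if isinstance(exc, exc_class):
            return {"error": message, "error_category": category}
    return {"error": "The AI provider returned an unexpected error.", "error_category": "unknown"}
```

Because the payload strings are constants, sensitive data carried in `exc.args` can never reach the SSE stream, which is the guardrail verified above.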
### Correctness checklist
- **Off-by-one**: N/A — no index arithmetic in changes.
- **Null/undefined dereference**: `selectedProviderEntry?.default_model ?? ''` and `|| $t(...)` — null-safe. Backend `model or provider_config["default_model"]` — None-safe.
- **Ignored errors**: `try/catch` in `loadModelPref`/`saveModelPref` returns safe defaults. Backend exception handler maps to user-facing messages.
- **Boolean logic**: Reactive guard `initializedModelProvider !== selectedProvider` correctly gates initialization vs save paths.
- **Async/await**: No new async code in frontend. Backend `model` param is synchronous extraction before async boundary.
- **Race conditions**: None introduced — `selectedModel` is single-threaded Svelte state.
- **Resource leaks**: None — localStorage access is synchronous and stateless.
- **Unsafe defaults**: Model defaults to provider's `default_model` when empty — safe.
- **Dead/unreachable branches**: Pre-existing `tool_iterations` (views.py:139-141, never incremented) — not introduced by this change.
- **Contract violations**: Function signature `stream_chat_completion(user, messages, provider, tools=None, model=None)` matches all call sites. `_is_model_override_compatible` return type is bool, used correctly in conditional.
- **Reactive loop risk**: Verified — `initializedModelProvider` sentinel prevents re-entry between Block 1 (load) and Block 2 (save). `saveModelPref` has no state mutations → no cascading reactivity.
### Findings
**CRITICAL**: (none)
**WARNINGS**: (none)
**SUGGESTIONS**:
1. `[AITravelChat.svelte:100-102]` Save-on-every-keystroke reactive block calls `saveModelPref` on each character typed. Consider debouncing or saving on blur/submit to reduce localStorage churn.
2. `[llm_client.py:107]` `getattr(exceptions, "NotFoundError", tuple())``isinstance(exc, ())` is always False by design (graceful fallback). A brief inline comment would clarify intent for future readers.
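The fallback pattern in suggestion 2 works because `isinstance(x, ())` is always False, so a missing exception class degrades to "never matches" instead of raising `AttributeError`. A standalone illustration (the `exceptions` namespace and class name are stand-ins):

```python
import types

# Stand-in for a third-party module that may or may not define NotFoundError.
exceptions = types.SimpleNamespace()

# If NotFoundError is absent, fall back to an empty tuple of classes.
not_found_cls = getattr(exceptions, "NotFoundError", tuple())


def is_not_found(exc: Exception) -> bool:
    # isinstance(exc, ()) is always False, so an absent class simply never matches.
    return isinstance(exc, not_found_cls)
```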
### Prior findings cross-check
- **Critic gate guardrails** (decisions.md:117-124): All 3 guardrails confirmed followed (sanitized errors, `bun install` prerequisite, WSGI migration out of scope).
- **`opencode_zen` default model**: Changed from `openai/gpt-4o-mini``openai/gpt-5-nano` as prescribed by researcher findings.
- **`api_base` catalog exposure** (decisions.md:103): Pre-existing, unchanged by this change.
- **`tool_iterations` dead guard** (decisions.md:91): Pre-existing, not affected by this change.
### Tracker states
- [x] Correctness goal 1: model selection end-to-end (PASS)
- [x] Correctness goal 2: per-provider persistence (PASS)
- [x] Correctness goal 3: model override to backend (PASS)
- [x] Correctness goal 4: no provider/send regressions (PASS)
- [x] Correctness goal 5: error mapping coherence (PASS)
---
## Tester verdict (standard + adversarial)
**STATUS**: PASS
**PASS**: Both (Standard + Adversarial)
**Date**: 2026-03-08
### Commands run
| Command | Result |
|---|---|
| `docker compose exec server python3 manage.py check` | PASS — 0 issues (1 silenced, expected) |
| `bun run check` (frontend) | PASS — 0 errors, 6 warnings (all pre-existing in `CollectionRecommendationView.svelte` + `RegionCard.svelte`, not in changed files) |
| `docker compose exec server python3 manage.py test --keepdb` | 30 tests found; pre-existing failures: 2 user tests (email field key error) + 4 geocoding tests (Google API mock) = 6 failures (matches documented "2/3 fail" baseline). No regressions. |
| Chat module static path validation (Django context) | PASS — all 5 targeted checks |
| `bun run build` | Vite compilation PASS (534 modules SSR, 728 client). EACCES error on `build/` dir is a pre-existing Docker worktree permission issue, not a compilation failure. |
### Targeted checks verified
- [x] `opencode_zen` default model is `openai/gpt-5-nano`**CONFIRMED**
- [x] `stream_chat_completion` accepts `model: str | None = None` parameter — **CONFIRMED**
- [x] Empty/whitespace/falsy `model` values in `views.py` produce `None` (falls back to provider default) — **CONFIRMED**
- [x] `_safe_error_payload` does NOT leak raw exception text, `api_base`, or sensitive data — **CONFIRMED** (all 6 LiteLLM exception classes mapped to sanitized hardcoded strings)
- [x] `_is_model_override_compatible` skips prefix check for `api_base` gateways — **CONFIRMED**
- [x] Standard providers reject cross-provider model prefixes — **CONFIRMED**
- [x] `is_chat_provider_available` rejects null, empty, and adversarial provider IDs — **CONFIRMED**
- [x] i18n keys `chat.model_label` and `chat.model_placeholder` present in `en.json`**CONFIRMED**
- [x] `tools`/`tool_choice` kwargs excluded from `completion_kwargs` when `tools` is falsy — **CONFIRMED**
### Adversarial attempts
| Hypothesis | Test | Expected failure signal | Observed result |
|---|---|---|---|
| 1. Pathological model strings (long/unicode/injection/null-byte) crash `_is_model_override_compatible` | 500-char model, unicode model, SQL injection model, null-byte model | Exception or incorrect behavior | PASS — no crashes, all return True/False correctly |
| 2. LiteLLM exception classes with sensitive data in `message` field leak via `_safe_error_payload` | All 6 LiteLLM exception classes instantiated with sensitive marker string | Sensitive data in SSE payload | PASS — all 6 classes return sanitized hardcoded payloads |
| 3. Empty/whitespace/falsy model string bypasses `None` conversion in `views.py` | `""`, `" "`, `None`, `False`, `0` passed to views.py extraction | Model sent as empty string to LiteLLM | PASS — all produce `None`, triggering default fallback |
| 4. All CHAT_PROVIDER_CONFIG providers have `default_model=None` (would cause `model=None` to LiteLLM) | Check each provider's `default_model` value | At least one None | PASS — all 9 providers have non-null `default_model` |
| 5. Unknown provider without slash in `default_model` causes unintended prefix extraction | Provider not in `PROVIDER_MODEL_PREFIX` + bare `default_model` | Cross-prefix model rejected | PASS — no expected_prefix extracted from bare default → pass-through |
| 6. Adversarial provider IDs (`__proto__`, null-byte, SQL injection, path traversal) bypass availability check | Injected strings to `is_chat_provider_available` | Available=True for injected ID | PASS — all rejected. Note: `openai\n` returns True because `strip()` normalizes to `openai` (correct, consistent with views.py normalization). |
| 7. `_merge_tool_call_delta` with `None`, empty list, missing `index` key | Edge case inputs | Crash or wrong accumulator state | PASS — None/empty are no-ops; missing index defaults to 0 |
| 7b. Large index (9999) to `_merge_tool_call_delta` causes DoS via huge list allocation | `index=9999` | Memory spike | NOTE — creates a 10000-entry accumulator (pre-existing behavior, not in scope) |
| 8. model fallback uses `and` instead of `or` | Verify `model or default` not `model and default` | Wrong model when set | PASS — `model or default` correctly preserves explicit model |
| 9. `tools=None` causes None kwargs to LiteLLM | Verify conditional exclusion | `tool_choice=None` in kwargs | PASS — `if tools:` guard correctly excludes both kwargs when None |
### Mutation checks
| Mutation | Critical logic | Detected by tests? |
|---|---|---|
| `_is_model_override_compatible`: `not model OR api_base``not model AND api_base` | Gateway bypass | DETECTED — test covers api_base set + model set case |
| `_merge_tool_call_delta`: `len(acc) <= idx``len(acc) < idx` | Off-by-one in accumulator growth | DETECTED — index=0 on empty list tested |
| `completion_kwargs["model"]`: `model or default``model and default` | Model fallback | DETECTED — both None and set-model cases tested |
| `is_chat_provider_available` negation | Provider validation gate | DETECTED — True and False cases both verified |
| `_safe_error_payload` exception dispatch order | Error sanitization | DETECTED — LiteLLM exception MRO verified, no problematic inheritance |
**MUTATION_ESCAPES: 0/5**
### Findings
**CRITICAL**: (none)
**WARNINGS** (pre-existing, not introduced by this change):
- `_merge_tool_call_delta` large index: no upper bound on accumulator size (pre-existing DoS surface; not in scope per critic gate)
- `tool_iterations` never incremented (pre-existing dead guard; not in scope)
**SUGGESTIONS** (carry-forward from reviewer):
1. Debounce `saveModelPref` on model input (every-keystroke localStorage writes)
2. Add clarifying comment on `getattr(exceptions, "NotFoundError", tuple())` fallback pattern
### Task tracker update
- [x] Standard validation run and targeted chat-path checks (Agent: tester) — PASS
- [x] Documentation and knowledge sync for provider troubleshooting notes (Agent: librarian) — COMPLETE
---
## Librarian coverage verdict
**STATUS**: COMPLETE
**Date**: 2026-03-08
### Files updated
| File | Changes | Reason |
|---|---|---|
| `README.md` | Added model selection, error handling, and `gpt-5-nano` default to AI Chat section | User-facing docs now reflect model override and error surfacing features |
| `docs/docs/usage/usage.md` | Added model override and error messaging to AI Travel Chat section | Usage guide now covers model input and error behavior |
| `.memory/knowledge.md` | Added 3 new sections: Chat Model Override Pattern, Sanitized LLM Error Mapping, OpenCode Zen Provider. Updated AI Chat section with model override + error mapping refs. Updated known issues baseline (0 errors/6 warnings, 6/30 test failures). | Canonical project knowledge now covers all new patterns for future sessions |
| `AGENTS.md` | Added model override + error surfacing to AI chat description and Key Patterns. Updated known issues baseline. | OpenCode instruction file synced |
| `CLAUDE.md` | Same changes as AGENTS.md (AI chat description, key patterns, known issues) | Claude Code instruction file synced |
| `.github/copilot-instructions.md` | Added model override + error surfacing to AI Chat description. Updated known issues + command output baselines. | Copilot instruction file synced |
| `.cursorrules` | Updated known issues baseline. Added chat model override + error surfacing conventions. | Cursor instruction file synced |
### Knowledge propagation
- **Inward merge**: No new knowledge found in instruction files that wasn't already in `.memory/`. All instruction files were behind `.memory/` state.
- **Outward sync**: All 4 instruction files updated with: (1) model override pattern, (2) sanitized error mapping, (3) `opencode_zen` default model `openai/gpt-5-nano`, (4) corrected known issues baseline.
- **Cross-references**: knowledge.md links to plan file for model selection details and to decisions.md for critic gate guardrail. New sections cross-reference each other (error mapping → decisions.md, model override → plan).
### Not updated (out of scope)
- `docs/architecture.md` — Stub file; model override is an implementation detail, not architectural. The chat app entry already exists.
- `docs/docs/guides/travel_agent.md` — MCP endpoint docs; unrelated to in-app chat model selection.
- `docs/docs/configuration/advanced_configuration.md` — Chat uses per-user API keys (no server-side env vars); no config changes to document.
### Task tracker
- [x] Documentation and knowledge sync for provider troubleshooting notes (Agent: librarian)

---
# Plan: Pre-release policy + .memory migration
## Scope
- Update project instruction files to treat Voyage as pre-release (no production compatibility constraints yet).
- Migrate `.memory/` to the standardized structure defined in AGENTS guidance.
## Tasks
- [x] Add pre-release policy guidance in instruction files (`AGENTS.md` + synced counterparts).
- **Acceptance**: Explicit statement that architecture-level changes (including replacing LiteLLM) are allowed in pre-release, with preference for correctness over backward compatibility.
- **Agent**: librarian
- **Note**: Added identical "Pre-Release Policy" section to all 4 instruction files (AGENTS.md, CLAUDE.md, .cursorrules, .github/copilot-instructions.md). Also updated `.memory Files` section in AGENTS.md, CLAUDE.md, .cursorrules to reference new nested structure.
- [x] Migrate `.memory/` to standard structure.
- **Acceptance**: standardized directories/files exist (`manifest.yaml`, `system.md`, `knowledge/*`, `plans/`, `research/`, `gates/`, `sessions/`), prior knowledge preserved/mapped, and manifest entries are updated.
- **Agent**: librarian
- **Note**: Decomposed `knowledge.md` (578 lines) into 7 nested files. Old `knowledge.md` marked DEPRECATED with pointers. Manifest updated with all new entries. Created `gates/`, `sessions/continuity.md`.
- [x] Validate migration quality.
- **Acceptance**: no broken references in migrated memory docs; concise migration note included in plan.
- **Agent**: librarian
- **Note**: Cross-references updated in decisions.md (knowledge.md -> knowledge/overview.md). All new files cross-link to decisions.md, plans/, and each other.
## Migration Map (old -> new)
| Old location | New location | Content |
|---|---|---|
| `knowledge.md` §Project Overview | `system.md` | One-paragraph project overview |
| `knowledge.md` §Architecture, §Services, §Auth, §Key File Locations | `knowledge/overview.md` | Architecture, API proxy, AI chat, services, auth, file locations |
| `knowledge.md` §Dev Commands, §Pre-Commit, §Environment, §Known Issues | `knowledge/tech-stack.md` | Stack, commands, env vars, known issues |
| `knowledge.md` §Key Patterns | `knowledge/conventions.md` | Frontend/backend coding patterns, workflow conventions |
| `knowledge.md` §Chat Model Override, §Error Mapping, §OpenCode Zen, §Agent Tools, §Backend Chat Endpoints, §WS4, §Context Derivation | `knowledge/patterns/chat-and-llm.md` | All chat/LLM implementation patterns |
| `knowledge.md` §Collection Sharing, §Itinerary, §User Preferences | `knowledge/domain/collections-and-sharing.md` | Collections domain knowledge |
| `knowledge.md` §WS1 Config, §Frontend Gaps | `knowledge/domain/ai-configuration.md` | AI configuration domain |
| (new) | `sessions/continuity.md` | Session continuity notes |
| (new) | `gates/.gitkeep` | Quality gates directory placeholder |
| `knowledge.md` | `knowledge.md` (DEPRECATED) | Deprecation notice with pointers to new locations |

---
# Plan: Travel Agent Context + Models Follow-up
## Scope
Address three follow-up issues in collection-level AI Travel Assistant:
1. Provider model dropdown only shows one option.
2. Chat context appears location-centric instead of full-trip/collection-centric.
3. Suggested prompts still assume a single location instead of itinerary-wide planning.
## Tasks
- [x] **F1 — Expand model options for OpenCode Zen provider**
- **Acceptance criteria**:
- Model dropdown offers multiple valid options for `opencode_zen` (not just one hardcoded value).
- Options are sourced in a maintainable way (backend-side).
- Selecting an option is sent through existing `model` override path.
- **Agent**: explorer → coder → reviewer → tester
- **Dependencies**: discovery of current `/api/chat/providers/{id}/models/` behavior.
- **Workstream**: `main` (follow-up bugfix set)
- **Implementation note (2026-03-09)**: Updated `ChatProviderCatalogViewSet.models()` in `backend/server/chat/views/__init__.py` to return a curated multi-model list for `opencode_zen` (OpenAI + Anthropic options), excluding `openai/o1-preview` and `openai/o1-mini` per critic guardrail.
- [x] **F2 — Correct chat context to reflect full trip/collection**
- **Acceptance criteria**:
- Assistant guidance/prompt context emphasizes full collection itinerary and date window.
- Tool calls for planning are grounded in trip-level context (not only one location label).
- No regression in existing collection-context fields.
- **Agent**: explorer → coder → reviewer → tester
- **Dependencies**: discovery of system prompt + tool context assembly.
- **Workstream**: `main`
- **Implementation note (2026-03-09)**: Updated frontend `deriveCollectionDestination()` to summarize unique itinerary stops (city/country-first with fallback names, compact cap), enriched backend `send_message()` trip context with collection-derived multi-stop itinerary data from `collection.locations`, and added explicit system prompt guidance to treat collection chats as trip-level and call `get_trip_details` before location search when additional context is needed.
- [x] **F3 — Make suggested prompts itinerary-centric**
- **Acceptance criteria**:
- Quick-action prompts no longer require/assume a single destination.
- Prompts read naturally for multi-city/multi-country collections.
- **Agent**: explorer → coder → reviewer → tester
- **Dependencies**: discovery of prompt rendering logic in `AITravelChat.svelte`.
- **Workstream**: `main`
- **Implementation note (2026-03-09)**: Updated `AITravelChat.svelte` quick-action guard to use `collectionName || destination` context and itinerary-focused wording for Restaurants/Activities prompts; fixed `search_places` tool result parsing by changing `.places` reads to backend-aligned `.results` in both `hasPlaceResults()` and `getPlaceResults()`, restoring place-card rendering and Add-to-Itinerary actions.
## Notes
- User-provided trace in `agent-interaction.txt` indicates location-heavy responses and a `{"error":"location is required"}` tool failure during itinerary add flow.
---
## Discovery Findings
### F1 — Model dropdown shows only one option
**Root cause**: `backend/server/chat/views/__init__.py` lines 417-418, `ChatProviderCatalogViewSet.models()`:
```python
if provider in ["opencode_zen"]:
return Response({"models": ["openai/gpt-5-nano"]})
```
The `opencode_zen` branch returns a single-element list. All other unmatched providers fall through to `return Response({"models": []})` (line 420).
**Frontend loading path** (`AITravelChat.svelte` lines 115-142, `loadModelsForProvider()`):
- `GET /api/chat/providers/{provider}/models/` → sets `availableModels = data.models`.
- When the list has exactly one item, the dropdown shows only that item (correct DaisyUI `<select>`, lines 599-613).
- `availableModels.length === 0` → shows a single "Default" option (line 607), so both the zero-model and one-model paths surface as a one-option dropdown.
**Also**: The `models` endpoint (lines 339-426) requires an API key and returns HTTP 403 if absent; the frontend silently sets `availableModels = []` on any non-OK response (lines 136-138) — so users without a key see "Default" only, regardless of provider.
**Edit point**:
- `backend/server/chat/views/__init__.py` lines 417-418: expand `opencode_zen` model list to include Zen-compatible models (e.g., `openai/gpt-5-nano`, `openai/gpt-4o-mini`, `openai/gpt-4o`, `anthropic/claude-3-5-haiku-20241022`).
- Optionally: `AITravelChat.svelte` `loadModelsForProvider()` — handle non-OK response more gracefully (log distinct error instead of silent fallback to empty).
---
### F2 — Context appears location-centric, not trip-centric
**Root cause — `destination` prop is a single derived location string**:
`frontend/src/routes/collections/[id]/+page.svelte` lines 259-278, `deriveCollectionDestination()`:
```ts
const firstLocation = current.locations.find(...)
return `${cityName}, ${countryName}` // first location only
```
Only the **first** location in `collection.locations` is used. Multi-city trips surface a single city/country string.
**How it propagates** (`+page.svelte` lines 1287-1294):
```svelte
<AITravelChat
destination={collectionDestination} // single-location string
...
/>
```
**Backend trip context** (`backend/server/chat/views/__init__.py` lines 144-168, `send_message`):
```python
context_parts = []
if collection_name: context_parts.append(f"Trip: {collection_name}")
if destination: context_parts.append(f"Destination: {destination}") # ← single string
if start_date and end_date: context_parts.append(f"Dates: ...")
system_prompt += "\n\n## Trip Context\n" + "\n".join(context_parts)
```
The `Destination:` line is a single string from the frontend — no multi-stop awareness. The `collection` object IS fetched from DB (lines 152-164) and passed to `get_system_prompt(user, collection)`, but `get_system_prompt` (`llm_client.py` lines 310-358) only uses `collection` to decide single-user vs. party preferences — it never reads collection locations, itinerary, or dates from the collection model itself.
**Edit points**:
1. `frontend/src/routes/collections/[id]/+page.svelte` `deriveCollectionDestination()` (lines 259-278): Change to derive a multi-location string (e.g., comma-joined list of unique city/country pairs, capped at 4-5) rather than first-only. Or rename to make clear it's itinerary-wide and return `undefined` when collection has many diverse destinations.
2. `backend/server/chat/views/__init__.py` `send_message()` (lines 144-168): Since `collection` is already fetched, enrich `context_parts` directly from `collection.locations` (unique cities/countries) rather than relying solely on the single-string `destination` param.
3. Optionally, `backend/server/chat/llm_client.py` `get_system_prompt()` (lines 310-358): When `collection` is not None, add a collection-derived section to the base prompt listing all itinerary destinations and dates from the collection object.
---
### F3 — Quick-action prompts assume a single destination
**Root cause — all destination-dependent prompts are gated on `destination` prop** (`AITravelChat.svelte` lines 766-804):
```svelte
{#if destination}
<button>🍽️ Restaurants in {destination}</button>
<button>🎯 Activities in {destination}</button>
{/if}
{#if startDate && endDate}
<button>🎒 Packing tips for {startDate} to {endDate}</button>
{/if}
<button>📅 Itinerary help</button> ← always shown, generic
```
The "Restaurants" and "Activities" buttons are hidden when no `destination` is derived (multi-city trip with no single dominant location), and their prompt strings hard-code `${destination}` — a single-city reference. They also don't reference the collection name or multi-stop nature.
**Edit points** (`AITravelChat.svelte` lines 766-804):
1. Replace `{#if destination}` guard for restaurant/activity buttons with a `{#if collectionName || destination}` guard.
2. Change prompt strings to use `collectionName` as primary context, falling back to `destination`:
- `What are the best restaurants for my trip to ${collectionName || destination}?`
- `What activities are there across my ${collectionName} itinerary?`
3. Add a "Budget" or "Transport" quick action that references the collection dates + itinerary scope (doesn't need `destination`).
4. The "📅 Itinerary help" button (lines 797-804) sends `'Can you help me plan a day-by-day itinerary for this trip?'` — already collection-neutral; no change needed.
5. Packing tip prompt (lines 788-795) already uses `startDate`/`endDate` without `destination` — this one is already correct.
---
### Cross-cutting risk: `destination` prop semantics are overloaded
The `destination` prop in `AITravelChat.svelte` is used for:
- Header subtitle display (line 582: removed in current code — subtitle block gone)
- Quick-action prompt strings (lines 771, 779)
- `send_message` payload (line 268: `destination`)
Changing `deriveCollectionDestination()` to return a multi-location string affects all three uses. The header display is currently suppressed (no `{destination}` in the HTML header block after WS4-F4 changes), so that's safe. The `send_message` backend receives it as the `Destination:` context line, which is acceptable for a multi-city string.
### No regression surface from `loadModelsForProvider` reactive trigger
The `$: if (selectedProvider) { void loadModelsForProvider(); }` reactive statement (lines 190-192) fires whenever `selectedProvider` changes. Expanding the `opencode_zen` model list won't affect other providers. The `loadModelPref`/`saveModelPref` localStorage path is independent of model list size.
### `add_to_itinerary` tool `location` required error (from Notes)
`search_places` tool (`agent_tools.py`) requires a `location` string param. When the LLM calls it with no location (because context only mentions a trip name, not a geocodable string), the tool returns `{"error": "location is required"}`. This is downstream of F2 — fixing the context so the LLM receives actual geocodable location strings will reduce these errors, but the tool itself should also be documented as requiring a geocodable string.
---
## Deep-Dive Findings (explorer pass 2 — 2026-03-09)
### F1: Exact line for single-model fix
`backend/server/chat/views/__init__.py` **lines 417-418**:
```python
if provider in ["opencode_zen"]:
return Response({"models": ["openai/gpt-5-nano"]})
```
Single-entry hard-coded list. No Zen API call is made. Expand to all Zen-compatible models.
**Recommended minimal list** (OpenAI-compatible pass-through documented for Zen):
```python
return Response({"models": [
"openai/gpt-5-nano",
"openai/gpt-4o-mini",
"openai/gpt-4o",
"openai/o1-preview",
"openai/o1-mini",
"anthropic/claude-sonnet-4-20250514",
"anthropic/claude-3-5-haiku-20241022",
]})
```
---
### F2: System prompt never injects collection locations into context
`backend/server/chat/views/__init__.py` lines **144-168** (`send_message`): `collection` is fetched from DB but only passed to `get_system_prompt()` for preference aggregation — its `.locations` queryset is never read to enrich context.
`backend/server/chat/llm_client.py` lines **310-358** (`get_system_prompt`): `collection` param only used for `shared_with` preference branch. Zero use of `collection.locations`, `.start_date`, `.end_date`, or `.itinerary_items`.
**Minimal fix — inject into context_parts in `send_message`**:
After line 164 (`collection = requested_collection`), add:
```python
if collection:
loc_names = list(collection.locations.values_list("name", flat=True)[:8])
if loc_names:
context_parts.append(f"Locations in this trip: {', '.join(loc_names)}")
```
Also strengthen the base system prompt in `llm_client.py` to instruct the model to call `get_trip_details` when operating in collection context before calling `search_places`.
---
### F3a: Frontend `hasPlaceResults` / `getPlaceResults` use wrong key `.places` — cards never render
**Critical bug**`AITravelChat.svelte`:
- **Line 377**: checks `(result.result as { places?: unknown[] }).places` — should be `results`
- **Line 386**: returns `(result.result as { places: any[] }).places` — should be `results`
Backend `search_places` (`agent_tools.py` lines 188-192) returns:
```python
return {"location": location_name, "category": category, "results": results}
```
The key is `results`, not `places`. Because `hasPlaceResults` always returns `false`, the "Add to Itinerary" button on place cards is **never rendered** for any real tool output. The `<pre>` JSON fallback block shows instead.
**Minimal fix**: change both `.places` references → `.results` in `AITravelChat.svelte` lines 377 and 386.
---
### F3b: `{"error": "location is required"}` origin
`backend/server/chat/agent_tools.py` **line 128**:
```python
if not location_name:
return {"error": "location is required"}
```
Triggered when LLM calls `search_places({})` with no `location` argument — which happens when the system prompt only contains a non-geocodable trip name (e.g., `Destination: Rome Trip 2025`) without actual city/place strings.
This error surfaces in the SSE stream → rendered as a tool result card with `{"error": "..."}` text.
**Fix**: Resolved by F2 (richer context); also improve guard message to be user-safe: `"Please provide a location or city name to search near."`.
---
### Summary of edit points
| Issue | File | Lines | Change |
|---|---|---|---|
| F1: expand opencode_zen models | `backend/server/chat/views/__init__.py` | 417-418 | Replace 1-item list with 7-item list |
| F2: inject collection locations | `backend/server/chat/views/__init__.py` | 144-168 | Add `loc_names` context_parts after line 164 |
| F2: reinforce system prompt | `backend/server/chat/llm_client.py` | 314-332 | Add guidance to use `get_trip_details` in collection context |
| F3a: fix `.places``.results` | `frontend/src/lib/components/AITravelChat.svelte` | 377, 386 | Two-char key rename |
| F3b: improve error guard | `backend/server/chat/agent_tools.py` | 128 | Better user-safe message (optional) |
---
## Critic Gate
- **Verdict**: APPROVED
- **Date**: 2026-03-09
- **Reviewer**: critic agent
### Assumption Challenges
1. **F2 `values_list("name")` may not produce geocodable strings**`Location.name` can be opaque (e.g., "Eiffel Tower"). Mitigated: plan already proposes system prompt guidance to call `get_trip_details` first. Enhancement: use `city__name`/`country__name` in addition to `name` for the injected context.
2. **F3a `.places` vs `.results` key mismatch** — confirmed real bug. `agent_tools.py` returns `results` key; frontend checks `places`. Place cards never render. Two-char fix validated.
### Execution Guardrails
1. **Sequencing**: F1 (independent) → F2 (context enrichment) → F3 (prompts + `.places` fix). F3 depends on F2's `deriveCollectionDestination` changes.
2. **F1 model list**: Exclude `openai/o1-preview` and `openai/o1-mini` — reasoning models may not support tool-use in streaming chat. Verify compatibility before including.
3. **F2 context injection**: Use `select_related('city', 'country')` or `values_list('name', 'city__name', 'country__name')` — bare `name` alone is insufficient for geocoding context.
4. **F3a is atomic**: The `.places``.results` fix addresses a standalone bug, separate from the prompt wording changes. It can be bundled in F3's review cycle.
5. **Quality pipeline**: Each fix gets reviewer + tester pass. No batch validation.
6. **Functional verification required**: (a) model dropdown shows multiple options, (b) chat context includes multi-city info, (c) quick-action prompts render for multi-location collections, (d) search result place cards actually render (F3a).
7. **Decomposition**: Single workstream appropriate — tightly coupled bugfixes in same component/view pair, not independent services.
---
## F1 Review
- **Verdict**: APPROVED (score 0)
- **Lens**: Correctness
- **Date**: 2026-03-09
- **Reviewer**: reviewer agent
**Scope**: `backend/server/chat/views/__init__.py` lines 417-428 — `opencode_zen` model list expanded from 1 to 5 entries.
**Findings**: No CRITICAL or WARNING issues. Change is minimal and correctly scoped.
**Verified**:
- Critic guardrail followed: `o1-preview` and `o1-mini` excluded (reasoning models, no streaming tool-use).
- All 5 model IDs use valid LiteLLM `provider/model` format; `anthropic/*` IDs match exact entries in Anthropic branch.
- `_is_model_override_compatible()` bypasses prefix check for `api_base` gateways — all IDs pass validation.
- No regression in other provider branches (openai, anthropic, gemini, groq, ollama) — all untouched.
- Frontend `loadModelsForProvider()` handles multi-item arrays correctly; dropdown will show all 5 options.
- localStorage model persistence unaffected by list size change.
**Suggestion**: Add inline comment on why o1-preview/o1-mini are excluded to prevent future re-addition.
**Reference**: See [Critic Gate](#critic-gate), [decisions.md](../decisions.md#critic-gate-travel-agent-context--models-follow-up)
---
## F1 Test
- **Verdict**: PASS (Standard + Adversarial)
- **Date**: 2026-03-09
- **Tester**: tester agent
### Commands run
| # | Command | Exit code | Output |
|---|---|---|---|
| 1 | `docker compose exec server python3 -m py_compile /code/chat/views/__init__.py` | 0 | (no output — syntax OK) |
| 2 | Inline `python3 -c` assertion of `opencode_zen` branch | 0 | count: 5, all 5 model IDs confirmed present, PASS |
| 3 | Adversarial: branch isolation for 8 non-`opencode_zen` providers | 0 | All return `[]`, ADVERSARIAL PASS |
| 4 | Adversarial: critic guardrail + LiteLLM format check | 0 | `o1-preview` / `o1-mini` absent; all IDs in `provider/model` format, PASS |
| 5 | `docker compose exec server python3 -c "import chat.views; ..."` | 0 | Module import OK, `ChatProviderCatalogViewSet.models` action present |
| 6 | `docker compose exec server python3 manage.py test --verbosity=1 --keepdb` | 1 (pre-existing) | 30 tests: 24 pass, 1 fail, 5 errors — identical to known baseline (2 user email key + 4 geocoding mock). **Zero new failures.** |
### Key findings
- `opencode_zen` branch now returns exactly 5 models: `openai/gpt-5-nano`, `openai/gpt-4o-mini`, `openai/gpt-4o`, `anthropic/claude-sonnet-4-20250514`, `anthropic/claude-3-5-haiku-20241022`.
- Critic guardrail respected: `openai/o1-preview` and `openai/o1-mini` absent from list.
- All model IDs use valid `provider/model` format compatible with LiteLLM routing.
- No other provider branches affected.
- No regression in full Django test suite beyond pre-existing baseline.
### Adversarial attempts
- **Case insensitive match (`OPENCODE_ZEN`)**: does not match branch → returns `[]` (correct; exact case match required).
- **Partial match (`opencode_zen_extra`)**: does not match → returns `[]` (correct; no prefix leakage).
- **Empty string provider `""`**: returns `[]` (correct).
- **`openai/o1-preview` inclusion check**: absent from list (critic guardrail upheld).
- **`openai/o1-mini` inclusion check**: absent from list (critic guardrail upheld).
### MUTATION_ESCAPES: 0/4
All critical branch mutations checked: wrong provider name, case variation, extra-suffix variation, empty string — all correctly return `[]`. The 5-model list is hard-coded so count drift would be immediately caught by assertion.
### LESSON_CHECKS
- Pre-existing test failures (2 user + 4 geocoding) — **confirmed**, baseline unchanged.
---
## F2 Review
- **Verdict**: APPROVED (score 0)
- **Lens**: Correctness
- **Date**: 2026-03-09
- **Reviewer**: reviewer agent
**Scope**: F2 — Correct chat context to reflect full trip/collection. Three files changed:
- `frontend/src/routes/collections/[id]/+page.svelte` (lines 259-300): `deriveCollectionDestination()` rewritten from first-location-only to multi-stop itinerary summary.
- `backend/server/chat/views/__init__.py` (lines 166-199): `send_message()` enriched with collection-derived `Itinerary stops:` context from `collection.locations`.
- `backend/server/chat/llm_client.py` (lines 333-336): System prompt updated with trip-level reasoning guidance and `get_trip_details`-first instruction.
**Acceptance criteria verified**:
1. ✅ Frontend derives multi-stop destination string (unique city/country pairs, capped at 4, semicolon-joined, `+N more` overflow).
2. ✅ Backend enriches system prompt with `Itinerary stops:` from collection locations (up to 8, `select_related('city', 'country')` for efficiency).
3. ✅ System prompt instructs trip-level reasoning and `get_trip_details`-first behavior (tool confirmed to exist in `agent_tools.py`).
4. ✅ No regression: non-collection chats, single-location collections, and empty-location collections all handled correctly via guard conditions.
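The enrichment rules verified above can be captured as a pure function. This is a minimal sketch under stated assumptions: `build_itinerary_context` and its tuple input format are illustrative, not the actual view code, which reads from a Django queryset via `values_list`.

```python
def build_itinerary_context(locations, cap=8):
    """Derive the deduplicated, capped 'Itinerary stops:' line from
    (name, city, country) tuples, mirroring the F2 enrichment rules."""
    seen, stops = set(), []
    for name, city, country in locations:
        # Prefer "City, Country"; fall back to country alone, then the raw name.
        parts = [p.strip() for p in (city or "", country or "") if p and p.strip()]
        stop = ", ".join(parts) or (name or "").strip()
        if not stop or stop in seen:
            continue  # skip blanks and duplicates
        seen.add(stop)
        stops.append(stop)
        if len(stops) == cap:
            break
    return f"Itinerary stops: {'; '.join(stops)}" if stops else None

print(build_itinerary_context([
    ("Trevi Fountain", "Rome", "Italy"),
    ("Duomo", "Florence", "Italy"),
    ("Eiffel Tower", "", ""),
]))  # → Itinerary stops: Rome, Italy; Florence, Italy; Eiffel Tower
```

Returning `None` when no stop survives matches the guard that suppresses the `Itinerary stops:` line entirely for empty or all-blank collections.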
**Findings**: No CRITICAL or WARNING issues. Two minor suggestions (dead guard on line 274 of `+page.svelte`; undocumented cap constant in `views/__init__.py` line 195).
**Prior guidance**: Critic gate recommendation to use `select_related('city', 'country')` and city/country names — confirmed followed.
**Reference**: See [Critic Gate](#critic-gate), [F1 Review](#f1-review)
---
## F2 Test
- **Verdict**: PASS (Standard + Adversarial)
- **Date**: 2026-03-09
- **Tester**: tester agent
### Commands run
| # | Command | Exit code | Output summary |
|---|---|---|---|
| 1 | `bun run check` (frontend) | 0 | 0 errors, 6 warnings — all 6 are pre-existing in `CollectionRecommendationView.svelte` + `RegionCard.svelte`; no new issues from F2 changes |
| 2 | `docker compose exec server python3 -m py_compile /code/chat/views/__init__.py` | 0 | Syntax OK |
| 3 | `docker compose exec server python3 -m py_compile /code/chat/llm_client.py` | 0 | Syntax OK |
| 4 | Backend functional enrichment test (mock collection, 6 inputs → 5 unique stops) | 0 | `Itinerary stops: Rome, Italy; Florence, Italy; Venice, Italy; Switzerland; Eiffel Tower` — multi-stop line confirmed |
| 5 | Adversarial backend: 7 cases (cap-8, empty, all-blank, whitespace, unicode, dedup-12, None city) | 0 | All 7 PASS |
| 6 | Frontend JS adversarial: 7 cases (multi-stop, single, null, empty, overflow +N, fallback, all-blank) | 0 | All 7 PASS |
| 7 | System prompt phrase check | 0 | `itinerary-wide` + `get_trip_details` + `Treat context as itinerary-wide` all confirmed present |
| 8 | `docker compose exec server python3 manage.py test --verbosity=1 --keepdb` | 1 (pre-existing) | 30 tests: 24 pass, 1 fail, 5 errors — **identical to known baseline**; zero new failures |
### Acceptance criteria verdict
| Criterion | Result | Evidence |
|---|---|---|
| Multi-stop destination string derived in frontend | ✅ PASS | JS test: 3-city collection → `Rome, Italy; Florence, Italy; Venice, Italy`; 6-city → `A, X; B, X; C, X; D, X; +2 more` |
| Backend injects `Itinerary stops:` from `collection.locations` | ✅ PASS | Python test: 6 inputs → 5 unique stops joined with `; `, correctly prefixed `Itinerary stops:` |
| System prompt has trip-level + `get_trip_details`-first guidance | ✅ PASS | `get_system_prompt()` output contains `itinerary-wide`, `get_trip_details first`, `Treat context as itinerary-wide` |
| No regression in existing fields | ✅ PASS | Django test suite unchanged at baseline (24 pass, 6 pre-existing fail/error) |
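The derivation rules exercised by these criteria can be mirrored in Python for illustration (the real implementation is the Svelte `deriveCollectionDestination()`; `None` here stands in for the frontend's `undefined`):

```python
def derive_collection_destination(locations, max_stops=4):
    """Python mirror of the Svelte rules: unique 'City, Country' stops,
    capped at max_stops, '; '-joined, with a '+N more' overflow suffix."""
    if not locations:
        return None
    seen, stops = set(), []
    for loc in locations:
        city = (loc.get("city") or "").strip()
        country = (loc.get("country") or "").strip()
        # Fall back to the raw location name when geo fields are blank.
        stop = ", ".join(p for p in (city, country) if p) or (loc.get("name") or "").strip()
        if stop and stop not in seen:
            seen.add(stop)
            stops.append(stop)
    if not stops:
        return None
    extra = len(stops) - max_stops
    head = "; ".join(stops[:max_stops])
    return f"{head}; +{extra} more" if extra > 0 else head
```

For example, six unique cities in country `X` yield `A, X; B, X; C, X; D, X; +2 more`, matching the evidence row above.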
### Adversarial attempts
| Hypothesis | Test | Expected failure signal | Observed |
|---|---|---|---|
| 12-city collection exceeds cap | Supply 12 unique cities | >8 stops returned | Capped at exactly 8 ✅ |
| Empty `locations` list | Pass `locations=[]` | Crash or non-empty result | Returns `undefined`/`[]` cleanly ✅ |
| All-blank location entries | All city/country/name empty or whitespace | Non-empty or crash | All skipped, returns `undefined`/`[]` ✅ |
| Whitespace-only city/country | `city.name=' '` with valid fallback | Whitespace treated as valid | Strip applied, fallback used ✅ |
| Unicode city names | `東京`, `Zürich`, `São Paulo` | Encoding corruption or skip | All 3 preserved correctly ✅ |
| 12 duplicate identical entries | Same city×12 | Multiple copies in output | Deduped to exactly 1 ✅ |
| `city.name = None` (DB null) | `None` city name, valid country | `AttributeError` or crash | Handled via `or ''` guard, country used ✅ |
| `null` collection passed to frontend func | `deriveCollectionDestination(null)` | Crash | Returns `undefined` cleanly ✅ |
| Overflow suffix formatting | 6 unique stops, maxStops=4 | Wrong suffix or missing | `+2 more` suffix correct ✅ |
| Fallback name path | No city/country, `location='Eiffel Tower'` | Missing or wrong label | `Eiffel Tower` used ✅ |
### MUTATION_ESCAPES: 0/6
Mutation checks applied:
1. `>= 8` cap mutated to `> 8` → A1 test (12-city produces 8, not 9) would catch.
2. `seen_stops` dedup check mutated to always-false → A6 test (12-dupes) would catch.
3. `or ''` null-guard on `city.name` removed → A7 test would catch `AttributeError`.
4. `if not fallback_name: continue` removed → A3 test (all-blank) would catch spurious entries.
5. `stops.slice(0, maxStops).join('; ')` separator mutated to `', '` → Multi-stop tests check for `'; '` as separator.
6. `return undefined` on empty guard mutated to `return ''` → A4 empty-locations test checks `=== undefined`.
All 6 mutations would be caught by existing test cases.
### LESSON_CHECKS
- Pre-existing test failures (2 user email key + 4 geocoding mock) — **confirmed**, baseline unchanged.
- F2 context enrichment using `select_related('city', 'country')` per critic guardrail — **confirmed** (lines 169-171 of views/__init__.py).
- Fallback to `location`/`name` fields when geo data absent — **confirmed** working via A4/A5 tests.
**Reference**: See [F2 Review](#f2-review), [Critic Gate](#critic-gate)
---
## F3 Review
- **Verdict**: APPROVED (score 0)
- **Lens**: Correctness
- **Date**: 2026-03-09
- **Reviewer**: reviewer agent
**Scope**: Targeted re-review of two F3 findings in `frontend/src/lib/components/AITravelChat.svelte`:
1. `.places``.results` key mismatch in `hasPlaceResults()` / `getPlaceResults()`
2. Quick-action prompt guard and wording — location-centric → itinerary-centric
**Finding 1 — `.places` → `.results` (RESOLVED)**:
- `hasPlaceResults()` (line 378): checks `(result.result as { results?: unknown[] }).results`
- `getPlaceResults()` (line 387): returns `(result.result as { results: any[] }).results`
- Cross-verified against backend `agent_tools.py:188-191`: `return {"location": ..., "category": ..., "results": results}` — keys match.
**Finding 2 — Itinerary-centric prompts (RESOLVED)**:
- New reactive `promptTripContext` (line 72): `collectionName || destination || ''` — prefers collection name over single destination.
- Guard changed from `{#if destination}``{#if promptTripContext}` (line 768) — buttons now visible for named collections even without a single derived destination.
- Prompt strings use `across my ${promptTripContext} itinerary?` wording (lines 773, 783) — no longer implies single location.
- No impact on packing tips (still `startDate && endDate` gated) or itinerary help (always shown).
**No introduced issues**: `promptTripContext` always resolves to string; template interpolation safe; existing tool result rendering and `sendMessage()` logic unchanged beyond the key rename.
**SUGGESTIONS**: Minor indentation inconsistency between `{#if promptTripContext}` block (lines 768-789) and adjacent `{#if startDate}` block (lines 790-801) — cosmetic, `bun run format` should normalize.
**Reference**: See [Critic Gate](#critic-gate), [F2 Review](#f2-review), [decisions.md](../decisions.md#critic-gate-travel-agent-context--models-follow-up)
---
## F3 Test
- **Verdict**: PASS (Standard + Adversarial)
- **Date**: 2026-03-09
- **Tester**: tester agent
### Commands run
| # | Command | Exit code | Output summary |
|---|---|---|---|
| 1 | `bun run check` (frontend) | 0 | 0 errors, 6 warnings — all 6 pre-existing in `CollectionRecommendationView.svelte` + `RegionCard.svelte`; zero new issues from F3 changes |
| 2 | `bun run f3_test.mjs` (functional simulation) | 0 | 20 assertions: S1-S6 standard + A1-A6 adversarial + PTC1-PTC4 promptTripContext + prompt wording — ALL PASSED |
### Acceptance criteria verdict
| Criterion | Result | Evidence |
|---|---|---|
| `.places``.results` key fix in `hasPlaceResults()` | ✅ PASS | S1: `{results:[...]}` → true; S2: `{places:[...]}` → false (old key correctly rejected) |
| `.places``.results` key fix in `getPlaceResults()` | ✅ PASS | S1: returns 2-item array from `.results`; S2: returns `[]` on `.places` key |
| Old `.places` key no longer triggers card rendering | ✅ PASS | S2 regression guard: `hasPlaceResults({places:[...]})` → false |
| `promptTripContext` = `collectionName \|\| destination \|\| ''` | ✅ PASS | PTC1-PTC4: collectionName wins; falls back to destination; empty string when both absent |
| Quick-action guard is `{#if promptTripContext}` | ✅ PASS | Source inspection confirmed line 768 uses `promptTripContext` |
| Prompt wording is itinerary-centric | ✅ PASS | Both prompts contain `itinerary`; neither uses single-location "in X" wording |
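The guard semantics these criteria exercise can be sketched in Python (illustrative only; the real guards are the TypeScript `hasPlaceResults()`/`getPlaceResults()`, and `get_place_results` here merges both checks into one helper):

```python
def get_place_results(tool_result):
    """Return the place list only for a well-formed search_places result:
    the payload must be a dict whose 'results' key holds a list."""
    if tool_result.get("name") != "search_places":
        return []
    payload = tool_result.get("result")
    if not isinstance(payload, dict):
        return []  # rejects numbers, strings, None
    results = payload.get("results")  # the fixed key; '.places' is obsolete
    return results if isinstance(results, list) else []
```

The `isinstance` checks are what reject the adversarial payloads below (string `results`, `null` `results`, numeric `result`).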
### Adversarial attempts
| Hypothesis | Test design | Expected failure signal | Observed |
|---|---|---|---|
| `results` is a string, not array | `result: { results: 'not-array' }` | `Array.isArray` fails → false | false ✅ |
| `results` is null | `result: { results: null }` | `Array.isArray(null)` false | false ✅ |
| `result.result` is a number | `result: 42` | typeof guard rejects | false ✅ |
| `result.result` is a string | `result: 'str'` | typeof guard rejects | false ✅ |
| Both `.places` and `.results` present | both keys in result | Must use `.results` | `getPlaceResults` returns `.results` item ✅ |
| `results` is an object `{foo:'bar'}` | not an array | `Array.isArray` false | false ✅ |
| `promptTripContext` with empty collectionName string | `'' \|\| 'London' \|\| ''` | Should fall through to destination | 'London' ✅ |
### MUTATION_ESCAPES: 0/5
Mutation checks applied:
1. `result.result !== null` guard removed → S5 (null result) would crash `Array.isArray(null.results)` and be caught.
2. `Array.isArray(...)` replaced with truthy check → A1 (string results) test would catch.
3. `result.name === 'search_places'` removed → S4 (wrong tool name) would catch.
4. `.results` key swapped back to `.places` → S1 (standard payload) would return empty array, caught.
5. `collectionName || destination` order swapped → PTC1 test would return wrong value, caught.
All 5 mutations would be caught by existing assertions.
### LESSON_CHECKS
- `.places` vs `.results` key mismatch (F3a critical bug from discovery) — **confirmed fixed**: S1 passes with `.results`; S2 regression guard confirms `.places` no longer triggers card rendering.
- Pre-existing 6 svelte-check warnings — **confirmed**, no new warnings introduced.
---
## Completion Summary
- **Status**: ALL COMPLETE (F1 + F2 + F3)
- **Date**: 2026-03-09
- **All tasks**: Implemented, reviewed (APPROVED score 0), and tested (PASS standard + adversarial)
- **Zero regressions**: Frontend 0 errors / 6 pre-existing warnings; backend 24/30 pass (6 pre-existing failures)
- **Files changed**:
- `backend/server/chat/views/__init__.py` — F1 (model list expansion) + F2 (itinerary stops context injection)
- `backend/server/chat/llm_client.py` — F2 (system prompt trip-level guidance)
- `frontend/src/routes/collections/[id]/+page.svelte` — F2 (multi-stop `deriveCollectionDestination`)
- `frontend/src/lib/components/AITravelChat.svelte` — F3 (itinerary-centric prompts + `.results` key fix)
- **Knowledge recorded**: [knowledge.md](../knowledge.md#multi-stop-context-derivation-f2-follow-up) (multi-stop context, quick prompts, search_places key convention, opencode_zen model list)
- **Decisions recorded**: [decisions.md](../decisions.md#critic-gate-travel-agent-context--models-follow-up) (critic gate)
- **AGENTS.md updated**: Chat model override pattern (dropdown) + chat context pattern added
---
## Discovery: runtime failures (2026-03-09)
Explorer investigation of three user-trace errors against the complete scoped file set.
### Error 1 — "The model provider rate limit was reached"
**Exact origin**: `backend/server/chat/llm_client.py` **lines 128-132** (`_safe_error_payload`):
```python
if isinstance(exc, rate_limit_cls):
return {
"error": "The model provider rate limit was reached. Please wait and try again.",
"error_category": "rate_limited",
}
```
The user-trace text `"model provider rate limit was reached"` is a substring of this exact message. This is **not a bug** — it is the intended sanitized error surface for `litellm.exceptions.RateLimitError`. The error is raised by LiteLLM when the upstream provider (OpenAI, Anthropic, etc.) returns HTTP 429, and `_safe_error_payload()` converts it to this user-safe string. The SSE error payload is then propagated through `stream_chat_completion` (line 457) → `event_stream()` in `send_message` (line 256: `if data.get("error"): encountered_error = True; break`) → yielded to frontend → frontend SSE loop sets `assistantMsg.content = parsed.error` (line 307 of `AITravelChat.svelte`).
**Root cause of rate limiting itself**: Most likely `openai/gpt-5-nano` as the `opencode_zen` default model, or the user's provider hitting quota. No code fix required — this is provider-side throttling surfaced correctly. However, if the `opencode_zen` provider is being mistakenly routed to OpenAI's public endpoint instead of `https://opencode.ai/zen/v1`, it would exhaust a real OpenAI key rather than Zen. See Risk 1 below.
**No auth/session issue involved** — the error path reaches LiteLLM, meaning auth already succeeded up to the LLM call.
---
### Error 2 — `{"error":"location is required"}`
**Exact origin**: `backend/server/chat/agent_tools.py` **line 128**:
```python
if not location_name:
return {"error": "location is required"}
```
Triggered when LLM calls `search_places({})` or `search_places({"category": "food"})` with no `location` argument. This happens when the system prompt's trip context does not give the model a geocodable string — the model knows a "trip name" but not a city/country, so it calls `search_places` without a location.
**Current state (post-F2)**: The F2 fix injects `"Itinerary stops: Rome, Italy; ..."` into the system prompt from `collection.locations` **only when `collection_id` is supplied and resolves to an authorized collection**. If `collection_id` is missing from the frontend payload OR if the collection has locations with no `city`/`country` FK and no `location`/`name` fallback, the context_parts will still have only the `destination` string.
**Residual trigger path** (still reachable after F2):
- `collection_id` not sent in `send_message` payload → collection never fetched → `context_parts` has only `Destination: <multi-stop string>` → LLM picks a trip-name string like "Italy 2025" as its location arg → `search_places(location="Italy 2025")` succeeds (geocoding finds "Italy") OR model sends `search_places({})` → error returned.
- OR: `collection_id` IS sent, all locations have no `city`/`country` AND `location` field is blank AND `name` is not geocodable (e.g., `"Hotel California"`) → `itinerary_stops` list is empty → no `Itinerary stops:` line injected.
**Second remaining trigger**: `get_trip_details` fails (Collection.DoesNotExist or exception) → returns `{"error": "An unexpected error occurred while fetching trip details"}` → model falls back to calling `search_places` without a location derived from context.
---
### Error 3 — `{"error":"An unexpected error occurred while fetching trip details"}`
**Exact origin**: `backend/server/chat/agent_tools.py` **lines 394-396** (`get_trip_details`):
```python
except Exception:
logger.exception("get_trip_details failed")
return {"error": "An unexpected error occurred while fetching trip details"}
```
**Root cause — `get_trip_details` uses owner-only filter**: `agent_tools.py` **line 317**:
```python
collection = (
Collection.objects.filter(user=user)
...
.get(id=collection_id)
)
```
This uses `filter(user=user)`**shared collections are excluded**. If the logged-in user is a shared member (not the owner) of the collection, `Collection.DoesNotExist` is raised, falls to the outer `except Exception`, and returns the generic error. However, `Collection.DoesNotExist` is caught specifically on **line 392** and returns `{"error": "Trip not found"}`, not the generic message. So the generic error can only come from a genuine Python exception inside the try block — most likely:
1. **`item.item` AttributeError** — `CollectionItineraryItem` uses a `GenericForeignKey`; if the referenced object has been deleted, `item.item` returns `None` and `getattr(None, "name", "")` would return `""` (safe, not an error) — so this is not the cause.
2. **`collection.itinerary_items` reverse relation** — if the `related_name="itinerary_items"` is not defined on `CollectionItineraryItem.collection` FK, the queryset call raises `AttributeError`. Checking `adventures/models.py` line 716: `related_name="itinerary_items"` is present — so this is not the cause.
3. **`collection.transportation_set` / `collection.lodging_set`** — if `Transportation` or `Lodging` doesn't have `related_name` defaulting to `transportation_set`/`lodging_set`, these would fail. This is the **most likely cause** — Django only auto-creates `_set` accessors with the model name in lowercase; `transportation_set` requires that the FK `related_name` is either set or left as default `transportation_set`. Need to verify model definition.
4. **`collection.start_date.isoformat()` on None** — guarded by `if collection.start_date` (line 347) — safe.
**Verified**: `Transportation.collection` (`models.py:332`) and `Lodging.collection` (`models.py:570`) are both ForeignKeys with **no `related_name`**, so Django auto-assigns `transportation_set` and `lodging_set` — the accessors used in `get_trip_details` lines 375/382 are correct. These do NOT cause the error.
**Actual culprit**: The `except Exception` at line 394 catches everything. Any unhandled exception inside the try block (e.g., a `prefetch_related("itinerary_items__content_type")` failure if a content_type row is missing, or a `date` field deserialization error on a malformed DB record) results in the generic error. Most commonly, the issue is the **shared-user access gap**: `Collection.objects.filter(user=user).get(id=...)` raises `Collection.DoesNotExist` for shared users, but that is caught by the specific handler at line 392 as `{"error": "Trip not found"}`, NOT the generic message. The generic message therefore indicates a true runtime Python exception somewhere inside the try body.
**Additionally**: the shared-collection access gap means `get_trip_details` returns `{"error": "Trip not found"}` (not the generic error) for shared users — this is a separate functional bug where shared users cannot use the AI tool on their shared trips.
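A minimal sketch of the access rule the `| Q(shared_with=user)` fix would encode, using stub dicts rather than the ORM:

```python
# ORM equivalent of the proposed fix at agent_tools.py:317:
#   Collection.objects.filter(Q(user=user) | Q(shared_with=user)).distinct()
def user_can_access(collection, user):
    """Owners OR shared members may call get_trip_details."""
    return collection["owner"] == user or user in collection["shared_with"]

trip = {"owner": "alice", "shared_with": {"bob"}}
assert user_can_access(trip, "alice")      # owner: allowed today
assert user_can_access(trip, "bob")        # shared member: allowed after the fix
assert not user_can_access(trip, "carol")  # stranger: still denied
```

The `.distinct()` in the ORM form guards against duplicate rows when a user is both owner and shared member.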
---
### Authentication / CSRF in Chat Calls
**Verdict: Auth is working correctly for the SSE path. No auth failure in the reported errors.**
Evidence:
1. **Proxy path** (`frontend/src/routes/api/[...path]/+server.ts`):
- `POST` to `send_message` goes through `handleRequest()` (line 16) with `requreTrailingSlash=true`.
- On every proxied request: proxy deletes old `csrftoken` cookie, calls `fetchCSRFToken()` to get a fresh token from `GET /csrf/`, then sets `X-CSRFToken` header and reconstructs the `Cookie` header with `csrftoken=<new>; sessionid=<from-browser>` (lines 5775).
- SSE streaming: `content-type: text/event-stream` is detected (line 94) and the response body is streamed directly without buffering.
2. **Session**: `sessionid` cookie is extracted from browser cookies (line 66) and forwarded. `SESSION_COOKIE_SAMESITE=Lax` allows this.
3. **Rate-limit error is downstream of auth** — LiteLLM only fires if the Django view already authenticated the user and reached `stream_chat_completion`. A CSRF or session failure would return HTTP 403/401 before the SSE stream starts, and the frontend would hit the `if (!res.ok)` branch (line 273), not the SSE error path.
**One auth-adjacent gap**: `loadConversations()` (line 196) and `createConversation()` (line 203) do NOT include `credentials: 'include'` — but these go through the SvelteKit proxy which handles session injection server-side, so this is not a real failure point. The `send_message` fetch (line 258) also lacks explicit `credentials`, but again routes through the proxy.
**Potential auth issue (ruled out) — trailing slash for models endpoint**:
`loadModelsForProvider()` fetches `/api/chat/providers/${selectedProvider}/models/` (line 124), which already ends with `/` and so satisfies the proxy's `requreTrailingSlash` logic. The proxy only appends a trailing slash for non-GET requests (POST/PATCH/PUT/DELETE, not GET), so a slash-less GET would bypass that normalization — but since `models/` is already in the URL, this is not a failure point.
---
### Ranked Fixes by Impact
| Rank | Error | File | Line(s) | Fix |
|---|---|---|---|---|
| 1 (HIGH) | `get_trip_details` generic error | `backend/server/chat/agent_tools.py` | 316-325 | Add `\| Q(shared_with=user)` to the collection filter so shared users can call the tool; also add specific catches for known exception types before the bare `except Exception` |
| 2 (HIGH) | `{"error":"location is required"}` residual | `backend/server/chat/views/__init__.py` | 152-164 | The `collection_id` auth check already grants shared users access (`shared_with.filter(id=request.user.id).exists()` is present — ✅ correct); remaining work is to verify `collection_id` is actually sent from the frontend on every `sendMessage` call |
| 2b (MEDIUM) | `search_places` called without location | `backend/server/chat/agent_tools.py` | 127-128 | Improve the error message to be user-instructional: `"Please provide a city or location name to search near."` (already noted in the prior plan); also mark `location` as `required` in the JSON schema so the LLM is more likely to provide it |
| 3 (MEDIUM) | `transportation_set`/`lodging_set` crash | `backend/server/chat/agent_tools.py` | 370-387 | Reverse accessors verified correct (see Risks below); no code change needed, but keep as a watch item for other exception paths inside `get_trip_details` |
| 4 (LOW) | Rate limiting | Provider config | N/A | No code fix — operational issue. Document that `opencode_zen` uses `https://opencode.ai/zen/v1` as `api_base` (already set in `CHAT_PROVIDER_CONFIG`) — ensure users aren't accidentally using a real OpenAI key with `opencode_zen` provider |
---
### Risks
1. **`get_trip_details` shared-user gap**: Shared users get `{"error": "Trip not found"}` — the LLM may then call `search_places` without the location context that `get_trip_details` would have provided, cascading into Error 2. Fix: add `| Q(shared_with=user)` to the collection filter at `agent_tools.py:317`.
2. **`transportation_set`/`lodging_set` reverse accessor names confirmed safe**: Django auto-generates `transportation_set` and `lodging_set` for the FKs (no `related_name` on `Transportation.collection` at `models.py:332` or `Lodging.collection` at `models.py:570`). These accessors work correctly. The generic error in `get_trip_details` must be from another exception path (e.g., malformed DB records, missing ContentType rows for deleted itinerary items, or the `prefetch_related` interaction on orphaned GFK references).
3. **`collection_id` not forwarded on all sends**: If `AITravelChat.svelte` is embedded without `collectionId` prop (e.g., standalone chat page), `collection_id` is `undefined` in the payload, the backend never fetches the collection, and no `Itinerary stops:` context is injected. The LLM then has no geocodable location data → calls `search_places` without `location`.
4. **`search_places` JSON schema marks `location` as required but `execute_tool` uses `filtered_kwargs`**: The tool schema (`agent_tools.py:103`) sets `"required": True` on `location`. However, `execute_tool` (line 619) passes only `filtered_kwargs` from the JSON-parsed `arguments` dict. If LLM sends `{}` (empty), `location=None` is the function default, not a schema-enforcement error. There is no server-side validation of required tool arguments — the required flag is only advisory to the LLM.
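Risk 4 could be closed with a small server-side check before dispatch. A sketch, assuming the per-field `"required": True` schema shape described above; `validate_tool_args` is a hypothetical helper, not existing code:

```python
def validate_tool_args(schema, kwargs):
    """Return a user-safe error dict if any schema-required argument
    is missing or blank; None if the tool call may proceed."""
    required = [
        name for name, spec in schema.get("parameters", {}).items()
        if spec.get("required")
    ]
    # Treat None and whitespace-only values as missing.
    missing = [n for n in required if not str(kwargs.get(n) or "").strip()]
    if missing:
        return {"error": f"Missing required argument(s): {', '.join(missing)}"}
    return None

schema = {"parameters": {"location": {"type": "string", "required": True},
                         "category": {"type": "string"}}}
print(validate_tool_args(schema, {"category": "food"}))
# → {'error': 'Missing required argument(s): location'}
```

Calling this in `execute_tool` before dispatching `filtered_kwargs` would turn the advisory `required` flag into an enforced contract.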
**See [decisions.md](../decisions.md) for critic gate context.**
---
## Research: Provider Strategy (2026-03-09)
**Full findings**: [research/provider-strategy.md](../research/provider-strategy.md)
### Verdict: Keep LiteLLM, Harden It
Replacing LiteLLM is not warranted. Every Voyage issue lives in the integration layer (no retries, no capability checks, hardcoded models), not in LiteLLM itself. Nor is OpenCode's stack a reason to switch: OpenCode uses the Vercel AI SDK with ~20 bundled `@ai-sdk/*` provider packages, the TypeScript analogue of LiteLLM, so the Python equivalent of OpenCode's approach is LiteLLM itself.
### Architecture Options
| Option | Effort | Risk | Recommended? |
|---|---|---|---|
| **A. Keep LiteLLM, harden** (retry, tool-guard, metadata) | Low (1-2 sessions) | Low | ✅ YES |
| B. Hybrid: direct SDK for some providers | High (1-2 weeks) | High | No |
| C. Replace LiteLLM entirely | Very High (3-4 weeks) | Very High | No |
| D. LiteLLM Proxy sidecar | Medium (2-3 days) | Medium | Not yet — future multi-user |
### Immediate Code Fixes (4 items)
| # | Fix | File | Line(s) | Impact |
|---|---|---|---|---|
| 1 | Add `num_retries=2, request_timeout=60` to `litellm.acompletion()` | `llm_client.py` | 418 | Retry on rate-limit/timeout — biggest gap |
| 2 | Add `litellm.supports_function_calling(model=)` guard before passing tools | `llm_client.py` | ~397 | Prevents tool-call errors on incapable models |
| 3 | Return model objects with `supports_tools` metadata instead of bare strings | `views/__init__.py` | `models()` action | Frontend can warn/adapt per model capability |
| 4 | Replace hardcoded `model="gpt-4o-mini"` with provider config default | `day_suggestions.py` | 194 | Respects user's configured provider |
### Long-Term Recommendations
1. **Curated model registry** (YAML/JSON file like OpenCode's `models.dev`) with capabilities, costs, context limits — loaded at startup
2. **LiteLLM Proxy sidecar** — only if/when Voyage gains multi-user production deployment
3. **WSGI→ASGI migration** — long-term fix for event loop fragility (out of scope)
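Recommendation 1, combined with fix #3's `supports_tools` metadata, can be prototyped as a small static registry. Registry contents and field names below are illustrative, not a proposed final schema:

```python
import json

# Illustrative registry; in practice this would live in a checked-in
# YAML/JSON file loaded once at startup.
MODEL_REGISTRY_JSON = """
{
  "openai/gpt-4o-mini": {"supports_tools": true,  "context_window": 128000},
  "openai/o1-mini":     {"supports_tools": false, "context_window": 128000}
}
"""

REGISTRY = json.loads(MODEL_REGISTRY_JSON)

def models_for_dropdown():
    """Return model objects (not bare strings) so the frontend can
    warn or adapt per model capability."""
    return [{"id": model_id, **meta} for model_id, meta in sorted(REGISTRY.items())]

def tools_allowed(model_id):
    # Capability guard: never pass tool schemas to a model that cannot
    # do tool calls (the o1-* exclusion from the critic gate).
    return REGISTRY.get(model_id, {}).get("supports_tools", False)

print(tools_allowed("openai/o1-mini"))  # → False
```

Defaulting unknown models to `supports_tools: False` fails closed, which matches the "capability guards are standard" pattern noted below.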
### Key Patterns Observed in Other Projects
- **No production project does universal runtime model discovery** — all use curated/admin-managed lists
- **Every production LiteLLM user has retry logic** — Voyage is the outlier with zero retries
- **Tool-call capability guards** are standard (`litellm.supports_function_calling()` used by PraisonAI, open-interpreter, mem0, ragbits, dspy)
- **Rate-limit resilience** ranges from simple `num_retries` to full `litellm.Router` with `RetryPolicy` and cross-model fallbacks