fix(chat): add saved AI defaults and harden suggestions
2 .github/copilot-instructions.md vendored
@@ -16,7 +16,7 @@ Voyage is **pre-release** — not yet in production use. During pre-release:
**Key architectural pattern — API Proxy**: The frontend never calls the Django backend directly. All API calls go to `src/routes/api/[...path]/+server.ts`, which proxies requests to the Django server (`http://server:8000`), injecting CSRF tokens and managing session cookies. This means frontend fetches use relative URLs like `/api/locations/`.
**AI Chat**: The AI travel chat assistant is embedded in Collections → Recommendations (component: `AITravelChat.svelte`). There is no standalone `/chat` route. Chat providers are loaded dynamically from `GET /api/chat/providers/` (backed by LiteLLM runtime list + custom entries like `opencode_zen`). Chat conversations stream via SSE through `/api/chat/conversations/`. Provider config lives in `backend/server/chat/llm_client.py` (`CHAT_PROVIDER_CONFIG`). Chat composer supports per-provider model override via dropdown selector fed by `GET /api/chat/providers/{provider}/models/` (persisted in browser `localStorage` key `voyage_chat_model_prefs`). Collection chats inject multi-stop itinerary context and the system prompt guides `get_trip_details`-first reasoning. LiteLLM errors are mapped to sanitized user-safe messages via `_safe_error_payload()` (never exposes raw exception text).
**AI Chat**: The AI travel chat assistant is embedded in Collections → Recommendations (component: `AITravelChat.svelte`). There is no standalone `/chat` route. Chat providers are loaded dynamically from `GET /api/chat/providers/` (backed by LiteLLM runtime list + custom entries like `opencode_zen`). Chat conversations stream via SSE through `/api/chat/conversations/`. Provider config lives in `backend/server/chat/llm_client.py` (`CHAT_PROVIDER_CONFIG`). Default AI provider/model saved via `UserAISettings` in DB (authoritative over browser localStorage). Chat composer supports per-provider model override via dropdown selector fed by `GET /api/chat/providers/{provider}/models/` (persisted in browser `localStorage` key `voyage_chat_model_prefs`). Collection chats inject multi-stop itinerary context and the system prompt guides `get_trip_details`-first reasoning. LiteLLM errors are mapped to sanitized user-safe messages via `_safe_error_payload()` (never exposes raw exception text). Invalid tool calls (missing required args) are detected and short-circuited with a user-visible error — not replayed into history.
**Services** (docker-compose):
- `web` → SvelteKit frontend at `:8015`
298 .memory/decisions.md Normal file
@@ -0,0 +1,298 @@

# Voyage — Decisions Log

## Fork from AdventureLog

- **Decision**: Fork AdventureLog and rebrand as Voyage
- **Rationale**: Build on a proven foundation while adding an itinerary UI, OSRM routing, an LLM travel agent, and lodging logic
- **Date**: Project inception

## Docker-Only Backend Development

- **Decision**: Backend development requires Docker; local Python pip install is not supported
- **Rationale**: Complex GDAL/PostGIS dependencies; pip install fails with network timeouts
- **Impact**: All backend commands run via `docker compose exec server`

## API Proxy Pattern

- **Decision**: Frontend proxies all API calls through SvelteKit server routes
- **Rationale**: Handles CSRF tokens and session cookies transparently; avoids CORS issues
- **Reference**: See [knowledge/overview.md](knowledge/overview.md#api-proxy-pattern)

## Package Manager: Bun (Frontend)

- **Decision**: Use Bun as the frontend package manager (`bun.lock` present)
- **Note**: npm scripts are still used for build/lint/check commands

## Tooling Preference: Bun + uv

- **Decision**: Prefer `bun` for frontend workflows and `uv` for Python workflows.
- **Rationale**: User preference for faster, consistent package/runtime tooling.
- **Operational rule**:
  - Frontend: use `bun install` and `bun run <script>` by default.
  - Python: use `uv` for local Python dependency/tooling commands when applicable.
  - Docker-managed backend runtime commands (e.g. `docker compose exec server python3 manage.py ...`) remain unchanged unless project tooling is explicitly migrated.
- **Date**: 2026-03-08

## Workflow Preference: Commit + Merge When Done

- **Decision**: Once a requested branch workstream is complete and validated, commit and merge promptly (do not leave completed branches unmerged).
- **Rationale**: User preference for immediate integration and reduced branch drift.
- **Operational rule**:
  - Ensure quality checks pass for the completed change set.
  - Commit feature branch changes.
  - Merge into the target branch without delay.
  - Clean up merged worktrees/branches after merge.
- **Date**: 2026-03-08

## WS1-F2 Review: Remove standalone /chat route

- **Verdict**: APPROVED (score 0)
- **Scope**: Deletion of `frontend/src/routes/chat/+page.svelte`; removal of the `/chat` nav item and `mdiRobotOutline` import from `Navbar.svelte`.
- **Findings**: No broken imports, navigation links, or route references remain. All `/chat` string matches in the codebase are `/api/chat/conversations/` backend API proxy calls (correct). Orphaned `navbar.chat` i18n key noted as a minor cleanup suggestion.
- **Reference**: See [Plan: AI travel agent](plans/ai-travel-agent-collections-integration.md#task-ws1-f2)
- **Date**: 2026-03-08

## WS1 Tester Validation: collections-ai-agent worktree

- **Status**: PASS (both Standard and Adversarial passes)
- **Build**: `bun run build` artifacts validated via `.svelte-kit/adapter-node` and `build/` output. No `/chat` route in the compiled manifest. `AITravelChat` SSR-inlined into the collections page at `currentView === 'recommendations'` with `embedded: true`.
- **Key findings**:
  - All 12 i18n keys used in `AITravelChat.svelte` confirmed present in `en.json`.
  - No `mdiRobotOutline`, `/chat` href, or chat nav references in any source `.svelte` files.
  - `Navbar.svelte` contains zero chat or robot icon references.
  - `CollectionRecommendationView` still renders after `AITravelChat` in the recommendations view.
  - Build output is current: the adapter-node manifest has 29 nodes (0–28) with no `/chat` page route.
- **Adversarial**: 3 hypotheses tested (broken i18n keys, orphaned chat imports, missing embedded prop); all negative.
- **MUTATION_ESCAPES**: 1/5 (minor: `embedded` prop boolean default not type-enforced; runtime safe).
- **Reference**: See [Plan: AI travel agent](plans/ai-travel-agent-collections-integration.md)
- **Date**: 2026-03-08

## Consolidated Review: AI Travel Agent + Collections + Provider Catalog

- **Verdict**: APPROVED (score 0)
- **Scope**: Full consolidated implementation in the `collections-ai-agent` worktree — backend provider catalog endpoint (`GET /api/chat/providers/`), `CHAT_PROVIDER_CONFIG` with OpenCode Zen, dynamic provider selectors in `AITravelChat.svelte` and `settings/+page.svelte`, `ChatProviderCatalogEntry` type, chat embedding in Collections Recommendations, `/chat` route removal.
- **Acceptance verification**:
  - AI chat embedded in Collections Recommendations: `collections/[id]/+page.svelte:1264` renders `<AITravelChat embedded={true} />` inside `currentView === 'recommendations'`.
  - No `/chat` route: `frontend/src/routes/chat/` directory absent; no Navbar chat/robot references.
  - All LiteLLM providers listed: `get_provider_catalog()` iterates `litellm.provider_list` (128 providers) and appends custom `CHAT_PROVIDER_CONFIG` entries.
  - OpenCode Zen supported: `opencode_zen` in `CHAT_PROVIDER_CONFIG` with `api_base=https://opencode.ai/zen/v1`, `default_model=openai/gpt-4o-mini`.
- **Security**: `IsAuthenticated` on all chat endpoints, `get_queryset` scoped to `user=self.request.user`, no IDOR risk, API keys never exposed in the catalog response, provider IDs validated before use.
- **Prior findings confirmed**: WS1-F2 removal review, WS1 tester validation, and LiteLLM provider research all remain valid and match the implementation.
- **Reference**: See [Plan: AI travel agent](plans/ai-travel-agent-collections-integration.md)
- **Date**: 2026-03-08

## Consolidated Tester Validation: collections-ai-agent worktree (Full Consolidation)

- **Status**: PASS (both Standard and Adversarial passes)
- **Pipeline inputs validated**: Frontend build (`bun run format+lint+check+build` → PASS, 0 errors, 6 expected warnings); backend system check (`manage.py check` → PASS, 0 issues).
- **Key findings**:
  - All 12 i18n keys in `AITravelChat.svelte` confirmed present in `en.json`.
  - No `/chat` route file, no Navbar `/chat` href or `mdiRobotOutline` in any `.svelte` source.
  - The only `/chat` references are API proxy calls (`/api/chat/...`) — correct.
  - `ChatProviderCatalogEntry` type defined in `types.ts`; used correctly in both `AITravelChat.svelte` and `settings/+page.svelte`.
  - `opencode_zen` in `CHAT_PROVIDER_CONFIG` with `api_base`, appended by the second loop in `get_provider_catalog()` since it is not in `litellm.provider_list`.
  - Provider validation in the `send_message` view uses `is_chat_provider_available()` → 400 on invalid providers.
  - All agent tool functions scope DB queries to `user=user`.
  - `AITravelChat embedded={true}` correctly placed at `collections/[id]/+page.svelte:1264`.
- **Adversarial**: 5 hypotheses tested:
  1. `None`/empty provider_id → `_normalize_provider_id` returns `""` → `is_chat_provider_available` returns `False` → 400 (safe).
  2. Provider not in `CHAT_PROVIDER_CONFIG` → rejected at the `send_message` level → 400 (correct).
  3. `opencode_zen` not in `litellm.provider_list` → the catalog's second loop covers it (correct).
  4. `tool_iterations` never incremented → the `MAX_TOOL_ITERATIONS` guard is dead code; an infinite tool loop is theoretically possible — **pre-existing bug**, same pattern in the `main` branch, not introduced by this change.
  5. `api_base` exposed in the catalog response — pre-existing, non-exploitable information leakage noted in the prior security review.
- **MUTATION_ESCAPES**: 2/6 (`tool_iterations` dead guard; `embedded` boolean default not type-enforced — both pre-existing, runtime safe).
- **Lesson checks**: All prior WS1 and security review findings confirmed; no contradictions.
- **Reference**: See [Plan: AI travel agent](plans/ai-travel-agent-collections-integration.md)
- **Date**: 2026-03-08
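The fail-closed provider validation exercised by adversarial hypotheses 1 and 2 can be sketched as follows. This is a minimal illustration, not the actual source: the real `_normalize_provider_id()` / `is_chat_provider_available()` live in `backend/server/chat/llm_client.py`, and the config entries here are placeholders.

```python
# Placeholder config; the real CHAT_PROVIDER_CONFIG has many more entries.
CHAT_PROVIDER_CONFIG = {
    "openai": {},
    "opencode_zen": {"api_base": "https://opencode.ai/zen/v1"},
}

def normalize_provider_id(provider_id) -> str:
    # Any non-string input (None, numbers) collapses to "" so lookups fail closed.
    if not isinstance(provider_id, str):
        return ""
    return provider_id.strip().lower()

def is_chat_provider_available(provider_id) -> bool:
    # An empty normalized ID is never a config key, so the view returns 400.
    return normalize_provider_id(provider_id) in CHAT_PROVIDER_CONFIG
```

A `send_message`-style view would reject any request for which `is_chat_provider_available()` is false before ever touching LiteLLM.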

## Consolidated Security Review: collections-ai-agent worktree

- **Verdict**: APPROVED (score 3)
- **Lens**: Correctness + Security
- **Scope**: Provider validation, API key handling, `api_base`/SSRF risks, auth/permission on the provider catalog, `/chat` removal regressions.
- **Findings**:
  - WARNING: `api_base` field exposed in the provider catalog response to the frontend despite the frontend never using it (`llm_client.py:112,141`). Non-exploitable (server-defined constants), but unnecessary information leakage. (confidence: MEDIUM)
  - No CRITICAL issues found.
- **Security verified**:
  - Provider IDs validated against the `CHAT_PROVIDER_CONFIG` whitelist before any LLM call.
  - API keys Fernet-encrypted at rest, scoped to the authenticated user, never returned in responses.
  - `api_base` is server-hardcoded only (no user input path) — no SSRF.
  - Provider catalog endpoint requires `IsAuthenticated`; returns the same static catalog for all users.
  - Tool execution uses whitelist dispatch plus allowed-kwargs filtering; all data queries scoped to `user=user`.
  - No IDOR: conversations filtered by user in the queryset; tool operations filter/get by user.
- **Prior reviews confirmed**: WS1-F2 APPROVED and WS1 tester PASS findings remain consistent in the consolidated branch.
- **Safe to proceed to testing**: Yes.
- **Reference**: See [Plan: AI travel agent](plans/ai-travel-agent-collections-integration.md)
- **Date**: 2026-03-08

## Critic Gate: OpenCode Zen Connection Error Fix

- **Verdict**: APPROVED
- **Scope**: Change the default model from `openai/gpt-4o-mini` to `openai/gpt-5-nano`, improve error surfacing with sanitized messages, and clean up `tool_choice`/`tools` kwargs — all in `backend/server/chat/llm_client.py`.
- **Key guardrails**: (1) Error surfacing must NOT forward raw `exc.message` — map LiteLLM exception types to safe user-facing categories. (2) The `@mdi/js` build failure is a missing `bun install`, not a baseline issue — run `bun install` before validation. (3) WSGI→ASGI migration and `sync_to_async` ORM fixes are explicitly out of scope.
- **Challenges accepted**: `gpt-5-nano` validity is research-based, not live-verified; mitigated by the error-surfacing fix making any remaining mismatch diagnosable.
- **Files evaluated**: `backend/server/chat/llm_client.py:59-64,225-234,274-276`, `frontend/src/lib/components/AITravelChat.svelte:4`, `frontend/package.json:44`
- **Reference**: See [Plan: OpenCode Zen connection error](plans/opencode-zen-connection-error.md#critic-gate)
- **Date**: 2026-03-08

## Security Review: OpenCode Zen Connection Error + Model Selection

- **Verdict**: APPROVED (score 3)
- **Lens**: Security
- **Scope**: Sanitized error handling, model override input validation, auth/permission integrity on `send_message`, localStorage usage for model preferences.
- **Files reviewed**: `backend/server/chat/views.py`, `backend/server/chat/llm_client.py`, `frontend/src/lib/components/AITravelChat.svelte`, `backend/server/chat/agent_tools.py`, `backend/server/integrations/models.py`, `frontend/src/lib/types.ts`
- **Findings**: No CRITICAL issues. 1 WARNING: pre-existing `api_base` exposure in the provider catalog response (carried forward from the prior review, decisions.md:103). Error surfacing uses class-based dispatch to hardcoded safe strings — critic guardrail confirmed satisfied. Model input is used only as a JSON field to `litellm.acompletion()` — no injection surface. Auth/IDOR protections unchanged. localStorage stores only `{provider_id: model_string}` — no secrets.
- **Prior findings**: All confirmed consistent (`api_base` exposure, provider validation, IDOR scoping, error sanitization guardrail).
- **Reference**: See [Plan: OpenCode Zen connection error](plans/opencode-zen-connection-error.md#reviewer-security-verdict)
- **Date**: 2026-03-08

## Tester Validation: OpenCode Zen Model Selection + Error Surfacing

- **Status**: PASS (both Standard and Adversarial passes)
- **Pipeline inputs validated**: `manage.py check` (PASS, 0 issues), `bun run check` (PASS, 0 errors, 6 pre-existing warnings), `bun run build` (Vite compilation PASS; EACCES on the `build/` dir is a pre-existing Docker permission issue), backend 30 tests (6 pre-existing failures matching the documented baseline).
- **Key targeted verifications**:
  - `opencode_zen` default model confirmed as `openai/gpt-5-nano` (changed from `gpt-4o-mini`).
  - `stream_chat_completion` accepts `model=None` with correct `None or default` fallback logic.
  - All empty/falsy model values (`""`, `" "`, `None`, `False`, `0`) produce `None` in `views.py` — default fallback engaged.
  - All 6 LiteLLM exception classes (`NotFoundError`, `AuthenticationError`, `RateLimitError`, `BadRequestError`, `Timeout`, `APIConnectionError`) produce sanitized hardcoded payloads — no raw exception text, `api_base`, or sensitive data leaked.
  - `_is_model_override_compatible` correctly bypasses the prefix check for `api_base` gateways (opencode_zen) and enforces the prefix for standard providers.
  - `tools`/`tool_choice` conditionally excluded from LiteLLM kwargs when `tools` is falsy.
  - i18n keys `chat.model_label` and `chat.model_placeholder` confirmed in `en.json`.
- **Adversarial**: 9 hypotheses tested; all negative (no failures). Notable: `openai\n` normalizes to `openai` via `strip()` — correct and consistent with `views.py`.
- **MUTATION_ESCAPES**: 0/5 — all 5 mutation checks detected by test cases.
- **Pre-existing issues** (not introduced): `_merge_tool_call_delta` has no upper bound on index (large-index DoS); `tool_iterations` never-incremented dead guard.
- **Reference**: See [Plan: OpenCode Zen connection error](plans/opencode-zen-connection-error.md#tester-verdict-standard--adversarial)
- **Date**: 2026-03-08
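The class-based dispatch verified above can be sketched like this. It is illustrative only: the message strings are assumptions, the real mapping lives in `_safe_error_payload()` in `backend/server/chat/llm_client.py`, and the exception classes are local stand-ins so the sketch runs without litellm installed.

```python
# Stand-ins for the LiteLLM exception classes named above.
class NotFoundError(Exception): ...
class AuthenticationError(Exception): ...
class RateLimitError(Exception): ...

# Hardcoded safe strings keyed by class name; contents are illustrative.
SAFE_MESSAGES = {
    "NotFoundError": "The selected model was not found for this provider.",
    "AuthenticationError": "The provider rejected the configured API key.",
    "RateLimitError": "The provider is rate limiting requests; try again shortly.",
    "BadRequestError": "The provider rejected the request.",
    "Timeout": "The provider took too long to respond.",
    "APIConnectionError": "Could not connect to the provider.",
}

def safe_error_payload(exc: Exception) -> dict:
    # Dispatch on the class name only; never include str(exc) or exc.message,
    # which may carry api_base URLs or raw provider responses.
    message = SAFE_MESSAGES.get(type(exc).__name__, "An unexpected error occurred.")
    return {"type": "error", "message": message}
```

Because the payload is built purely from the exception type, raw provider text can never leak into the SSE stream, which is the property the tester's mutation checks target.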

## Correctness Review: chat-loop-hardening

- **Verdict**: APPROVED (score 6)
- **Lens**: Correctness
- **Scope**: Required-argument tool-error short-circuit in the `send_message()` streaming loop, historical replay filtering in `_build_llm_messages()`, tool description improvements in `agent_tools.py`, and the `tool_iterations` increment fix.
- **Files reviewed**: `backend/server/chat/views/__init__.py`, `backend/server/chat/agent_tools.py`, `backend/server/chat/llm_client.py` (no hardening changes — confirmed stable)
- **Acceptance criteria verification**:
  - AC1 (no repeated invalid-arg loops): ✓ — `_is_required_param_tool_error()` detects patterns via a hardcoded set plus regex. `return` exits the generator after the error event and `[DONE]`.
  - AC2 (error payloads not replayed): ✓ — the short-circuit skips persistence; `_build_llm_messages()` filters historical tool-error messages.
  - AC3 (stream terminates coherently): ✓ — all 4 exit paths yield `[DONE]`.
  - AC4 (successful tool flows preserved): ✓ — the new check is a pass-through for non-error results.
- **Findings**:
  - WARNING: [views/__init__.py:389-401] Multi-tool-call orphan state. When the model returns N tool calls and call K (K>1) fails required-param validation, calls 1..K-1 are already persisted but the assistant message references all N tool_call IDs. The missing tool result causes LLM API errors on the next conversation turn (caught by `_safe_error_payload`). (confidence: HIGH)
  - WARNING: [views/__init__.py:64-69] `_build_llm_messages` filters tool-role error messages but does not trim the corresponding assistant `tool_calls` array, creating the same orphan for historical messages. (confidence: HIGH)
- **Suggestions**: The `get_weather` error `"dates must be a non-empty list"` (agent_tools.py:601) does not match the `is/are required` regex. Mitigated by the `MAX_TOOL_ITERATIONS` guard.
- **Prior findings**: The `tool_iterations` never-incremented bug (decisions.md:91,149) is now fixed — line 349 increments correctly. Confirmed resolved.
- **Reference**: See [Plan: chat-provider-fixes](plans/chat-provider-fixes.md#follow-up-fixes)
- **Date**: 2026-03-09
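The AC1 detection check can be sketched as below. This is a hedged reconstruction: the hardcoded set and regex in the real `_is_required_param_tool_error()` (in `views/__init__.py`) may differ from these illustrative values.

```python
import re

# Illustrative known-error set and pattern; not the actual source values.
_KNOWN_ERRORS = {"missing required arguments"}
_REQUIRED_RE = re.compile(r"\b\w+ (?:is|are) required\b")

def is_required_param_tool_error(result: dict) -> bool:
    # Pass-through for non-error results (AC4): no "error" key means no match.
    error = str(result.get("error") or "").lower()
    if not error:
        return False
    return error in _KNOWN_ERRORS or bool(_REQUIRED_RE.search(error))
```

Note how an error phrased like `"dates must be a non-empty list"` slips past an `is/are required` pattern, which is exactly the gap the Suggestions finding calls out for `get_weather`.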

## Correctness Review: OpenCode Zen Model Selection + Error Surfacing

- **Verdict**: APPROVED (score 0)
- **Lens**: Correctness
- **Scope**: Model selection in the chat composer, per-provider browser persistence, optional model override to the backend, error category mapping, and the OpenCode Zen default model fix across 4 files.
- **Files reviewed**: `frontend/src/lib/components/AITravelChat.svelte`, `frontend/src/locales/en.json`, `backend/server/chat/views.py`, `backend/server/chat/llm_client.py`, `frontend/src/lib/types.ts`
- **Findings**: No CRITICAL or WARNING issues. Two optional SUGGESTIONS (debounce localStorage writes on model input; add a clarifying comment on the `getattr` fallback pattern in `_safe_error_payload`).
- **Verified paths**:
  - Model override end-to-end: frontend `trim() || undefined` → backend `strip() or None` → `stream_chat_completion(model=model)` → `completion_kwargs["model"] = model or default` — null/empty falls back correctly.
  - Per-provider persistence: `loadModelPref`/`saveModelPref` via `localStorage` with JSON parse error handling and SSR guards. Reactive blocks verified free of infinite loops via the `initializedModelProvider` sentinel.
  - Model-provider compatibility: `_is_model_override_compatible` skips validation for `api_base` gateways (OpenCode Zen), validates the prefix for standard providers, and allows bare model names.
  - Error surfacing: 6 LiteLLM exception types mapped to sanitized messages; no raw `exc.message` exposure; critic guardrail satisfied.
  - Tools/tool_choice: conditionally included only when `tools` is truthy; no `None` kwargs passed to LiteLLM.
  - i18n: `chat.model_label` and `chat.model_placeholder` confirmed in `en.json`.
  - Type safety: `ChatProviderCatalogEntry.default_model: string | null` handled with null-safe operators throughout.
- **Prior findings**: Critic gate guardrails (decisions.md:117-124) all confirmed followed. `api_base` catalog exposure (decisions.md:103) unchanged/pre-existing. `tool_iterations` never-incremented bug (decisions.md:91) pre-existing, not affected.
- **Reference**: See [Plan: OpenCode Zen connection error](plans/opencode-zen-connection-error.md#reviewer-correctness-verdict)
- **Date**: 2026-03-08
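The compatibility rule verified above can be sketched as follows. The rules are reconstructed from the review text, not copied from `_is_model_override_compatible` itself, and the config entries are placeholders.

```python
# Placeholder provider config for the sketch.
CHAT_PROVIDER_CONFIG = {
    "openai": {},
    "opencode_zen": {"api_base": "https://opencode.ai/zen/v1"},
}

def is_model_override_compatible(provider: str, model: str) -> bool:
    entry = CHAT_PROVIDER_CONFIG.get(provider, {})
    if entry.get("api_base"):
        return True  # gateway providers route any model string
    if "/" not in model:
        return True  # bare model names are allowed
    # Standard providers: the "provider/" prefix must match.
    return model.split("/", 1)[0] == provider
```

Skipping the prefix check for gateways is what lets OpenCode Zen accept models like `openai/gpt-5-nano` while standard providers still reject cross-provider overrides.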

## Critic Gate: Travel Agent Context + Models Follow-up

- **Verdict**: APPROVED
- **Scope**: Three follow-up fixes — F1 (expand the opencode_zen model dropdown), F2 (collection-level context injection instead of single-location), F3 (itinerary-centric quick-action prompts plus the `.places`→`.results` bug fix).
- **Key findings**: All source-level edit points verified current. The F3a `.places`/`.results` key mismatch is confirmed as a critical rendering bug (place cards never display). F2's `values_list("name")` alone is insufficient — `city__name`/`country__name` are needed for geocodable context. The F1 model list should exclude reasoning models (`o1-preview`, `o1-mini`) pending tool-use compatibility verification.
- **Execution order**: F1 → F2 → F3 (F3 depends on F2's `deriveCollectionDestination` changes).
- **Files evaluated**: `backend/server/chat/views/__init__.py:144-168,417-418`, `backend/server/chat/llm_client.py:310-358`, `backend/server/chat/agent_tools.py:128,311-391`, `frontend/src/lib/components/AITravelChat.svelte:44,268,372-386,767-804`, `frontend/src/routes/collections/[id]/+page.svelte:259-280,1287-1294`, `backend/server/adventures/models.py:153-170,275-307`
- **Reference**: See [Plan: Travel agent context + models](plans/travel-agent-context-and-models.md#critic-gate)
- **Date**: 2026-03-09

## WS1 Configuration Infrastructure Backend Review

- **Verdict**: CHANGES-REQUESTED (score 6)
- **Lens**: Correctness + Security
- **Scope**: WS1 backend implementation — `settings.py` env vars, the `llm_client.py` fallback chain and catalog enhancement, `UserAISettings` model/serializer/ViewSet/migration, provider catalog user passthrough in `chat/views.py`.
- **Findings**:
  - WARNING: Redundant instance-key fallback in `stream_chat_completion()` at `llm_client.py:328-331`. `get_llm_api_key()` (lines 262-266) already implements identical fallback logic; the duplicate creates divergence risk. (confidence: HIGH)
  - WARNING: The `VOYAGE_AI_MODEL` env var is defined at `settings.py:408` but never consumed by any code. Instance admins who set it will see no effect — model selection uses `CHAT_PROVIDER_CONFIG[provider]["default_model"]` or the user override. This false promise creates support burden. (confidence: HIGH)
- **Security verified**:
  - The instance API key (`VOYAGE_AI_API_KEY`) is only returned when the provider matches `VOYAGE_AI_PROVIDER` — no cross-provider key leakage.
  - The `UserAISettings` endpoint requires `IsAuthenticated`; queryset scoped to `request.user`; no IDOR.
  - Catalog `instance_configured`/`user_configured` booleans expose only presence (not key values) — safe.
  - N+1 avoided: a single `values_list()` prefetch for user API keys in `get_provider_catalog()`.
  - The migration correctly depends on `0007_userapikey_userrecommendationpreferenceprofile` plus the swappable `AUTH_USER_MODEL`.
  - The ViewSet follows the exact pattern of the existing `UserRecommendationPreferenceProfileViewSet` (singleton upsert via `_upserted_instance`).
- **Suggestions**: (1) `ModelViewSet` exposes unneeded DELETE/PUT/PATCH — could be restricted to Create+List mixins. (2) `preferred_model` max_length=100 may be tight for future model names.
- **Next**: Remove the redundant fallback at lines 328-331 in `llm_client.py`. Wire `VOYAGE_AI_MODEL` into model resolution or remove it from settings.
- **Prior findings**: `api_base` catalog exposure (decisions.md:103) still pre-existing. The `_upserted_instance` thread-safety pattern is consistent with existing code — pre-existing, not new.
- **Reference**: See [Plan: AI travel agent redesign](plans/ai-travel-agent-redesign.md#ws1-configuration-infrastructure)
- **Date**: 2026-03-08

## Correctness Review: suggestion-add-flow

- **Verdict**: APPROVED (score 3)
- **Lens**: Correctness
- **Scope**: Day suggestions provider/model resolution, suggestion normalization, and the add-item flow creating a location plus itinerary entry.
- **Files reviewed**: `backend/server/chat/views/day_suggestions.py`, `frontend/src/lib/components/collections/ItinerarySuggestionModal.svelte`, plus cross-referenced `llm_client.py`, `location_view.py`, `models.py`, `serializers.py`, `CollectionItineraryPlanner.svelte`.
- **Findings**:
  - WARNING: Hardcoded `"gpt-4o-mini"` fallback at `day_suggestions.py:251` — if the provider config has no `default_model` and no model is resolved, this falls back to an OpenAI model string even for non-OpenAI providers, contradicting the "no hardcoded OpenAI" acceptance criterion at the deepest fallback layer. (confidence: HIGH)
  - No CRITICAL issues.
- **Verified paths**:
  - Provider/model resolution follows the correct precedence: request → `UserAISettings` → `VOYAGE_AI_PROVIDER`/`VOYAGE_AI_MODEL` → provider config default. `VOYAGE_AI_MODEL` is now consumed (resolves the prior WARNING from decisions.md:186).
  - Add-item flow: `handleAddSuggestion` → `buildLocationPayload` → POST `/api/locations/` (name/description/location/rating/collections/is_public) → `dispatch('addItem', {type, itemId, updateDate})` → parent `addItineraryItemForObject`. The event shape matches the parent handler exactly.
  - Normalization: `normalizeSuggestionItem` handles LLM response variants (title/place_name/venue, summary/details, address/neighborhood) defensively. Items without a resolvable name are dropped. `normalizeRating` safely extracts numeric values. Not overly broad.
  - Auth: `IsAuthenticated` plus a collection owner/shared_with check. CSRF handled by the API proxy. No IDOR.
- **Next**: Replace `or "gpt-4o-mini"` on line 251 with a raise or log when no model is resolved, removing the last OpenAI-specific hardcoding.
- **Reference**: See [Plan: Chat provider fixes](plans/chat-provider-fixes.md#suggestion-add-flow)
- **Date**: 2026-03-09
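The four-level precedence verified above can be sketched as a pair of `or`-chains. The function name and argument shape are assumptions for illustration; the real resolution lives in `day_suggestions.py` (`_resolve_provider_and_model`).

```python
# Illustrative precedence: request → UserAISettings → instance env → provider default.
def resolve_provider_and_model(request_provider, request_model,
                               user_settings, instance_provider, instance_model,
                               provider_config):
    provider = (
        request_provider
        or (user_settings or {}).get("preferred_provider")
        or instance_provider
    )
    model = (
        request_model
        or (user_settings or {}).get("preferred_model")
        or instance_model
        or provider_config.get(provider, {}).get("default_model")
    )
    return provider, model
```

The last `or` is where the flagged `"gpt-4o-mini"` literal sat; in this sketch the chain simply yields `None` when no default exists, leaving the caller to fail loudly.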

## Correctness Review: default-ai-settings

- **Verdict**: APPROVED (score 0)
- **Lens**: Correctness + Security
- **Scope**: DB-backed default AI provider/model settings — Settings UI save/reload, chat component initialization from saved defaults, backend `send_message` fallback, localStorage override prevention.
- **Files reviewed**: `frontend/src/routes/settings/+page.server.ts` (lines 112-121, 146), `frontend/src/routes/settings/+page.svelte` (lines 50-173, 237-239, 1676-1733), `frontend/src/lib/components/AITravelChat.svelte` (lines 82-134, 199-212), `backend/server/chat/views/__init__.py` (lines 183-216), `backend/server/integrations/views/ai_settings_view.py`, `backend/server/integrations/serializers.py` (lines 104-114), `backend/server/integrations/models.py` (lines 129-146), `frontend/src/lib/types.ts`, `frontend/src/locales/en.json`.
- **Acceptance criteria**:
  1. ✅ Settings UI save/reload: server-side loads `aiSettings` (page.server.ts:112-121), frontend initializes with normalization (page.svelte:50-51), saves via POST with re-validation (page.svelte:135-173), template renders provider/model selects (page.svelte:1676-1733).
  2. ✅ Chat initializes from saved defaults: `loadUserAISettings()` fetches from the DB (AITravelChat:87-107); `applyInitialDefaults()` applies them with validation (AITravelChat:109-134).
  3. ✅ localStorage does not override the DB: `saveModelPref()` writes only (AITravelChat:199-212); the old `loadModelPref()` reader was removed.
  4. ✅ Backend fallback safe: requested → preferred (if available) → "openai" (views/__init__.py:195-201); model gated by `provider == preferred_provider` (views/__init__.py:204).
- **Verified paths**:
  - Provider normalization is consistent (`.trim().toLowerCase()`) across settings, chat, and backend. Model normalization (`.trim()` only) is correct — model IDs are case-sensitive.
  - Upsert semantics correct: `perform_create` checks for an existing row and updates in place. Returns 200 OK; frontend checks `res.ok`. Matches the `OneToOneField` constraint.
  - CSRF: transparent via the API proxy. Auth: `IsAuthenticated` plus a user-scoped queryset. No IDOR.
  - Empty/null edge cases: `preferred_model: defaultAiModel || null` sends null for empty. Backend `or ""` normalization handles None. Robust.
  - Stale provider/model: validated against configured providers (page.svelte:119) and loaded models (page.svelte:125-127); falls back correctly.
  - Async ordering: sequential awaits are correct (loadProviderCatalog → initializeDefaultAiSettings; Promise.all → applyInitialDefaults).
  - Race prevention: `initialDefaultsApplied` flag, `loadedModelsForProvider` guard.
  - Contract: serializer fields match the frontend `UserAISettings` type; the POST body matches the serializer.
- **No CRITICAL or WARNING findings.**
- **Prior findings confirmed**: `preferred_model` max_length=100 and `ModelViewSet` excess methods (decisions.md:212) remain pre-existing, not introduced here.
- **Reference**: See [Plan: Chat provider fixes](plans/chat-provider-fixes.md#default-ai-settings)
- **Date**: 2026-03-09
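The backend fallback of acceptance criterion 4 can be sketched as below. Function names here are illustrative; the behavior (requested → saved preference if available → `"openai"`, with the saved model honored only for the provider it was saved for) is taken from the review.

```python
def resolve_chat_provider(requested, preferred, available):
    # requested → saved preference (if still an available provider) → "openai"
    for candidate in (requested, preferred):
        provider = (candidate or "").strip().lower()
        if provider in available:
            return provider
    return "openai"

def resolve_default_model(provider, preferred_provider, preferred_model):
    # The saved model applies only when the resolved provider matches the one
    # it was saved for, preventing cross-provider model mismatches.
    if provider == (preferred_provider or "").strip().lower():
        return (preferred_model or "").strip() or None
    return None
```

Gating the model on the provider match is what keeps a stale saved model (say, an OpenCode Zen model ID) from being sent to OpenAI after the provider fallback kicks in.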

## Re-Review: suggestion-add-flow (OpenAI fallback removal)

- **Verdict**: APPROVED (score 0)
- **Lens**: Correctness (scoped re-review)
- **Scope**: Verification that the WARNING from decisions.md:224 (hardcoded `or "gpt-4o-mini"` fallback in `_get_suggestions_from_llm`) is resolved, with no new issues introduced.
- **Original finding resolved**: ✅ — `day_suggestions.py:251` now reads `resolved_model = model or provider_config.get("default_model")` with no OpenAI fallback. Lines 252-253 raise `ValueError("No model configured for provider")` if `resolved_model` is falsy. Grep confirms zero `gpt-4o-mini` occurrences in `backend/server/chat/`.
- **No new issues introduced**:
  - The `ValueError` at line 253 is safely caught by the `except Exception` at line 87, returning a generic 500 response.
  - `CHAT_PROVIDER_CONFIG.get(provider, {})` at line 250 handles a `None` provider safely (returns `{}`).
  - Double resolution of `provider_config` (once in `_resolve_provider_and_model:228`, again in `_get_suggestions_from_llm:250`) is redundant but harmless — a defensive fallback consistent with the streaming chat path.
  - The provider resolution chain at lines 200-241 is intact: request → user settings → instance settings → OpenAI availability check. Model gated by `provider == preferred_provider` (line 237) prevents cross-provider model mismatches.
- **Reference**: See [Plan: Chat provider fixes](plans/chat-provider-fixes.md#suggestion-add-flow); prior finding at decisions.md:224
- **Date**: 2026-03-09
|
||||
|
||||
## Re-Review: chat-loop-hardening multi-tool-call orphan fix
|
||||
- **Verdict**: APPROVED (score 0)
|
||||
- **Lens**: Correctness (targeted re-review)
|
||||
- **Scope**: Fix for multi-tool-call partial failure orphaned context — `_build_llm_messages()` trimming and `send_message()` successful-prefix persistence.
|
||||
- **Original findings status**:
|
||||
- WARNING (decisions.md:164): Multi-tool-call orphan in streaming loop — **RESOLVED**. `send_message()` now accumulates `successful_tool_calls`/`successful_tool_messages` and persists only those on required-arg failure (lines 365-426). First-call failure correctly omits `tool_calls` from assistant message entirely (line 395 guard).
|
||||
- WARNING (decisions.md:165): `_build_llm_messages` assistant `tool_calls` not trimmed — **RESOLVED**. Lines 59-65 build `valid_tool_call_ids` from non-error tool messages; lines 85-91 filter assistant `tool_calls` to only matching IDs; empty result omits `tool_calls` key entirely.
|
||||
- **New issues introduced**: None. Defensive null handling (`(tool_call or {}).get("id")`) correct. No duplicate persistence risk (failure path returns, success path continues). In-memory `current_messages` and persisted messages stay consistent.
|
||||
- **Reference**: See [Plan: Chat provider fixes](plans/chat-provider-fixes.md#chat-loop-hardening)
|
||||
- **Date**: 2026-03-09
|
||||
|
||||
## Re-Review: normalize_gateway_model + day-suggestions error handling
|
||||
- **Verdict**: APPROVED (score 3)
|
||||
- **Lens**: Correctness
|
||||
- **Scope**: `normalize_gateway_model()` helper in `llm_client.py`, its integration in both `stream_chat_completion()` and `DaySuggestionsView._get_suggestions_from_llm()`, `_safe_error_payload` adoption in day suggestions, `temperature` kwarg removal, and exception logging addition.
|
||||
- **Changes verified**:
|
||||
- `normalize_gateway_model` correctly prefixes bare `opencode_zen` model IDs with `openai/`, passes through all other models, and returns `None` for empty/None input.
|
||||
- `stream_chat_completion:420` calls `normalize_gateway_model` after model resolution but before `supports_function_calling` check — correct ordering.
|
||||
- `day_suggestions.py:266-271` normalizes resolved model and guards against `None` with `ValueError` — resolves prior WARNING about hardcoded `gpt-4o-mini` fallback (decisions.md:224).
|
||||
- `day_suggestions.py:93-106` uses `_safe_error_payload` with status-code mapping dict — LiteLLM exceptions get appropriate HTTP codes (400/401/429/503), `ValueError` falls through to generic 500.
|
||||
- `temperature` kwarg confirmed absent from `completion_kwargs` — resolves `UnsupportedParamsError` on `gpt-5-nano`.
|
||||
- `logger.exception` at line 94 ensures full tracebacks for debugging.
|
||||
- **Findings**:
|
||||
- WARNING: `stream_chat_completion:420` has no `None` guard on `normalize_gateway_model` return, unlike `day_suggestions.py:270-271`. Currently unreachable (resolution chain always yields non-empty model from `CHAT_PROVIDER_CONFIG`), but defensive guard would make contract explicit. (confidence: LOW)
|
||||
- **Prior findings**: hardcoded `gpt-4o-mini` WARNING (decisions.md:224) confirmed resolved. `_safe_error_payload` sanitization guardrail (decisions.md:120) confirmed satisfied.
|
||||
- **Reference**: See [Plan: Chat provider fixes](plans/chat-provider-fixes.md#suggestion-add-flow)
|
||||
- **Date**: 2026-03-09
|
||||
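For reference, the helper's contract as described in the review can be sketched as follows (the exact signature and the prefix check are assumptions; the verified behavior is: prefix bare `opencode_zen` IDs, pass everything else through, `None` for empty input):

```python
def normalize_gateway_model(model, provider):
    """Prefix bare opencode_zen model IDs with 'openai/' so LiteLLM routes
    them through its OpenAI-compatible client at the custom api_base.
    All other models pass through unchanged; empty/None input yields None."""
    if not model:
        return None
    if provider == "opencode_zen" and not model.startswith("openai/"):
        return f"openai/{model}"
    return model
```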
0
.memory/gates/.gitkeep
Normal file
13
.memory/knowledge.md
Normal file
@@ -0,0 +1,13 @@
# DEPRECATED — Migrated to nested structure (2026-03-09)

This file has been superseded. Content has been migrated to:

- **[system.md](system.md)** — Project overview (one paragraph)
- **[knowledge/overview.md](knowledge/overview.md)** — Architecture, services, auth, file locations
- **[knowledge/tech-stack.md](knowledge/tech-stack.md)** — Stack, dev commands, environment, known issues
- **[knowledge/conventions.md](knowledge/conventions.md)** — Coding conventions and workflow rules
- **[knowledge/patterns/chat-and-llm.md](knowledge/patterns/chat-and-llm.md)** — Chat/LLM patterns, agent tools, error mapping, OpenCode Zen
- **[knowledge/domain/collections-and-sharing.md](knowledge/domain/collections-and-sharing.md)** — Collection sharing, itinerary, user preferences
- **[knowledge/domain/ai-configuration.md](knowledge/domain/ai-configuration.md)** — WS1 config infrastructure, frontend gaps

See [manifest.yaml](manifest.yaml) for the full index.
20
.memory/knowledge/conventions.md
Normal file
@@ -0,0 +1,20 @@
# Coding Conventions & Patterns

## Frontend Patterns
- **i18n**: Use `$t('key')` from `svelte-i18n`; add keys to locale files
- **API calls**: Always use `credentials: 'include'` and the `X-CSRFToken` header
- **Svelte reactivity**: Reassign to trigger: `items[i] = updated; items = items`
- **Styling**: DaisyUI semantic classes + Tailwind utilities; use `bg-primary`, `text-base-content`, not raw colors
- **Maps**: `svelte-maplibre` with MapLibre GL; GeoJSON data

## Backend Patterns
- **Views**: DRF `ModelViewSet` subclasses; `get_queryset()` filters by `user=self.request.user`
- **Money**: `djmoney` MoneyField
- **Geo**: PostGIS via `django-geojson`
- **Chat providers**: Dynamic catalog from `GET /api/chat/providers/`; configured in `CHAT_PROVIDER_CONFIG`

## Workflow Conventions
- Do **not** attempt to fix known test/configuration issues as part of feature work
- Use `bun` for frontend commands, `uv` for local Python tooling where applicable
- Commit and merge completed feature branches promptly once validation passes (avoid leaving finished work unmerged)
- See [decisions.md](../decisions.md#workflow-preference-commit--merge-when-done) for rationale
44
.memory/knowledge/domain/ai-configuration.md
Normal file
@@ -0,0 +1,44 @@
# AI Configuration Domain

## WS1 Configuration Infrastructure

### WS1-F1: Instance-level env vars and key fallback
- `settings.py`: `VOYAGE_AI_PROVIDER`, `VOYAGE_AI_MODEL`, `VOYAGE_AI_API_KEY`
- `get_llm_api_key(user, provider)` falls back to instance key only when provider matches `VOYAGE_AI_PROVIDER`
- Fallback chain: user key -> matching-provider instance key -> error
- See [tech-stack.md](../tech-stack.md#server-side-env-vars-from-settingspy), [decisions.md](../../decisions.md#ws1-configuration-infrastructure-backend-review)

### WS1-F2: UserAISettings model
- `integrations/models.py`: `UserAISettings` (OneToOneField to user) with `preferred_provider` and `preferred_model`
- Endpoint: `/api/integrations/ai-settings/` (upsert pattern)
- Migration: `0008_useraisettings.py`

### WS1-F3: Provider catalog enhancement
- `get_provider_catalog(user=None)` adds `instance_configured` and `user_configured` booleans
- User API keys prefetched once per request (no N+1)
- `ChatProviderCatalogEntry` TypeScript type updated with both fields

### Frontend Provider Selection (Fixed)
- No longer hardcodes `selectedProvider = 'openai'`; auto-selects first usable provider
- Filtered to configured+usable entries only (`available_for_chat && (user_configured || instance_configured)`)
- Warning alert + Settings link when no providers configured
- Model selection uses dropdown from `GET /api/chat/providers/{provider}/models/`

## Known Frontend Gaps

### Root Cause of User-Facing LLM Errors
Five compounding issues (all but one resolved):
1. ~~Hardcoded `'openai'` default~~ (fixed: auto-selects first usable)
2. ~~No provider status feedback~~ (fixed: catalog fields consumed)
3. ~~`UserAISettings.preferred_provider` never loaded~~ (fixed: Settings UI saves/loads DB defaults; chat initializes from saved prefs)
4. `FIELD_ENCRYPTION_KEY` not set disables key storage (env-dependent)
5. ~~TypeScript type missing fields~~ (fixed)

## Key Edit Reference Points
| Feature | File | Location |
|---|---|---|
| AI env vars | `backend/server/main/settings.py` | after `FIELD_ENCRYPTION_KEY` |
| Fallback key | `backend/server/chat/llm_client.py` | `get_llm_api_key()` |
| UserAISettings model | `backend/server/integrations/models.py` | after UserAPIKey |
| Catalog user flags | `backend/server/chat/llm_client.py` | `get_provider_catalog()` |
| Provider view | `backend/server/chat/views/__init__.py` | `ChatProviderCatalogViewSet` |
62
.memory/knowledge/domain/collections-and-sharing.md
Normal file
@@ -0,0 +1,62 @@
# Collections & Sharing Domain

## Collection Sharing Architecture

### Data Model
- `Collection.shared_with` - `ManyToManyField(User, related_name='shared_with', blank=True)` - the primary access grant
- `CollectionInvite` - staging table for pending invites: `(collection FK, invited_user FK, unique_together)`; prevents self-invite and double-invite
- Both models in `backend/server/adventures/models.py`

### Access Flow (Invite -> Accept -> Membership)
1. Owner calls `POST /api/collections/{id}/share/{uuid}/` -> creates `CollectionInvite`
2. Invited user sees pending invites via `GET /api/collections/invites/`
3. Invited user calls `POST /api/collections/{id}/accept-invite/` -> adds to `shared_with`, deletes invite
4. Owner revokes: `POST /api/collections/{id}/revoke-invite/{uuid}/`
5. User declines: `POST /api/collections/{id}/decline-invite/`
6. Owner removes: `POST /api/collections/{id}/unshare/{uuid}/` - removes user's locations from collection
7. User self-removes: `POST /api/collections/{id}/leave/` - removes their locations

### Permission Classes
- `CollectionShared` - full access for owner and `shared_with` members; invite actions for invitees; anonymous read for `is_public`
- `IsOwnerOrSharedWithFullAccess` - child-object viewsets; traverses `obj.collections`/`obj.collection` to check `shared_with`
- `ContentImagePermission` - delegates to `IsOwnerOrSharedWithFullAccess` on `content_object`

### Key Design Constraints
- No role differentiation - all shared users have same write access
- On unshare/leave, departing user's locations are removed from collection (not deleted)
- `duplicate` action creates a private copy with no `shared_with` transfer

## Itinerary Architecture

### Primary Component
`CollectionItineraryPlanner.svelte` (~3818 lines) - monolith rendering the entire itinerary view and all modals.

### The "Add" Button
DaisyUI dropdown at bottom of each day card with "Link existing item" and "Create new" submenu (Location, Lodging, Transportation, Note, Checklist).

### Day Suggestions Modal (WS3)
- Component: `ItinerarySuggestionModal.svelte`
- Props: `collection`, `user`, `targetDate`, `displayDate`
- Events: `close`, `addItem { type: 'location', itemId, updateDate }`
- 3-step UX: category selection -> filters -> results from `POST /api/chat/suggestions/day/`

## User Recommendation Preference Profile
Backend-only feature: model, API, and system-prompt integration exist, but **no frontend UI** built yet.

### Auto-learned preference updates
- `backend/server/integrations/utils/auto_profile.py` derives profile from user history
- Fields: `interests` (from activities + locations), `trip_style` (from lodging), `notes` (frequently visited regions)
- `ChatViewSet.send_message()` invokes `update_auto_preference_profile(request.user)` at method start

### Model
`UserRecommendationPreferenceProfile` (OneToOne -> `CustomUser`): `cuisines`, `interests` (JSONField), `trip_style`, `notes`, timestamps.

### System Prompt Integration
- Single-user: labeled as auto-inferred with structured markdown section
- Shared collections: `get_aggregated_preferences(collection)` appends per-participant preferences
- Missing profiles skipped gracefully

### Frontend Gap
- No settings tab for manual preference editing
- TypeScript type available as `UserRecommendationPreferenceProfile` in `src/lib/types.ts`
- See [Plan: AI travel agent redesign](../../plans/ai-travel-agent-redesign.md#ws2-user-preference-learning)
40
.memory/knowledge/overview.md
Normal file
@@ -0,0 +1,40 @@
# Architecture Overview

## API Proxy Pattern
Frontend never calls Django directly. All API calls go through `src/routes/api/[...path]/+server.ts` → Django at `http://server:8000`. Frontend uses relative URLs like `/api/locations/`.

## AI Chat (Collections → Recommendations)
- AI travel chat is embedded in **Collections → Recommendations** via `AITravelChat.svelte` (no standalone `/chat` route).
- Provider selector loads dynamically from `GET /api/chat/providers/` (backed by `litellm.provider_list` + `CHAT_PROVIDER_CONFIG` in `backend/server/chat/llm_client.py`).
- Supported configured providers: OpenAI, Anthropic, Gemini, Ollama, Groq, Mistral, GitHub Models, OpenRouter, OpenCode Zen (`opencode_zen`, `api_base=https://opencode.ai/zen/v1`, default model `openai/gpt-5-nano`).
- Chat conversations stream via SSE through `/api/chat/conversations/`.
- `ChatViewSet.send_message()` accepts optional context fields (`collection_id`, `collection_name`, `start_date`, `end_date`, `destination`) and appends a `## Trip Context` section to the system prompt when provided. When a `collection_id` is present, it also injects `Itinerary stops:` from `collection.locations` (up to 8 unique stops). See [patterns/chat-and-llm.md](patterns/chat-and-llm.md#multi-stop-context-derivation).
- Chat composer supports per-provider model override (persisted in browser `localStorage` key `voyage_chat_model_prefs`). The DB-saved default provider/model (`UserAISettings`) is authoritative on initialization; localStorage is a write-only sync target. Backend `send_message` accepts an optional `model` param and falls back to DB defaults → instance defaults → `"openai"`.
- Invalid required-argument tool calls are detected and short-circuited: the stream terminates with a `tool_validation_error` SSE event + `[DONE]`, and invalid tool results are not replayed into conversation history. See [patterns/chat-and-llm.md](patterns/chat-and-llm.md#tool-call-error-handling-chat-loop-hardening).
- LiteLLM errors are mapped to sanitized user-safe messages via `_safe_error_payload()` (never exposes raw exception text). See [patterns/chat-and-llm.md](patterns/chat-and-llm.md#sanitized-llm-error-mapping).
- Frontend type: `ChatProviderCatalogEntry` in `src/lib/types.ts`.
- Reference: [Plan: AI travel agent](../plans/ai-travel-agent-collections-integration.md), [Plan: AI travel agent redesign — WS4](../plans/ai-travel-agent-redesign.md#ws4-collection-level-chat-improvements)

## Services (Docker Compose)
| Service | Container | Port |
|---------|-----------|------|
| Frontend | `web` | :8015 |
| Backend | `server` | :8016 |
| Database | `db` | :5432 |
| Cache | `cache` | internal |

## Authentication
Session-based via `django-allauth`. CSRF tokens from `/auth/csrf/`, passed as the `X-CSRFToken` header. Mobile clients use the `X-Session-Token` header.

## Key File Locations
- Frontend source: `frontend/src/`
- Backend source: `backend/server/`
- Django apps: `adventures/`, `users/`, `worldtravel/`, `integrations/`, `achievements/`, `chat/`
- Chat LLM config: `backend/server/chat/llm_client.py` (`CHAT_PROVIDER_CONFIG`)
- AI Chat component: `frontend/src/lib/components/AITravelChat.svelte`
- Types: `frontend/src/lib/types.ts`
- API proxy: `frontend/src/routes/api/[...path]/+server.ts`
- i18n: `frontend/src/locales/`
- Docker config: `docker-compose.yml`, `docker-compose.dev.yml`
- CI/CD: `.github/workflows/`
- Public docs: `documentation/` (VitePress)
133
.memory/knowledge/patterns/chat-and-llm.md
Normal file
@@ -0,0 +1,133 @@
# Chat & LLM Patterns

## Default AI Settings & Model Override

### DB-backed defaults (authoritative)
- **Model**: `UserAISettings` (OneToOneField, `integrations/models.py`) stores `preferred_provider` and `preferred_model` per user.
- **Endpoint**: `GET/POST /api/integrations/ai-settings/` — upsert pattern (OneToOneField + `perform_create` update-or-create).
- **Settings UI**: `settings/+page.svelte` loads/saves default provider and model. Provider dropdown filtered to configured providers; model dropdown from `GET /api/chat/providers/{provider}/models/`.
- **Chat initialization**: `AITravelChat.svelte` `loadUserAISettings()` fetches saved defaults on mount and applies them as the authoritative initial provider/model. Direction is DB → localStorage (not the reverse).
- **Backend fallback precedence** in `send_message()`:
  1. Explicit request payload (`provider`, `model`)
  2. `UserAISettings.preferred_provider` / `preferred_model` (only when provider matches)
  3. Instance defaults (`VOYAGE_AI_PROVIDER`, `VOYAGE_AI_MODEL`)
  4. `"openai"` hardcoded fallback
- **Cross-provider guard**: `preferred_model` only applied when resolved provider == `preferred_provider` (prevents e.g. `gpt-5-nano` leaking to Anthropic).
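The precedence chain and cross-provider guard can be sketched as a pure function (illustrative name and signature; the real resolution lives inside `send_message()` and also consults `CHAT_PROVIDER_CONFIG` defaults):

```python
def resolve_provider_and_model(request_provider, request_model,
                               user_settings, instance_provider, instance_model):
    """Sketch of the fallback precedence: request payload -> saved DB
    defaults -> instance defaults -> hardcoded 'openai'.
    user_settings is (preferred_provider, preferred_model) or None."""
    preferred_provider, preferred_model = user_settings or (None, None)

    provider = (request_provider or preferred_provider
                or instance_provider or "openai")

    # Cross-provider guard: a saved model only applies when the resolved
    # provider matches the provider it was saved for.
    model = request_model
    if model is None and provider == preferred_provider:
        model = preferred_model
    if model is None and provider == instance_provider:
        model = instance_model
    return provider, model
```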
### Per-session model override (browser-only)
- **Frontend**: model dropdown next to the provider selector, populated by `GET /api/chat/providers/{provider}/models/`.
- **Persistence**: `localStorage` key `voyage_chat_model_prefs` — written on selection, but never overrides DB defaults on initialization (DB wins).
- **Compatibility guard**: `_is_model_override_compatible()` validates the model prefix for standard providers; skips the check for `api_base` gateways (e.g. `opencode_zen`).
- **i18n keys**: `chat.model_label`, `chat.model_placeholder`, `default_ai_settings_title`, `default_ai_settings_desc`, `default_ai_save`, `default_ai_settings_saved`, `default_ai_settings_error`, `default_ai_provider_required`, `default_ai_no_providers`.
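A minimal sketch of the compatibility guard, assuming the prefix rule is "either a bare model ID, or a `provider/model` ID whose prefix matches the selected provider" (the actual rule in `_is_model_override_compatible()` may differ):

```python
def is_model_override_compatible(provider, model, provider_config):
    """Gateways that define an api_base accept any model ID; standard
    providers require the model's prefix to belong to that provider."""
    config = provider_config.get(provider, {})
    if config.get("api_base"):
        return True  # custom gateway (e.g. opencode_zen): skip prefix check
    # Standard LiteLLM routing: "provider/model" or a bare model ID.
    return "/" not in model or model.split("/", 1)[0] == provider
```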
## Sanitized LLM Error Mapping
- `_safe_error_payload()` in `backend/server/chat/llm_client.py` maps LiteLLM exception classes to hardcoded user-safe strings with an `error_category` field.
- Exception classes mapped: `NotFoundError` -> "model not found", `AuthenticationError` -> "authentication", `RateLimitError` -> "rate limit", `BadRequestError` -> "bad request", `Timeout` -> "timeout", `APIConnectionError` -> "connection".
- Raw `exc.message`, `str(exc)`, and `exc.args` are **never** forwarded to the client. Server-side `logger.exception()` logs full details.
- Uses `getattr(litellm.exceptions, "ClassName", tuple())` for resilient class lookup.
- Security guardrail from critic gate: [decisions.md](../../decisions.md#critic-gate-opencode-zen-connection-error-fix).
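The sanitization contract can be illustrated with a self-contained sketch (stand-in exception classes and messages; the real code looks classes up on `litellm.exceptions` and logs from inside the `except` block):

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical stand-ins for litellm.exceptions classes.
class AuthenticationError(Exception): pass
class RateLimitError(Exception): pass

# Only hardcoded, user-safe strings ever leave the server.
SAFE_MESSAGES = [
    (AuthenticationError, "authentication",
     "The provider rejected the configured API key."),
    (RateLimitError, "rate_limit",
     "The provider is rate limiting requests. Try again shortly."),
]

def safe_error_payload(exc):
    """Log the full exception server-side; return only a hardcoded
    message and category to the client."""
    logger.error("LLM call failed", exc_info=exc)  # traceback stays server-side
    for exc_class, category, message in SAFE_MESSAGES:
        if isinstance(exc, exc_class):
            return {"error": message, "error_category": category}
    return {"error": "The AI provider returned an error.",
            "error_category": "unknown"}
```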
## Tool Call Error Handling (Chat Loop Hardening)
- **Required-arg detection**: `_is_required_param_tool_error()` matches tool results containing `"is required"` / `"are required"` patterns via regex. Detects errors like `"location is required"`, `"query is required"`, `"collection_id, name, latitude, and longitude are required"`.
- **Short-circuit on invalid tool calls**: When a tool call returns a required-param error, `send_message()` yields an SSE error event with `error_category: "tool_validation_error"` and immediately terminates the stream with `[DONE]`. No further LLM turns are attempted.
- **Persistence skip**: Invalid tool call results (and the tool_call entry itself) are NOT persisted to the database, preventing replay into future conversation turns.
- **Historical cleanup**: `_build_llm_messages()` filters persisted tool-role messages containing required-param errors AND trims the corresponding assistant `tool_calls` array to only IDs that have non-filtered tool messages. Empty `tool_calls` arrays are omitted entirely.
- **Multi-tool partial success**: When the model returns N tool calls and call K fails, calls 1..K-1 (the successful prefix) are persisted normally. Only the failed call and subsequent calls are dropped.
- **Tool iteration guard**: `MAX_TOOL_ITERATIONS = 10` with a correctly incremented counter bounds loops caused by other error classes.
- **Known gap**: the `get_weather` error `"dates must be a non-empty list"` does not match the required-arg regex, so it does not trigger the short-circuit — mitigated by `MAX_TOOL_ITERATIONS`.
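The detection regex and the historical-cleanup trimming can be sketched together (illustrative names; the real helpers are `_is_required_param_tool_error()` and the filtering inside `_build_llm_messages()`):

```python
import re

# Matches tool results like "location is required" or
# "collection_id, name, latitude, and longitude are required".
_REQUIRED_PARAM_RE = re.compile(r"\b(?:is|are) required\b")

def is_required_param_tool_error(tool_result):
    return bool(_REQUIRED_PARAM_RE.search(tool_result or ""))

def trim_orphaned_tool_calls(assistant_msg, tool_msgs):
    """Keep only assistant tool_calls whose result message survived the
    error filtering; drop the tool_calls key entirely when none survive."""
    valid_ids = {
        m.get("tool_call_id") for m in tool_msgs
        if not is_required_param_tool_error(m.get("content", ""))
    }
    kept = [tc for tc in assistant_msg.get("tool_calls", [])
            if (tc or {}).get("id") in valid_ids]
    trimmed = {k: v for k, v in assistant_msg.items() if k != "tool_calls"}
    if kept:
        trimmed["tool_calls"] = kept
    return trimmed
```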
## OpenCode Zen Provider
- Provider ID: `opencode_zen`
- `api_base`: `https://opencode.ai/zen/v1`
- Default model: `openai/gpt-5-nano` (changed from `openai/gpt-4o-mini`, which was invalid on Zen)
- GPT models on Zen use the `/chat/completions` endpoint (OpenAI-compatible)
- The LiteLLM `openai/` prefix routes through the OpenAI client to the custom `api_base`
- Model dropdown exposes 5 curated options (reasoning models excluded). See [decisions.md](../../decisions.md#critic-gate-travel-agent-context--models-follow-up).
## Multi-Stop Context Derivation
Chat context derives from the **full collection itinerary**, not just the first location.

### Frontend - `deriveCollectionDestination()`
- Located in `frontend/src/routes/collections/[id]/+page.svelte`.
- Extracts unique city/country pairs from `collection.locations`.
- Capped at 4 stops, semicolon-joined, with a `+N more` overflow suffix.
- Passed to `AITravelChat` as the `destination` prop.
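A Python sketch of the derivation above (the real helper is Svelte/TypeScript; the exact joining and overflow wording are assumptions beyond "capped at 4, semicolon-joined, `+N more` suffix"):

```python
def derive_collection_destination(locations):
    """Unique city/country stops from collection.locations, capped at 4,
    semicolon-joined, with a '+N more' overflow suffix."""
    stops, seen = [], set()
    for loc in locations:
        parts = [p for p in (loc.get("city"), loc.get("country")) if p]
        stop = ", ".join(parts)
        if stop and stop not in seen:
            seen.add(stop)
            stops.append(stop)
    shown = stops[:4]
    label = "; ".join(shown)
    overflow = len(stops) - len(shown)
    if overflow > 0:
        label += f" +{overflow} more"
    return label
```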
### Backend - `send_message()` itinerary enrichment
- `backend/server/chat/views/__init__.py` `send_message()` reads `collection.locations` and injects `Itinerary stops:` into the system prompt's `## Trip Context` section.
- Up to 8 unique stops; deduplication and blank-entry filtering applied.

### System prompt - trip-level reasoning
- `get_system_prompt()` includes guidance to treat collection chats as itinerary-wide and to call `get_trip_details` before `search_places`.
## Itinerary-Centric Quick Prompts
- Quick-action buttons use `promptTripContext` (reactive: `collectionName || destination || ''`) instead of raw `destination`.
- Guard changed from `{#if destination}` to `{#if promptTripContext}`.
- Prompt wording uses `across my ${promptTripContext} itinerary?`.
## search_places Tool Output Key Convention
- Backend `agent_tools.py` `search_places()` returns `{"location": ..., "category": ..., "results": [...]}`.
- Frontend must use the `.results` key (not `.places`).
- **Historical bug**: prior code used `.places`, causing place cards to never render. Fixed 2026-03-09.
## Agent Tools Architecture

### Registered Tools
| Tool name | Purpose | Required params |
|---|---|---|
| `search_places` | Nominatim geocode -> Overpass PoI search | `location` |
| `web_search` | DuckDuckGo web search for current travel info | `query` |
| `list_trips` | List user's collections | (none) |
| `get_trip_details` | Full collection detail with itinerary | `collection_id` |
| `add_to_itinerary` | Create Location + CollectionItineraryItem | `collection_id`, `name`, `lat`, `lon` |
| `get_weather` | Open-Meteo archive + forecast | `latitude`, `longitude`, `dates` |

### Registry pattern
- `@agent_tool(name, description, parameters)` decorator registers function references and generates OpenAI/LiteLLM-compatible tool schemas.
- `execute_tool(tool_name, user, **kwargs)` resolves from the registry and filters kwargs via `inspect.signature(...)`.
- Extensibility: adding a new tool only requires defining a decorated function.
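The registry pattern can be sketched end to end (illustrative registry shape and example tool; only the decorator signature, `execute_tool` contract, and signature-based kwarg filtering come from the notes above):

```python
import inspect

TOOL_REGISTRY = {}

def agent_tool(name, description, parameters):
    """Register the function and an OpenAI/LiteLLM-compatible schema."""
    def wrap(fn):
        TOOL_REGISTRY[name] = {
            "fn": fn,
            "schema": {
                "type": "function",
                "function": {"name": name, "description": description,
                             "parameters": parameters},
            },
        }
        return fn
    return wrap

def execute_tool(tool_name, user, **kwargs):
    """Resolve from the registry; drop kwargs the function does not
    accept, mirroring the inspect.signature(...) filtering."""
    entry = TOOL_REGISTRY.get(tool_name)
    if entry is None:
        return {"error": f"Unknown tool: {tool_name}"}
    accepted = set(inspect.signature(entry["fn"]).parameters)
    filtered = {k: v for k, v in kwargs.items() if k in accepted}
    return entry["fn"](user, **filtered)

@agent_tool("echo_location", "Echo a location back",
            {"type": "object",
             "properties": {"location": {"type": "string"}},
             "required": ["location"]})
def echo_location(user, location=None):
    if not location:
        return {"error": "location is required"}  # never raise
    return {"location": location}
```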
### Function signature convention
All tool functions: `def tool_name(user, **kwargs) -> dict`. Return `{"error": "..."}` on failure; never raise.

### Web Search Tool
- Uses `duckduckgo_search.DDGS().text(..., max_results=5)`.
- Error handling includes an import fallback, a rate-limit guard, and generic failure logging.
- Dependency: `duckduckgo-search>=4.0.0` in `backend/server/requirements.txt`.
## Backend Chat Endpoint Architecture

### URL Routing
- `backend/server/main/urls.py`: `path("api/chat/", include("chat.urls"))`
- `backend/server/chat/urls.py`: DRF `DefaultRouter` registers `conversations/` -> `ChatViewSet`, `providers/` -> `ChatProviderCatalogViewSet`
- Manual paths: `POST /api/chat/suggestions/day/` -> `DaySuggestionsView`, `GET /api/chat/capabilities/` -> `CapabilitiesView`

### ChatViewSet Pattern
- All actions: `permission_classes = [IsAuthenticated]`
- Streaming response uses `StreamingHttpResponse(content_type="text/event-stream")`
- SSE chunk format: `data: {json}\n\n`; terminal `data: [DONE]\n\n`
- Tool loop: up to `MAX_TOOL_ITERATIONS = 10` rounds
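The SSE framing contract is small enough to sketch directly (the generator stands in for the real streaming view body that feeds `StreamingHttpResponse`):

```python
import json

def sse_chunk(payload):
    """Format one SSE data frame: 'data: {json}' followed by a blank line."""
    return f"data: {json.dumps(payload)}\n\n"

SSE_DONE = "data: [DONE]\n\n"

def stream_events(events):
    """JSON chunks while streaming, then the terminal [DONE] frame."""
    for event in events:
        yield sse_chunk(event)
    yield SSE_DONE
```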
### Day Suggestions Endpoint
- `POST /api/chat/suggestions/day/` via `chat/views/day_suggestions.py`
- Non-streaming JSON response
- Inputs: `collection_id`, `date`, `category`, `filters`, `location_context`
- Provider/model resolution via `_resolve_provider_and_model()`: request payload → `UserAISettings` defaults → instance defaults (`VOYAGE_AI_PROVIDER`/`VOYAGE_AI_MODEL`) → provider config default. No hardcoded OpenAI fallback.
- Cross-provider model guard: `preferred_model` only applied when provider matches `preferred_provider`.
- LLM call via `litellm.completion` with a regex JSON-extraction fallback
- Suggestion normalization: frontend `normalizeSuggestionItem()` handles LLM response variants (title/place_name/venue, summary/details, address/neighborhood). Items without a resolvable name are dropped.
- Add-to-itinerary: `buildLocationPayload()` constructs a `LocationSerializer`-compatible payload (name/location/description/rating/collections/is_public) from the normalized suggestion.
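The normalization contract can be sketched in Python (the real `normalizeSuggestionItem()` is frontend TypeScript; the output field names here are illustrative, while the variant coalescing and the drop-on-no-name rule come from the notes above):

```python
def normalize_suggestion_item(raw):
    """Coalesce LLM response variants into one shape; drop items with
    no resolvable name."""
    name = raw.get("title") or raw.get("place_name") or raw.get("venue")
    if not name:
        return None  # unusable suggestion: no resolvable name
    return {
        "name": name,
        "description": raw.get("summary") or raw.get("details") or "",
        "area": raw.get("address") or raw.get("neighborhood") or "",
    }
```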
### Capabilities Endpoint
- `GET /api/chat/capabilities/` returns `{ "tools": [{ "name", "description" }, ...] }` from the registry

## WS4-F4 Chat UI Rendering
- Travel-themed header (icon: airplane; title: `Travel Assistant` with optional collection-name suffix)
- `ChatMessage` type supports `tool_results?: Array<{ name, result }>` for inline tool output
- SSE handling appends to the current assistant message's `tool_results` array
- Renderer: `search_places` -> place cards, `web_search` -> linked cards, fallback -> JSON `<pre>`

## WS4-F3 Add-to-itinerary from Chat
- `search_places` card results can be added directly to the itinerary when collection context exists
- Flow: date selector modal -> `POST /api/locations/` -> `POST /api/itineraries/` -> `itemAdded` event
- Coordinate guard (`hasPlaceCoordinates`) required
65
.memory/knowledge/tech-stack.md
Normal file
@@ -0,0 +1,65 @@
# Tech Stack & Development

## Stack
- **Frontend**: SvelteKit 2, TypeScript, Bun (package manager), DaisyUI + Tailwind CSS, svelte-i18n, svelte-maplibre
- **Backend**: Django REST Framework, Python, django-allauth, djmoney, django-geojson, LiteLLM, duckduckgo-search
- **Database**: PostgreSQL + PostGIS
- **Cache**: Memcached
- **Infrastructure**: Docker, Docker Compose
- **Repo**: github.com/Alex-Wiesner/voyage
- **License**: GNU GPL v3.0

## Development Commands

### Frontend (prefer Bun)
- `cd frontend && bun run format` — fix formatting (6s)
- `cd frontend && bun run lint` — check formatting (6s)
- `cd frontend && bun run check` — Svelte type checking (12s; 0 errors, 6 warnings expected)
- `cd frontend && bun run build` — build (32s)
- `cd frontend && bun install` — install deps (45s)

### Backend (Docker required; uv for local Python tooling)
- `docker compose exec server python3 manage.py test` — run tests (7s; 6/30 pre-existing failures expected)
- `docker compose exec server python3 manage.py migrate` — run migrations

### Pre-Commit Checklist
1. `cd frontend && bun run format`
2. `cd frontend && bun run lint`
3. `cd frontend && bun run check`
4. `cd frontend && bun run build`

## Environment & Configuration

### .env Loading
- **Library**: `python-dotenv==1.2.2` (in `backend/server/requirements.txt`)
- **Entry point**: `backend/server/main/settings.py` calls `load_dotenv()` at module top
- **Docker**: `docker-compose.yml` sets `env_file: .env` on all services — single root `.env` file shared
- **Root `.env`**: `/home/alex/projects/voyage/.env` — canonical for Docker Compose setups

### Settings File
- **Single file**: `backend/server/main/settings.py` (no split/environment-specific settings files)

### Server-side Env Vars (from `settings.py`)
| Var | Default | Purpose |
|---|---|---|
| `SECRET_KEY` | (required) | Django secret key |
| `GOOGLE_MAPS_API_KEY` | `""` | Google Maps integration |
| `STRAVA_CLIENT_ID` / `STRAVA_CLIENT_SECRET` | `""` | Strava OAuth |
| `FIELD_ENCRYPTION_KEY` | `""` | Fernet key for `UserAPIKey` encryption |
| `OSRM_BASE_URL` | `"https://router.project-osrm.org"` | Routing service |
| `VOYAGE_AI_PROVIDER` | `"openai"` | Instance-level default AI provider |
| `VOYAGE_AI_MODEL` | `"gpt-4o-mini"` | Instance-level default AI model |
| `VOYAGE_AI_API_KEY` | `""` | Instance-level AI API key |

### Per-User LLM API Key Pattern
LLM provider keys are stored per-user in the DB (`UserAPIKey` model, `integrations/models.py`):
- `UserAPIKey` table: `(user, provider)` unique pair → `encrypted_api_key` (Fernet-encrypted text field)
- `FIELD_ENCRYPTION_KEY` env var required for encrypt/decrypt
- `llm_client.get_llm_api_key(user, provider)` → user key → instance key fallback (matching provider only) → `None`
- No global server-side LLM API keys — every user must configure their own per-provider key via the Settings UI (or an instance admin configures the fallback)
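The key-fallback chain above can be sketched with the DB lookup replaced by a plain dict (the real `get_llm_api_key(user, provider)` queries `UserAPIKey` and reads `VOYAGE_AI_PROVIDER`/`VOYAGE_AI_API_KEY` from settings; this signature is illustrative):

```python
def get_llm_api_key(user_keys, provider, instance_provider, instance_key):
    """user key -> instance key (matching provider only) -> None."""
    key = user_keys.get(provider)
    if key:
        return key
    if provider == instance_provider and instance_key:
        return instance_key
    return None
```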
## Known Issues
- Docker dev setup has frontend-backend communication issues (500 errors beyond the homepage)
- Frontend check: 0 errors, 6 warnings expected (pre-existing in `CollectionRecommendationView.svelte` + `RegionCard.svelte`)
- Backend tests: 6/30 pre-existing failures (2 user email key errors + 4 geocoding API mocks)
- Local Python pip install fails (network timeouts) — use Docker
83
.memory/manifest.yaml
Normal file
83
.memory/manifest.yaml
Normal file
@@ -0,0 +1,83 @@
|
||||
name: voyage
version: 1
categories:
  - path: system.md
    description: One-paragraph project overview — purpose, stack, status
    group: knowledge

  - path: knowledge/overview.md
    description: Architecture overview — API proxy, AI chat, services, auth, file locations
    group: knowledge

  - path: knowledge/tech-stack.md
    description: Stack details, dev commands, environment config, env vars, known issues
    group: knowledge

  - path: knowledge/conventions.md
    description: Coding conventions — frontend/backend patterns, workflow rules
    group: knowledge

  - path: knowledge/patterns/chat-and-llm.md
    description: Chat & LLM patterns — model override, error mapping, agent tools, OpenCode Zen, context derivation
    group: knowledge

  - path: knowledge/domain/collections-and-sharing.md
    description: Collections domain — sharing architecture, itinerary planner, user preferences
    group: knowledge

  - path: knowledge/domain/ai-configuration.md
    description: AI configuration domain — WS1 infrastructure, provider catalog, frontend gaps
    group: knowledge

  - path: decisions.md
    description: Architecture Decision Records — fork rationale, tooling choices, review/test verdicts, critic gates
    group: decisions

  - path: plans/ai-travel-agent-collections-integration.md
    description: Plan for AI travel chat embedding in Collections + provider catalog (original integration)
    group: plans

  - path: plans/opencode-zen-connection-error.md
    description: Plan for OpenCode Zen connection fix — model default change, error surfacing, model selection UI
    group: plans

  - path: plans/ai-travel-agent-redesign.md
    description: Plan for AI travel agent redesign — WS1-WS6 workstreams (config, preferences, suggestions, chat UI, web search, extensibility)
    group: plans

  - path: plans/travel-agent-context-and-models.md
    description: "Plan for follow-up fixes: F1 model dropdown expansion, F2 multi-stop context, F3 itinerary-centric prompts + .results key fix (COMPLETE)"
    group: plans

  - path: plans/pre-release-and-memory-migration.md
    description: Plan for pre-release policy addition + .memory structure migration
    group: plans

  - path: research/litellm-zen-provider-catalog.md
    description: Research on LiteLLM provider catalog and OpenCode Zen API compatibility
    group: research

  - path: research/opencode-zen-connection-debug.md
    description: Debug findings for OpenCode Zen connection errors and model routing
    group: research

  - path: research/auto-learn-preference-signals.md
    description: Research on auto-learning user travel preference signals from history data
    group: research

  - path: research/provider-strategy.md
    description: "Research: multi-provider strategy — LiteLLM hardening vs replacement, retry/fallback patterns"
    group: research

  - path: sessions/continuity.md
    description: Rolling session continuity notes — last session context, active work
    group: sessions

  - path: plans/chat-provider-fixes.md
    description: "Chat provider fixes plan (COMPLETE) — chat-loop-hardening, default-ai-settings, suggestion-add-flow workstreams with full review/test records"
    group: plans

  # Deprecated (content migrated)
  - path: knowledge.md
    description: "DEPRECATED — migrated to knowledge/ nested structure. See knowledge/ files."
    group: knowledge
0 .memory/plans/.gitkeep Normal file

108 .memory/plans/ai-travel-agent-collections-integration.md Normal file
@@ -0,0 +1,108 @@
# Plan: AI travel agent in Collections Recommendations

## Clarified requirements

- Move AI travel agent UX from standalone `/chat` tab/page into Collections → Recommendations.
- Remove the existing `/chat` route (not keep/redirect).
- Provider list should be dynamic and display all providers LiteLLM supports.
- Ensure OpenCode Zen is supported as a provider.

## Execution prerequisites

- In each worktree, run `cd frontend && npm install` before implementation to ensure node modules (including `@mdi/js`) are present and the baseline build can run.

## Decomposition (approved by user)

### Workstream 1 — Collections recommendations chat integration (Frontend + route cleanup)

- **Worktree**: `.worktrees/collections-ai-agent`
- **Branch**: `feat/collections-ai-agent`
- **Risk**: Medium
- **Quality tier**: Tier 2
- **Task WS1-F1**: Embed AI chat experience inside Collections Recommendations UI.
  - **Acceptance criteria**:
    - Chat UI is available from Collections Recommendations section.
    - Existing recommendations functionality remains usable.
    - Chat interactions continue to work with existing backend chat APIs.
- **Task WS1-F2**: Remove standalone `/chat` route/page.
  - **Acceptance criteria**:
    - `/chat` page is removed from app routes/navigation.
    - No broken imports/navigation links remain.

### Workstream 2 — Provider catalog + Zen provider support (Backend + frontend settings/chat)

- **Worktree**: `.worktrees/litellm-provider-catalog`
- **Branch**: `feat/litellm-provider-catalog`
- **Risk**: Medium
- **Quality tier**: Tier 2 (promote to Tier 1 if auth/secret handling changes)
- **Task WS2-F1**: Implement dynamic provider listing based on LiteLLM-supported providers.
  - **Acceptance criteria**:
    - Backend exposes `GET /api/chat/providers/` using LiteLLM runtime provider list as source data.
    - Frontend provider selectors consume backend provider catalog rather than hardcoded arrays.
    - UI displays all LiteLLM provider IDs and metadata; non-chat-compatible providers are labeled unavailable.
    - Existing saved provider/API-key flows still function.
- **Task WS2-F2**: Add/confirm OpenCode Zen provider support end-to-end.
  - **Acceptance criteria**:
    - OpenCode Zen appears as provider id `opencode_zen`.
    - Backend model resolution and API-key lookup work for `opencode_zen`.
    - Zen calls use LiteLLM OpenAI-compatible routing with `api_base=https://opencode.ai/zen/v1`.
    - Chat requests using Zen provider are accepted without fallback/validation failures.

## Provider architecture decision

- Backend provider catalog endpoint `GET /api/chat/providers/` is the single source of truth for UI provider options.
- Endpoint response fields: `id`, `label`, `available_for_chat`, `needs_api_key`, `default_model`, `api_base`.
- All LiteLLM runtime providers are returned; entries without model mapping are `available_for_chat=false`.
- Chat send path only accepts providers where `available_for_chat=true`.
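As a rough illustration of that decision, the catalog entries might be assembled like this — a stdlib-only sketch where the stubbed `PROVIDER_LIST` and `CHAT_PROVIDER_CONFIG` contents are assumptions standing in for `litellm.provider_list` and the real config:

```python
# Stand-in for litellm.provider_list (runtime enumeration of provider ids).
PROVIDER_LIST = ["openai", "anthropic", "some_non_chat_provider"]

# Stand-in for CHAT_PROVIDER_CONFIG in backend/server/chat/llm_client.py:
# only providers with a model mapping are chat-capable.
CHAT_PROVIDER_CONFIG = {
    "openai": {"label": "OpenAI", "default_model": "gpt-4o-mini", "api_base": None},
    "opencode_zen": {
        "label": "OpenCode Zen",
        "default_model": "openai/gpt-4o-mini",
        "api_base": "https://opencode.ai/zen/v1",
    },
}

def get_provider_catalog() -> list[dict]:
    # Start from the runtime provider list, then append configured chat
    # providers (like opencode_zen) that LiteLLM does not enumerate natively.
    ids = list(PROVIDER_LIST)
    ids += [p for p in CHAT_PROVIDER_CONFIG if p not in ids]
    catalog = []
    for pid in ids:
        cfg = CHAT_PROVIDER_CONFIG.get(pid)
        catalog.append({
            "id": pid,
            "label": cfg["label"] if cfg else pid.replace("_", " ").title(),
            "available_for_chat": cfg is not None,
            "needs_api_key": True,
            "default_model": cfg["default_model"] if cfg else None,
            "api_base": cfg["api_base"] if cfg else None,
        })
    return catalog
```

Only entries present in the config end up `available_for_chat=true`, which is what lets the send path reject everything else.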
## Research findings (2026-03-08)

- LiteLLM provider enumeration is available at runtime (`litellm.provider_list`), currently 128 providers in this environment.
- OpenCode Zen is not a native LiteLLM provider alias; support should be implemented via OpenAI-compatible provider config and explicit `api_base`.
- Existing hardcoded provider duplication (backend + chat page + settings page) will be replaced by backend catalog consumption.
- Reference: [LiteLLM + Zen provider research](../research/litellm-zen-provider-catalog.md)

## Dependencies

- WS1 depends on existing chat API endpoint behavior and event streaming contract.
- WS2 depends on LiteLLM provider metadata/query capabilities and provider-catalog endpoint design.
- WS1-F1 depends on WS2 completion for dynamic provider selector integration.
- WS1-F2 depends on WS1-F1 completion.

## Human checkpoints

- No checkpoint required: Zen support path uses existing LiteLLM dependency via OpenAI-compatible API (no new SDK/service).

## Findings tracker

- WS1-F1 implemented in worktree `.worktrees/collections-ai-agent`:
  - Extracted chat route UI into reusable component `frontend/src/lib/components/AITravelChat.svelte`, preserving conversation list, message stream rendering, provider selector, conversation CRUD, and SSE send-message flow via `/api/chat/conversations/*`.
  - Updated `frontend/src/routes/chat/+page.svelte` to render the reusable component so existing `/chat` behavior remains intact for WS1-F1 scope (WS1-F2 route removal deferred).
  - Embedded `AITravelChat` into Collections Recommendations view in `frontend/src/routes/collections/[id]/+page.svelte` above `CollectionRecommendationView`, keeping existing recommendation search/map/create flows unchanged.
  - Reviewer warning resolved: removed redundant outer card wrapper around `AITravelChat` in Collections Recommendations embedding, eliminating nested card-in-card styling while preserving spacing and recommendations placement.
- WS1-F2 implemented in worktree `.worktrees/collections-ai-agent`:
  - Removed standalone chat route page by deleting `frontend/src/routes/chat/+page.svelte`.
  - Removed `/chat` navigation item from `frontend/src/lib/components/Navbar.svelte`, including the now-unused `mdiRobotOutline` icon import.
  - Verified embedded chat remains in Collections Recommendations via `AITravelChat` usage in `frontend/src/routes/collections/[id]/+page.svelte`; no remaining `/chat` route links/imports in `frontend/src`.
- WS2-F1 implemented in worktree `.worktrees/litellm-provider-catalog`:
  - Added backend provider catalog endpoint `GET /api/chat/providers/` from `litellm.provider_list` with response fields `id`, `label`, `available_for_chat`, `needs_api_key`, `default_model`, `api_base`.
  - Refactored chat provider model map into `CHAT_PROVIDER_CONFIG` in `backend/server/chat/llm_client.py` and reused it for both send-message routing and provider catalog metadata.
  - Updated chat/settings frontend provider consumers to fetch provider catalog dynamically and removed hardcoded provider arrays.
  - Chat UI now restricts provider selection/sending to `available_for_chat=true`; settings API key UI now lists full provider catalog (including unavailable-for-chat entries).
- WS2-F1 reviewer carry-forward fixes applied:
  - Fixed chat provider selection fallback timing in `frontend/src/routes/chat/+page.svelte` by computing `availableProviders` from local `catalog` response data instead of relying on reactive `chatProviders` immediately after assignment.
  - Applied low-risk settings improvement in `frontend/src/routes/settings/+page.svelte` by changing `await loadProviderCatalog()` to `void loadProviderCatalog()` in the second `onMount`, preventing provider fetch from delaying success toast logic.
- WS2-F2 implemented in worktree `.worktrees/litellm-provider-catalog`:
  - Added `opencode_zen` to `CHAT_PROVIDER_CONFIG` in `backend/server/chat/llm_client.py` with label `OpenCode Zen`, `needs_api_key=true`, `default_model=openai/gpt-4o-mini`, and `api_base=https://opencode.ai/zen/v1`.
  - Updated `get_provider_catalog()` to append configured chat providers not present in `litellm.provider_list`, ensuring OpenCode Zen appears in `GET /api/chat/providers/` even though it is an OpenAI-compatible alias rather than a native LiteLLM provider id.
  - Normalized provider IDs in `get_llm_api_key()` and `stream_chat_completion()` via `_normalize_provider_id()` to keep API-key lookup and LLM request routing consistent for `opencode_zen`.
- Consolidation completed in worktree `.worktrees/collections-ai-agent`:
  - Ported WS2 provider-catalog backend to `backend/server/chat` in the collections branch, including `GET /api/chat/providers/`, `CHAT_PROVIDER_CONFIG` metadata fields (`label`, `needs_api_key`, `default_model`, `api_base`), and chat-send validation to allow only `available_for_chat` providers.
  - Confirmed `opencode_zen` support in consolidated branch with `label=OpenCode Zen`, `default_model=openai/gpt-4o-mini`, `api_base=https://opencode.ai/zen/v1`, and API-key-required behavior.
  - Replaced hardcoded providers in `frontend/src/lib/components/AITravelChat.svelte` with dynamic `/api/chat/providers/` loading, preserving send guard to chat-available providers only.
  - Updated settings API-key provider dropdown in `frontend/src/routes/settings/+page.svelte` to load full provider catalog dynamically and added `ChatProviderCatalogEntry` type in `frontend/src/lib/types.ts`.
  - Preserved existing collections chat embedding and kept standalone `/chat` route removed (no route reintroduction in consolidation changes).

## Retry tracker

- WS1-F1: 0
- WS1-F2: 0
- WS2-F1: 0
- WS2-F2: 0

## Execution checklist

- [x] WS2-F1 Dynamic provider listing from LiteLLM (Tier 2)
- [x] WS2-F2 OpenCode Zen provider support (Tier 2)
- [x] WS1-F1 Embed AI chat into Collections Recommendations (Tier 2)
- [x] WS1-F2 Remove standalone `/chat` route (Tier 2)
- [x] Documentation coverage + knowledge sync (Librarian)
338 .memory/plans/ai-travel-agent-redesign.md Normal file
@@ -0,0 +1,338 @@
# AI Travel Agent Redesign Plan

## Vision Summary

Redesign the AI travel agent with two context-aware entry points, user preference learning, flexible provider configuration, extensibility for future integrations, web search capability, and multi-user collection support.

---

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────────┐
│                          ENTRY POINTS                           │
├─────────────────────────────┬───────────────────────────────────┤
│  Day-Level Suggestions      │  Collection-Level Chat            │
│  (new modal)                │  (improved Recommendations tab)   │
│  - Category filters         │  - Context-aware                  │
│  - Sub-filters              │  - Add to itinerary actions       │
│  - Add to day action        │                                   │
└─────────────────────────────┴───────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                          AGENT CORE                             │
├─────────────────────────────────────────────────────────────────┤
│  - LiteLLM backend (streaming SSE)                              │
│  - Tool calling (place search, web search, itinerary actions)   │
│  - Multi-user preference aggregation                            │
│  - Context injection (collection, dates, location)              │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                      CONFIGURATION LAYERS                       │
├─────────────────────────────────────────────────────────────────┤
│  Instance (.env)  → VOYAGE_AI_PROVIDER                          │
│                   → VOYAGE_AI_MODEL                             │
│                   → VOYAGE_AI_API_KEY                           │
│  User (DB)        → UserAPIKey.per-provider keys                │
│                   → UserAISettings.model preference             │
│  Fallback: User key → Instance key → Error                      │
└─────────────────────────────────────────────────────────────────┘
```

---

## Workstreams

### WS1: Configuration Infrastructure

**Goal**: Support both instance-level and user-level provider/model configuration with proper fallback.

#### WS1-F1: Instance-level configuration
- Add env vars to `settings.py`:
  - `VOYAGE_AI_PROVIDER` (default: `openai`)
  - `VOYAGE_AI_MODEL` (default: `gpt-4o-mini`)
  - `VOYAGE_AI_API_KEY` (optional global key)
- Update `llm_client.py` to read instance defaults
- Add fallback chain: user key → instance key → error

#### WS1-F2: User-level model preferences
- Add `UserAISettings` model (OneToOne → CustomUser):
  - `preferred_provider` (CharField)
  - `preferred_model` (CharField)
- Create API endpoint: `POST /api/ai/settings/`
- Add UI in Settings → AI section for model selection

#### WS1-F3: Provider catalog enhancement
- Extend provider catalog response to include:
  - `instance_configured`: bool (has instance key)
  - `user_configured`: bool (has user key)
- Update frontend to show configuration status per provider

**Files**: `settings.py`, `llm_client.py`, `integrations/models.py`, `integrations/views/`, `frontend/src/routes/settings/`

---

### WS2: User Preference Learning

**Goal**: Capture and use user preferences in AI recommendations.

#### WS2-F1: Preference UI
- Add "AI Preferences" tab to Settings page
- Form fields: cuisines, interests, trip_style, notes
- Use tag input for cuisines/interests (better UX than free text)
- Connect to existing `/api/integrations/recommendation-preferences/`

Implementation notes (2026-03-08):
- Implemented in `frontend/src/routes/settings/+page.svelte` as `travel_preferences` section in the existing settings sidebar, with `savePreferences(event)` posting to `/api/integrations/recommendation-preferences/`.
- `interests` conversion is string↔array at the UI boundary: load via `(profile.interests || []).join(', ')`; save via `.split(',').map((s) => s.trim()).filter(Boolean)`.
- SSR preload added in `frontend/src/routes/settings/+page.server.ts` using parallel fetch with API keys; returns `props.recommendationProfile` as first list element or `null`.
- Frontend typing added in `frontend/src/lib/types.ts` (`UserRecommendationPreferenceProfile`) and i18n strings added under `settings` in `frontend/src/locales/en.json`.
- See backend capability reference in [Project Knowledge — User Recommendation Preference Profile](../knowledge.md#user-recommendation-preference-profile).

#### WS2-F2: Preference injection
- Enhance `get_system_prompt()` to format preferences better
- Add preference summary in system prompt (structured, not just appended)

#### WS2-F3: Multi-user aggregation
- New function: `get_aggregated_preferences(collection)`
- Returns combined preferences from all `collection.shared_with` users + owner
- Format: "Party preferences: User A likes X, User B prefers Y..."
- Inject into system prompt for shared collections
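A stdlib-only sketch of what `get_aggregated_preferences(collection)` could look like — the `collection` shape and preference fields here are illustrative stand-ins for the real Django models, not the actual schema:

```python
# Illustrative stand-in for a collection with an owner and shared users,
# each carrying a simple free-text preference summary.
collection = {
    "owner": {"name": "User A", "preferences": "likes sushi and hiking"},
    "shared_with": [
        {"name": "User B", "preferences": "prefers museums, vegetarian food"},
    ],
}

def get_aggregated_preferences(collection: dict) -> str:
    # Combine owner + shared users into one "Party preferences" line,
    # listing every member's preferences transparently (per the
    # "list all preferences" decision) rather than resolving conflicts.
    members = [collection["owner"], *collection["shared_with"]]
    parts = [f"{m['name']} {m['preferences']}" for m in members]
    return "Party preferences: " + "; ".join(parts) + "."

print(get_aggregated_preferences(collection))
# Party preferences: User A likes sushi and hiking; User B prefers museums, vegetarian food.
```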
**Files**: `frontend/src/routes/settings/`, `chat/llm_client.py`, `integrations/models.py`

---

### WS3: Day-Level Suggestions Modal

**Goal**: Add "Suggest" option to itinerary day "Add" dropdown with category filters.

#### WS3-F1: Suggestion modal component
- Create `ItinerarySuggestionModal.svelte`
- Two-step flow:
  1. **Category selection**: Restaurant, Activity, Event, Lodging
  2. **Filter refinement**:
     - Restaurant: cuisine type, price range, dietary restrictions
     - Activity: type (outdoor, cultural, etc.), duration
     - Event: type, date/time preference
     - Lodging: type, amenities
- "Any/Surprise me" option for each filter

#### WS3-F2: Add button integration
- Add "Get AI suggestions" option to `CollectionItineraryPlanner.svelte` Add dropdown
- Opens suggestion modal with target date pre-set
- Modal receives: `collectionId`, `targetDate`, `collectionLocation` (for context)

#### WS3-F3: Suggestion results display
- Show 3-5 suggestions as cards with:
  - Name, description, why it fits preferences
  - "Add to this day" button
  - "Add to different day" option
- On add: **direct REST API call** to `/api/itineraries/` (not agent tool)
- User must approve each item individually - no bulk/auto-add
- Close modal and refresh itinerary on success

#### WS3-F4: Backend suggestion endpoint
- New endpoint: `POST /api/ai/suggestions/day/`
- Params: `collection_id`, `date`, `category`, `filters`, `location_context`
- Returns structured suggestions (not chat, direct JSON)
- Uses agent internally but returns parsed results
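To make the endpoint contract concrete, a hypothetical request/response pair might look like this — only the parameter names come from the plan; the field values and the shape of each `suggestions` item are assumptions:

```python
# Hypothetical POST /api/ai/suggestions/day/ request body (values invented).
request_body = {
    "collection_id": 42,
    "date": "2026-06-14",
    "category": "restaurant",
    "filters": {"cuisine": "italian", "price_range": "$$"},
    "location_context": "Lisbon, Portugal",
}

# Hypothetical structured JSON response: parsed suggestions, not a chat stream.
response_body = {
    "suggestions": [
        {
            "name": "Trattoria Exemplo",
            "description": "Family-run Italian spot near the waterfront.",
            "why_it_fits": "Matches the italian + $$ filters.",
        }
    ]
}

# The request keys are exactly the params the plan names.
assert set(request_body) == {
    "collection_id", "date", "category", "filters", "location_context",
}
```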
**Files**: `CollectionItineraryPlanner.svelte`, `ItinerarySuggestionModal.svelte` (new), `chat/views.py`, `chat/agent_tools.py`

---

### WS3.5: Insertion Flow Clarification

**Two insertion paths exist:**

| Path | Entry Point | Mechanism | Use Case |
|------|-------------|-----------|----------|
| **User-approved** | Suggestions modal | Direct REST API call to `/api/itineraries/` | Day-level suggestions, user reviews and clicks Add |
| **Agent-initiated** | Chat (Recommendations tab) | `add_to_itinerary` tool via SSE streaming | Conversational adds when user says "add that place" |

**Why two paths:**
- Modal: faster, simpler UX - no round-trip through agent, user stays in control
- Chat: natural conversation flow - agent can add as part of dialogue

**No changes needed to agent tools** - `add_to_itinerary` already exists in `agent_tools.py` and works for chat-initiated adds.

---

### WS4: Collection-Level Chat Improvements

**Goal**: Make Recommendations tab chat context-aware and action-capable.

#### WS4-F1: Context injection
- Pass collection context to `AITravelChat.svelte`:
  - `collectionId`, `collectionName`, `startDate`, `endDate`
  - `destination` (from collection locations or user input)
- Inject into system prompt: "You are helping plan a trip to X from Y to Z"

Implementation notes (2026-03-08):
- `frontend/src/lib/components/AITravelChat.svelte` now exposes optional context props (`collectionId`, `collectionName`, `startDate`, `endDate`, `destination`) and includes them in `POST /api/chat/conversations/{id}/send_message/` payload.
- `frontend/src/routes/collections/[id]/+page.svelte` now passes collection context into `AITravelChat`; destination is derived via `deriveCollectionDestination(...)` from `city/country/location/name` on the first usable location.
- `backend/server/chat/views/__init__.py::ChatViewSet.send_message()` now accepts the same optional fields, resolves `collection_id` (owner/shared access only), and appends a `## Trip Context` block to the system prompt before streaming.
- Related architecture note: [Project Knowledge — AI Chat](../knowledge.md#ai-chat-collections--recommendations).
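The `## Trip Context` injection described in those notes can be sketched as follows — a minimal stdlib-only illustration where the block heading matches the notes but the field labels and formatting are assumptions:

```python
def build_trip_context(collection_name, start_date, end_date, destination):
    # Build a structured trip-context block to append to the system prompt,
    # so the model knows which trip it is helping plan.
    lines = ["## Trip Context", f"Collection: {collection_name}"]
    if destination:
        lines.append(f"Destination: {destination}")
    if start_date and end_date:
        lines.append(f"Dates: {start_date} to {end_date}")
    return "\n".join(lines)

base_prompt = "You are a helpful travel assistant."
system_prompt = base_prompt + "\n\n" + build_trip_context(
    "Summer in Portugal", "2026-06-10", "2026-06-20", "Lisbon"
)
print(system_prompt)
```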
#### WS4-F2: Quick action buttons
- Add preset prompts above chat input:
  - "Suggest restaurants for this trip"
  - "Find activities near [destination]"
  - "What should I pack for [dates]?"
- Pre-fill input on click

#### WS4-F3: Add-to-itinerary from chat
- When agent suggests a place, show "Add to itinerary" button
- User selects date → calls `add_to_itinerary` tool
- Visual feedback on success

Implementation notes (2026-03-08):
- Implemented in `frontend/src/lib/components/AITravelChat.svelte` as an MVP direct frontend flow (no agent round-trip):
  - Adds `Add to Itinerary` button to `search_places` result cards when `collectionId` exists.
  - Opens a date picker modal (`showDateSelector`, `selectedPlaceToAdd`, `selectedDate`) constrained by trip date range (`min={startDate}`, `max={endDate}`).
  - On confirm, creates a location via `POST /api/locations/` then creates an itinerary entry via `POST /api/itineraries/`.
  - Dispatches `itemAdded { locationId, date }` and shows success toast (`added_successfully`).
  - Guards against missing/invalid coordinates by disabling the add action unless lat/lon parse successfully.
- i18n keys added in `frontend/src/locales/en.json`: `add_to_itinerary`, `add_to_which_day`, `added_successfully`.

#### WS4-F4: Improved UI
- Remove generic "robot" branding, use travel-themed design
- Show collection name in header
- Better tool result display (cards instead of raw JSON)

Implementation notes (2026-03-08):
- `frontend/src/lib/components/AITravelChat.svelte` header now uses travel branding with `✈️` and renders `Travel Assistant · {collectionName}` when collection context is present; destination is shown as a subtitle when provided.
- Robot icon usage in chat UI was replaced with travel-themed emoji (`✈️`, `🌍`, `🗺️`) while keeping existing layout structure.
- SSE `tool_result` chunks are now attached to the in-flight assistant message via `tool_results` and rendered inline as structured cards for `search_places` and `web_search`, with JSON `<pre>` fallback for unknown tools.
- Legacy persisted `role: 'tool'` messages are still supported via JSON parsing fallback and use the same card rendering logic.
- i18n root keys added in `frontend/src/locales/en.json`: `travel_assistant`, `quick_actions`.

See [Project Knowledge — WS4-F4 Chat UI Rendering](../knowledge.md#ws4-f4-chat-ui-rendering).

**Files**: `AITravelChat.svelte`, `chat/views.py`, `chat/llm_client.py`

---

### WS5: Web Search Capability

**Goal**: Enable agent to search the web for current information.

#### WS5-F1: Web search tool
- Add `web_search` tool to `agent_tools.py`:
  - Uses DuckDuckGo (free, no API key) or Brave Search API (env var)
  - Returns top 5 results with titles, snippets, URLs
- Tool schema:

```python
{
    "name": "web_search",
    "description": "Search the web for current information about destinations, events, prices, etc.",
    "parameters": {
        "query": "string - search query",
        "location_context": "string - optional location to bias results",
    },
}
```

#### WS5-F2: Tool integration
- Register in `AGENT_TOOLS` list
- Add to `execute_tool()` dispatcher
- Handle rate limiting gracefully

**Files**: `chat/agent_tools.py`, `requirements.txt` (add `duckduckgo-search`)

---

### WS6: Extensibility Architecture

**Goal**: Design for easy addition of future integrations.

#### WS6-F1: Plugin tool registry
- Refactor `agent_tools.py` to use decorator-based registration:

```python
@agent_tool(name="web_search", description="...")
def web_search(query: str, location_context: str = None):
    ...
```

- Tools auto-register on import
- Easy to add new tools in separate files
#### WS6-F2: Integration hooks
- Create `chat/integrations/` directory for future:
  - `tripadvisor.py` - TripAdvisor API integration
  - `flights.py` - Flight search (Skyscanner, etc.)
  - `weather.py` - Enhanced weather data
- Each integration exports tools via decorator

#### WS6-F3: Capability discovery
- Endpoint: `GET /api/ai/capabilities/`
- Returns list of available tools/integrations
- Frontend can show "Powered by X, Y, Z" dynamically

**Files**: `chat/tools/` (new directory), `chat/agent_tools.py` (refactor)

---

## File Changes Summary

### New Files
- `frontend/src/lib/components/collections/ItinerarySuggestionModal.svelte`
- `backend/server/chat/tools/__init__.py`
- `backend/server/chat/tools/web_search.py`
- `backend/server/integrations/models.py` (add `UserAISettings`)
- `backend/server/integrations/views/ai_settings_view.py`

### Modified Files
- `backend/server/main/settings.py` - Add AI env vars
- `backend/server/chat/llm_client.py` - Config fallback, preference aggregation
- `backend/server/chat/views.py` - New suggestion endpoint, context injection
- `backend/server/chat/agent_tools.py` - Web search tool, refactor
- `frontend/src/lib/components/AITravelChat.svelte` - Context awareness, actions
- `frontend/src/lib/components/collections/CollectionItineraryPlanner.svelte` - Add button
- `frontend/src/routes/settings/+page.svelte` - AI preferences UI, model selection
- `frontend/src/routes/collections/[id]/+page.svelte` - Pass collection context

---

## Migration Path

1. **Phase 1 - Foundation** (WS1, WS2)
   - Configuration infrastructure
   - Preference UI
   - No user-facing changes to chat yet

2. **Phase 2 - Day Suggestions** (WS3)
   - New modal, new entry point
   - Backend suggestion endpoint
   - Can ship independently

3. **Phase 3 - Chat Improvements** (WS4, WS5)
   - Context-aware chat
   - Web search capability
   - Better UX

4. **Phase 4 - Extensibility** (WS6)
   - Plugin architecture
   - Future integration prep

---

## Decisions (Confirmed)

| Decision | Choice | Rationale |
|----------|--------|-----------|
| Web search provider | **DuckDuckGo** | Free, no API key, good enough for travel info |
| Suggestion API | **Dedicated REST endpoint** | Simpler, faster, returns JSON directly |
| Multi-user conflicts | **List all preferences** | Transparency - AI navigates differing preferences |

---

## Out of Scope

- WSGI→ASGI migration (keep current async-in-sync pattern)
- Role-based permissions (all shared users have same access)
- Real-time collaboration (WebSocket sync)
- Mobile-specific optimizations
248 .memory/plans/chat-provider-fixes.md Normal file
@@ -0,0 +1,248 @@
# Chat Provider Fixes

## Problem Statement

The AI chat feature is broken with multiple issues:

1. Rate limit errors from providers
2. "location is required" errors (tool calling issue)
3. "An unexpected error occurred while fetching trip details" errors
4. Models not being fetched properly for all providers
5. Potential authentication issues

## Root Cause Analysis

### Issue 1: Tool Calling Errors

The errors "location is required" and "An unexpected error occurred while fetching trip details" come from the agent tools (`search_places`, `get_trip_details`) being called with missing/invalid parameters. This suggests:

- The LLM is not properly understanding the tool schemas
- Or the model doesn't support function calling well
- Or there's a mismatch between how LiteLLM formats tools and what the model expects

### Issue 2: Models Not Fetched

The `models` endpoint in `ChatProviderCatalogViewSet` only handles:

- `openai` - uses the OpenAI SDK to fetch the live model list
- `anthropic`/`claude` - hardcoded list
- `gemini`/`google` - hardcoded list
- `groq` - hardcoded list
- `ollama` - calls the local API
- `opencode_zen` - hardcoded list

All other providers return `{"models": []}`.
### Issue 3: Authentication Flow

1. Frontend sends request with `credentials: 'include'`
2. Backend gets the user from the session
3. `get_llm_api_key()` checks the `UserAPIKey` model for the user's key
4. Falls back to `settings.VOYAGE_AI_API_KEY` if the user has no key and the provider matches the instance default
5. Key is passed to LiteLLM's `acompletion()`

Potential issues:

- Encryption key not configured correctly
- Key not being passed correctly to LiteLLM
- Provider-specific auth headers not being set

### Issue 4: LiteLLM vs Alternatives

Current approach (LiteLLM):

- Single library handles all providers
- Normalizes API calls across providers
- Built-in error handling and retries (if configured)

Alternative (Vercel AI SDK):

- Provider registry pattern with individual packages
- More explicit provider configuration
- Better TypeScript support
- But would require significant refactoring (backend is Python)

## Investigation Tasks

- [ ] Test the actual API calls to verify authentication
- [x] Check if the models endpoint returns correct data
- [x] Verify tool schemas are being passed correctly
- [ ] Test with a known-working model (e.g., GPT-4o)

## Options

### Option A: Fix LiteLLM Integration (Recommended)

1. Add proper retry logic with `num_retries=2`
2. Add a `supports_function_calling()` check before using tools
3. Expand the models endpoint to handle more providers
4. Add better logging for debugging

### Option B: Replace LiteLLM with a Custom Implementation

1. Use direct API calls per provider
2. More control, but more maintenance
3. Significant development effort

### Option C: Hybrid Approach

1. Keep LiteLLM for providers it handles well
2. Add custom handlers for problematic providers
3. Medium effort, best of both worlds

## Status

### Completed (2026-03-09)

- [x] Implemented backend fixes for Option A:
  1. `ChatProviderCatalogViewSet.models()` now fetches OpenCode Zen models dynamically from `{api_base}/models` using the configured provider API base and user API key; returns deduplicated model ids and logs fetch failures.
  2. `stream_chat_completion()` now checks `litellm.supports_function_calling(model=resolved_model)` before sending tools and disables tools with a warning if unsupported.
  3. Added LiteLLM transient-retry configuration via `num_retries=2` on streaming completions.
  4. Added request/error logging for provider/model/tool usage and API base/message-count diagnostics.
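The deduplication in item 1 can be sketched as an order-preserving filter over the ids returned by `{api_base}/models`. The `{"data": [...]}` payload shape is an assumption about the OpenAI-compatible gateway response, not confirmed by this plan:

```python
# Sketch: extract and deduplicate model ids from an OpenAI-compatible
# /models response, preserving first-seen order. The {"data": [...]}
# shape is an assumption about the gateway's payload.
def dedupe_model_ids(payload: dict) -> list[str]:
    seen: set[str] = set()
    out: list[str] = []
    for entry in payload.get("data", []):
        model_id = (entry.get("id") or "").strip()
        if model_id and model_id not in seen:
            seen.add(model_id)
            out.append(model_id)
    return out
```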

### Verification Results

- Models endpoint: returns 36 models from the OpenCode Zen API (was 5 hardcoded)
- Function-calling check: gpt-5-nano=True, claude-sonnet-4-6=True, big-pickle=False, minimax-m2.5=False
- Syntax check: passed for both modified files
- Frontend check: 0 errors, 6 warnings (pre-existing)

### Remaining Issues (User Action Required)

- Rate limits: the free tier has limits; the user may need to upgrade or wait
- Tool calling: some models (big-pickle, minimax-m2.5) don't support function calling - tools will be disabled for these models

## Follow-up Fixes (2026-03-09)

### Clarified Behavior

- Approved preference precedence: the database-saved default provider/model beats any per-device `localStorage` override.
- Requirement: user AI preferences must be persisted through the existing `UserAISettings` backend API and applied by both the settings UI and the chat send-message fallback logic.

### Planned Workstreams

- [x] `chat-loop-hardening`
  - Acceptance: invalid required-argument tool calls do not loop repeatedly, tool-error messages are not replayed back into the model history, and SSE streams terminate cleanly with a user-visible error or `[DONE]`.
  - Files: `backend/server/chat/views/__init__.py`, `backend/server/chat/agent_tools.py`, optional `backend/server/chat/llm_client.py`
  - Notes: preserve successful tool flows; stop feeding `{"error": "location is required"}` / `{"error": "query is required"}` back into the next model turn.
  - Completion (2026-03-09): added required-argument tool-error detection in the `send_message()` streaming loop, short-circuited those tool failures with a user-visible SSE error + terminal `[DONE]`, skipped persistence/replay of those invalid tool payloads (including historical cleanup in `_build_llm_messages()`), and tightened the `search_places`/`web_search` tool descriptions to explicitly call out required non-empty args.
  - Follow-up (2026-03-09): fixed multi-tool-call consistency by persisting/replaying only the successful prefix of `tool_calls` when a later call fails required-arg validation; `_build_llm_messages()` now trims assistant `tool_calls` to only the IDs that have kept (non-filtered) persisted tool messages.
  - Review verdict (2026-03-09): **APPROVED** (score 6). Two WARNINGs: (1) multi-tool-call orphan - when the model returns N tool calls and call K fails required-param validation, calls 1..K-1 are already persisted but call K's result is not, leaving an orphaned `tool_calls` reference in the assistant message that may cause LLM API errors on the next conversation turn; (2) `_build_llm_messages` filters tool-role error messages but does not filter/trim the corresponding assistant-message `tool_calls` array, creating the same orphan on historical replay. Both are low-likelihood (multi-tool required-param failures are rare) and degrade gracefully (next-turn errors are caught by `_safe_error_payload`). One SUGGESTION: the `get_weather` error `"dates must be a non-empty list"` does not match the `is/are required` regex and would not trigger the short-circuit (mitigated by the `MAX_TOOL_ITERATIONS` guard). Also confirms a prior pre-existing bug (`tool_iterations` never incremented) is now fixed in this changeset.
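A minimal sketch of the required-argument detection this workstream adds. The exact production pattern in `_is_required_param_tool_error` is not shown in this plan; the regex below is an assumption constructed to reproduce the documented match/non-match examples:

```python
import re

# Sketch: detect tool results like {"error": "location is required"} so
# the streaming loop can short-circuit instead of replaying them to the
# model. The pattern is an assumption matching the documented examples;
# fullmatch keeps crafted suffixes (e.g. "; rm -rf /") from matching.
_REQUIRED_RE = re.compile(
    r"[A-Za-z0-9_]+(?:(?:,? and |, )[A-Za-z0-9_]+)* (?:is|are) required"
)

def is_required_param_tool_error(result) -> bool:
    if not isinstance(result, dict):
        return False
    error = result.get("error")
    if not isinstance(error, str):
        return False
    return _REQUIRED_RE.fullmatch(error.strip()) is not None
```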

- [x] `default-ai-settings`
  - Acceptance: the settings page shows default AI provider/model controls, saving persists via `UserAISettings`, the chat UI initializes from saved preferences, and the backend chat fallback uses saved defaults when the request payload omits provider/model.
  - Files: `frontend/src/routes/settings/+page.server.ts`, `frontend/src/routes/settings/+page.svelte`, `frontend/src/lib/types.ts`, `frontend/src/lib/components/AITravelChat.svelte`, `backend/server/chat/views/__init__.py`
  - Notes: DB-saved defaults override browser-local model prefs.

### Completion Note (2026-03-09)

- Implemented DB-backed default AI settings end-to-end: the settings page now loads/saves `UserAISettings` via `/api/integrations/ai-settings/`, with provider/model selectors powered by the provider catalog + per-provider models endpoint.
- Chat initialization now treats saved DB defaults as the authoritative initial provider/model; stale `voyage_chat_model_prefs` localStorage values no longer override defaults and are synchronized to the saved defaults.
- Backend `send_message` now uses saved `UserAISettings` only when the request payload omits provider/model, preserving explicit request values and existing provider-validation behavior.
- Follow-up fix: the backend model fallback now applies `preferred_model` only when the resolved provider matches `preferred_provider`, preventing cross-provider default-model mismatches when users explicitly choose another provider.
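The fallback order and the cross-provider guard described in this note can be sketched as a pure resolution function (argument names mirror the document; the flat signature is a stand-in for the request payload and the `UserAISettings` row):

```python
# Sketch of send_message provider/model resolution: explicit request
# values win; the saved preferred_model is applied only when the
# resolved provider equals preferred_provider (cross-provider guard).
def resolve_provider_and_model(requested_provider, requested_model,
                               preferred_provider, preferred_model,
                               fallback_provider="openai"):
    provider = requested_provider or preferred_provider or fallback_provider
    model = requested_model
    if model is None and preferred_model and provider == preferred_provider:
        model = preferred_model
    return provider, model
```

With this shape, explicitly choosing provider B never inherits a model saved for provider A; the model simply stays unset and the provider's own default applies downstream.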

- [x] `suggestion-add-flow`
  - Acceptance: day suggestions use the user-configured/default provider/model instead of hardcoded OpenAI values, and adding a suggested place creates a location plus an itinerary entry successfully.
  - Files: `backend/server/chat/views/day_suggestions.py`, `frontend/src/lib/components/collections/ItinerarySuggestionModal.svelte`
  - Notes: normalize the suggestion payloads needed by `/api/locations/` and preserve the existing add-item event wiring.
  - Completion (2026-03-09): day suggestions now resolve provider/model in precedence order (request payload → `UserAISettings` defaults → instance/provider defaults) without OpenAI hardcoding; the modal now normalizes suggestion objects and builds stable `/api/locations/` payloads (name/location/description/rating) before dispatching the existing `addItem` flow.
  - Follow-up (2026-03-09): removed the remaining OpenAI-specific `gpt-4o-mini` fallback from the day-suggestions LLM call; the endpoint now uses the provider-resolved/default model only and fails safely when no model is configured.
  - Follow-up (2026-03-09): removed the unsupported `temperature` parameter from day-suggestions requests, normalized bare `opencode_zen` model ids through the gateway (`openai/<model>`), and switched day-suggestions error responses to the same sanitized categories used by chat. Browser result: the suggestion modal now completes normally (empty state or rate-limit message) instead of crashing with a generic 500.

## Tester Validation — `default-ai-settings` (2026-03-09)

### STATUS: PASS

**Evidence from lead:** authenticated POST `/api/integrations/ai-settings/` returned 200 and persisted; a subsequent GET returned the same values; POST `/api/chat/conversations/{id}/send_message/` with no provider/model in the body used `preferred_provider='opencode_zen'` and `preferred_model='gpt-5-nano'` from the DB, producing a valid SSE stream.

**Standard pass findings:**

- The `UserAISettings` model, serializer, and `UserAISettingsViewSet` are correct. Upsert logic in `perform_create` handles first-write and update-in-place correctly (single row per user via a OneToOneField).
- `list()` returns `[serializer.data]` (a wrapped array), which the frontend reads as `settings[0]`; the contract matches.
- Backend `send_message` precedence: `requested_provider` → `preferred_provider` (if available) → `"openai"` fallback. `model` only inherits `preferred_model` when `provider == preferred_provider`; the cross-provider default mismatch is correctly prevented (follow-up fix confirmed).
- The settings page initializes `defaultAiProvider`/`defaultAiModel` from the SSR-loaded `aiSettings` and validates against the provider catalog in `onMount`. If the saved provider is no longer configured, it falls back to the first configured provider.
- `AITravelChat.svelte` fetches AI settings on mount, applies them as the authoritative default, and writes them to `localStorage` (sync direction is DB → localStorage, not the reverse).
- The frontend `send_message` handler always sends the current UI `selectedProvider`/`selectedModel`, not localStorage values directly; those are only used for UI-state initialization, not to bypass DB defaults.
- All i18n keys present in `en.json`: `default_ai_settings_title`, `default_ai_settings_desc`, `default_ai_no_providers`, `default_ai_save`, `default_ai_settings_saved`, `default_ai_settings_error`, `default_ai_provider_required`.
- Django integration tests (5/5) pass; no tests exist for `UserAISettings` specifically — residual regression risk noted.

**Adversarial pass findings (no hypothesis uncovered a bug):**

1. **Hypothesis: a model saved for provider A is silently applied when the user explicitly sends provider B (cross-provider model leak).** Checked `send_message` lines 218–220: `model = requested_model; if model is None and preferred_model and provider == preferred_provider: model = preferred_model`. When `requested_provider=B` and `preferred_provider=A`, `provider == preferred_provider` is false → `model` stays `None`. **Not vulnerable.**

2. **Hypothesis: a null/empty `preferred_model` or `preferred_provider` in the DB triggers an error.** The serializer allows `null` on both fields (CharField with `blank=True, null=True`). The backend normalizes with `.strip().lower()` inside the `(ai_settings.preferred_provider or "").strip().lower()` guard. The frontend uses `?? ''` coercion. **Handled safely.**

3. **Hypothesis: a second POST to `/api/integrations/ai-settings/` creates a second row instead of updating.** `UserAISettings` uses `OneToOneField(user, ...)` and `perform_create` explicitly fetches and updates the existing row. A second POST cannot produce a duplicate. **Not vulnerable.**

4. **Hypothesis: `initializeDefaultAiSettings` silently overwrites the saved DB provider with the first catalog provider if the saved provider is temporarily unavailable (e.g., API key deleted).** Confirmed: lines 119–121 do silently auto-select the first available provider and blank the model if the saved provider is gone. This affects display only (not the DB); the save action is still explicit. **Acceptable behavior; low risk.**

5. **Hypothesis: the frontend sends `model: undefined` (vs `model: null`) when no model is selected, causing the backend to ignore it.** `requested_model = (request.data.get("model") or "").strip() or None` — if `undefined`/absent from the JSON body, `get("model")` returns `None`, which stays `None` after the guard; `model` falls through to the default logic. **Works correctly.**

**MUTATION_ESCAPES: 1/8** — the regex `(is|are) required` in `_is_required_param_tool_error` (chat-loop-hardening code) would escape if a future required-arg error used a different pattern, but this is unrelated to the `default-ai-settings` scope.

**Zero automated test coverage for `UserAISettings` CRUD + precedence logic.** The backend logic is covered only by the lead's live-run evidence. Recommended follow-up: add a Django TestCase covering (a) upsert idempotency, (b) provider/model precedence in `send_message`, (c) the cross-provider model guard.

## Tester Validation — `chat-loop-hardening` (2026-03-09)

### STATUS: PASS

**Evidence from lead (runtime):** an authenticated POST to `send_message` with a patched upstream stream emitting `search_places {}` (missing required `location`) returned status 200, with SSE body `data: {"tool_calls": [...]}` → `data: {"error": "...", "error_category": "tool_validation_error"}` → `data: [DONE]`. Persisted DB state after that turn: only `('user', None, 'restaurants please')` + `('assistant', None, '')` — no invalid `role=tool` error row.

**Standard pass findings:**

- `_is_required_param_tool_error`: correctly matches `location is required`, `query is required`, `collection_id is required`, `collection_id, name, latitude, and longitude are required`, and `latitude and longitude are required`. Does NOT match non-required-arg errors (`dates must be a non-empty list`, `Trip not found`, `Unknown tool: foo`, etc.). All 18 test cases pass.
- `_is_required_param_tool_error_message_content`: correctly parses JSON-wrapped content from persisted DB rows and delegates to the above. Handles non-JSON, non-dict JSON, and `error: null` safely. All 7 test cases pass.
- Orphan trimming in `_build_llm_messages`: when the assistant has `tool_calls=[A, B]` and B's persisted tool row contains a required-param error, the rebuilt `assistant.tool_calls` retains only `[A]` and tool B's row is filtered. Verified for both the multi-tool case and the single-tool (lead's runtime) scenario.
- The SSE stream terminates with `data: [DONE]` immediately after the `tool_validation_error` event — confirmed by the code path at lines 425–426, which `return`s the generator.
- `MAX_TOOL_ITERATIONS = 10` correctly set; the `tool_iterations` counter is incremented on each tool iteration (pre-existing bug confirmed fixed).
- `_merge_tool_call_delta` handles `None`, `[]`, a missing `index`, and malformed argument JSON without crashing.
- Full Django test suite: 24/30 pass; 6/30 fail (all pre-existing: 2 user email key errors + 4 geocoding API mock errors). Zero regressions introduced by this changeset.
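The orphan-trimming step validated above can be sketched as a pure rebuild over one assistant message (the message shape follows the OpenAI-style tool-call format; the production code in `_build_llm_messages` may differ):

```python
# Sketch: trim an assistant message's tool_calls to only the IDs whose
# tool-result rows were kept (i.e., not filtered as required-param
# errors), so no orphaned tool_call id is replayed to the model.
def trim_orphan_tool_calls(assistant_msg: dict, kept_tool_ids: set) -> dict:
    rebuilt = {k: v for k, v in assistant_msg.items() if k != "tool_calls"}
    filtered = [tc for tc in assistant_msg.get("tool_calls") or []
                if tc.get("id") in kept_tool_ids]
    if filtered:  # omit the key entirely when nothing survives
        rebuilt["tool_calls"] = filtered
    return rebuilt
```

Omitting the key (rather than sending an empty list) matters because some provider APIs reject assistant messages carrying `tool_calls: []`.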

**Adversarial pass findings:**

1. **Hypothesis: `get_weather` with empty `dates=[]` bypasses the short-circuit and loops.** `get_weather` returns `{"error": "dates must be a non-empty list"}`, which does NOT match the `is/are required` regex → not short-circuited. Falls through to the `MAX_TOOL_ITERATIONS` guard (10 iterations max). **Known gap, mitigated by the guard; matches the reviewer WARNING.**

2. **Hypothesis: regex injection via crafted error text creates a false-positive short-circuit.** Tested `'x is required; rm -rf /'` (the semicolon breaks `fullmatch`), newline injection, and a Cyrillic lookalike. All correctly return `False`. **Not vulnerable.**

3. **Hypothesis: `assistant.tool_calls=[]` (empty list) pollutes rebuilt messages.** `filtered_tool_calls` is `[]` → the `if filtered_tool_calls:` guard prevents an empty `tool_calls` key from being added to the payload. **Not vulnerable.**

4. **Hypothesis: a tool message with `content = None` is incorrectly classified as a required-param error.** `_is_required_param_tool_error_message_content(None)` returns `False` (not a string → returns early). **Not vulnerable.**

5. **Hypothesis: `_build_required_param_error_event` crashes on a None/missing `result`.** `result.get("error")` is guarded by `if isinstance(result, dict)` in the caller; the static method itself handles a `None` result via an `isinstance` check and produces `error=""`. **No crash.**

6. **Hypothesis: in the multi-tool scenario, the `tool_calls` prefix is trimmed incorrectly.** Tested an assistant with `[A, B]` where A succeeds and B fails: rebuilt messages contain `tool_calls=[A]` only. Tested an assistant with only `[X]` failing: rebuilt messages contain no `tool_calls` key at all. **Both correct.**

**MUTATION_ESCAPES: 1/7** — `get_weather` returning `"dates must be a non-empty list"` does not trigger the short-circuit. This is a known, reviewed, accepted gap (mitigated by `MAX_TOOL_ITERATIONS`). No other mutation checks escaped detection.

**FLAKY: 0**

**COVERAGE: N/A** — no automated test suite exists for the `chat` app; all validation is via unit-level method tests + the lead's live-run evidence. Recommended follow-up: add a Django `TestCase` for the `send_message` streaming loop covering (a) a single required-arg tool failure → short-circuit, (b) multi-tool partial success, (c) `MAX_TOOL_ITERATIONS` exhaustion, (d) a `_build_llm_messages` orphan-trimming round trip.

## Tester Validation — `suggestion-add-flow` (2026-03-09)

### STATUS: PASS

**Test run:** 30 Django tests (24 pass, 6 fail — all 6 pre-existing: 2 user email key errors + 4 geocoding mock failures). Zero new regressions. 44 targeted unit-level checks (42 pass, 2 fail — both failures confirmed as test-script defects, not code bugs).

**Standard pass findings:**

- `_resolve_provider_and_model` precedence verified end-to-end: explicit request payload → `UserAISettings.preferred_provider/model` → `settings.VOYAGE_AI_PROVIDER/MODEL` → provider-config default. All 4 precedence levels tested and confirmed correct.
- Cross-provider model guard confirmed: when the request provider ≠ `preferred_provider`, the `preferred_model` is NOT applied (prevents `gpt-5-nano` from leaking to anthropic, etc.).
- Null/empty `preferred_provider`/`preferred_model` in `UserAISettings` handled safely (`or ""` coercion guards throughout).
- JSON parsing in `_get_suggestions_from_llm` is robust: it handles a clean JSON array, JSON embedded in prose, markdown-wrapped JSON, plain text (no JSON), an empty string, and `None` content — all return correct results (an empty list or a parsed list). The response is capped at 5 items, and a single-dict LLM response is correctly wrapped in a list.
- `normalizeSuggestionItem` normalization verified: a non-dict returns `null`, missing name+location returns `null`, and the field aliases (`title`→`name`, `address`→`location`, `summary`→`description`, `score`→`rating`, `whyFits`→`why_fits`) all work. A whitespace-only name falls back to the location.
- `rating=0` is correctly preserved in TypeScript via `??` (nullish coalescing at line 171), not dropped. The Python port of the test used `or`, which drops `0`, but that is a test-script defect only.
- `buildLocationPayload` constructs a valid `LocationSerializer`-compatible payload: `name`, `location`, `description`, `rating`, `collections`, `is_public`. It falls back to the collection location when the suggestion has none.
- `handleAddSuggestion` → POST `/api/locations/` → `dispatch('addItem', {type:'location', itemId, updateDate:false})` wiring confirmed by code inspection (lines 274–294). The parent `CollectionItineraryPlanner` handler at line 2626 calls `addItineraryItemForObject`.
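A Python sketch of the normalization rules above (the real implementation is TypeScript in `ItinerarySuggestionModal.svelte`; field aliases follow the documented mapping, and None-aware coalescing stands in for `??` so `rating=0` survives):

```python
# Sketch of normalizeSuggestionItem: alias mapping, name/location
# fallbacks, and None-aware coalescing so a rating of 0 is preserved
# (mirroring TypeScript's ?? rather than a falsy || check).
def normalize_suggestion_item(item):
    if not isinstance(item, dict):
        return None

    def first_not_none(*vals):
        return next((v for v in vals if v is not None), None)

    name = (first_not_none(item.get("name"), item.get("title")) or "").strip()
    location = (first_not_none(item.get("location"), item.get("address")) or "").strip()
    if not name and not location:
        return None
    return {
        "name": name or location,  # whitespace-only name falls back to location
        "location": location,
        "description": first_not_none(item.get("description"), item.get("summary")),
        "rating": first_not_none(item.get("rating"), item.get("score")),
        "why_fits": first_not_none(item.get("why_fits"), item.get("whyFits")),
    }
```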

**Adversarial pass findings:**

1. **Hypothesis: cross-provider model leak (`gpt-5-nano` applied to anthropic).** Tested `request.provider=anthropic` + `UserAISettings.preferred_provider=opencode_zen`, `preferred_model=gpt-5-nano`. Result: `model_from_user_defaults=None` (because `provider != preferred_provider`). **Not vulnerable.**

2. **Hypothesis: null/empty DB prefs cause exceptions.** With `preferred_provider=None` and `preferred_model=None`, all guards use the `(value or "").strip()` pattern and fall through to `settings.VOYAGE_AI_PROVIDER` safely. **Not vulnerable.**

3. **Hypothesis: all-None provider/model/settings causes an exception in `_resolve_provider_and_model`.** Tested with `is_chat_provider_available=False` everywhere and all settings None. Returns `(None, None)` without an exception; the caller checks `is_chat_provider_available(provider)` and returns 503. **Not vulnerable.**

4. **Hypothesis: a missing API key causes a silent empty result instead of an error.** `get_llm_api_key` returns `None` → raises `ValueError("No API key available")` → caught by the `post()` try/except → returns 500. **Explicit error path confirmed.**

5. **Hypothesis: no configured model causes a silent failure.** `model=None` + an empty `provider_config` → raises `ValueError("No model configured for provider")` → 500. **Explicit error path confirmed.**

6. **Hypothesis: `normalizeSuggestionItem` mishandles a mixed array (nulls, strings, invalid dicts).** `[None, {name:'A'}, 'string', {description:'only'}, {name:'B'}]` → after normalize+filter: 2 valid items. **Correct.**

7. **Hypothesis: `rating=0` dropped by a falsy check.** The actual TS uses `item.rating ?? item.score` (nullish coalescing, not `||`). `normalizeRating(0)` returns `0` (finite-number check). **Not vulnerable in the actual code.**

8. **Hypothesis: XSS in the name field.** `<script>alert(1)</script>` passes through as a string; the Django serializer stores it as text, and template rendering escapes it. **Not vulnerable.**

9. **Hypothesis: double-clicking `handleAddSuggestion` creates a duplicate location.** The `isAdding` guard at line 266 exits early if `isAdding` is truthy, preventing re-entrancy. **Protected by a UI-state guard.**

**Known low-severity defect (pre-existing, not introduced by this workstream):** LLM-generated `name`/`location` fields are not truncated before being passed to `LocationSerializer` (max_length=200). If the LLM returns a name > 200 chars, the POST to `/api/locations/` returns 400 and the frontend shows a generic error. The risk is very low in practice (LLM names are short). Recommended fix: add `.slice(0, 200)` in `buildLocationPayload` for the `name` and `location` fields.

**MUTATION_ESCAPES: 1/9** — `rating=0` would escape mutation detection in naive Python tests (but is correctly handled by the actual TS `??` code). No logic mutations escape in the backend Python code.

**FLAKY: 0**

**COVERAGE: N/A** — no automated suite for the `chat` or `suggestions` app. All validation is via unit-level method tests + provider/model resolution checks. Recommended follow-up: add a Django `TestCase` for `DaySuggestionsView.post()` covering (a) missing required fields → 400, (b) invalid category → 400, (c) unauthorized collection → 403, (d) provider unavailable → 503, (e) LLM exception → 500, (f) happy path → 200 with a `suggestions` array.

**Cleanup required:** two test artifact files left on the host (not git-tracked, safe to delete):

- `/home/alex/projects/voyage/test_suggestion_flow.py`
- `/home/alex/projects/voyage/suggestion-modal-error-state.png`

401  .memory/plans/opencode-zen-connection-error.md  Normal file
@@ -0,0 +1,401 @@
# Plan: Fix OpenCode Zen connection errors in AI travel chat

## Clarified requirements

- User configured provider `opencode_zen` in Settings with an API key.
- Chat attempts return a generic connection error.
- Goal: identify the root cause and implement a reliable fix for OpenCode Zen chat connectivity.
- Follow-up: add model selection in the chat composer (instead of a forced default model) and persist the chosen model per user.

## Acceptance criteria

- Sending a chat message with provider `opencode_zen` no longer fails with a connection error caused by Voyage integration/configuration.
- Backend provider routing for `opencode_zen` uses a validated OpenAI-compatible request shape and model format.
- The frontend surfaces backend/provider errors with actionable detail (not only a generic connection failure) when available.
- Validation commands run successfully (or with known project-expected failures only) and the results are recorded.

## Tasks

- [ ] Discovery: inspect the current OpenCode Zen provider configuration and chat request pipeline (Agent: explorer)
- [ ] Discovery: verify OpenCode Zen API compatibility requirements vs the current implementation (Agent: researcher)
- [ ] Discovery: map the model-selection edit points and persistence path (Agent: explorer)
- [x] Implement the fix for the root cause + model selection/persistence (Agent: coder)
- [x] Correctness review of targeted changes (Agent: reviewer) — APPROVED (score 0)
- [x] Standard validation run and targeted chat-path checks (Agent: tester)
- [x] Documentation and knowledge sync for provider troubleshooting notes (Agent: librarian)

## Researcher findings

**Root cause**: two mismatches in `backend/server/chat/llm_client.py` lines 59–64:

1. **Invalid model ID** — `default_model: "openai/gpt-4o-mini"` does not exist on OpenCode Zen. Zen has its own model catalog (gpt-5-nano, glm-5, kimi-k2.5, etc.). Sending `gpt-4o-mini` to the Zen API results in a model-not-found error.
2. **Endpoint routing** — GPT models on Zen use the `/responses` endpoint, but LiteLLM's `openai/` prefix routes through the OpenAI Python client, which appends `/chat/completions`. The `/chat/completions` endpoint only works for OpenAI-compatible models (GLM, Kimi, MiniMax, Qwen, Big Pickle).

**Error flow**: LiteLLM exception → caught by the generic handler at line 274 → yields an `"An error occurred while processing your request"` SSE event → the frontend shows either this message or falls back to `$t('chat.connection_error')`.

**Recommended fix** (primary — `llm_client.py:62`):

- Change `"default_model": "openai/gpt-4o-mini"` → `"openai/gpt-5-nano"` (a free model, confirmed to work via `/chat/completions` by real-world usage in multiple repos)

**Secondary fix** (error surfacing — `llm_client.py:274-276`):

- Extract meaningful error info from LiteLLM exceptions (status_code, message) instead of swallowing all details into a generic message

Full analysis: [research/opencode-zen-connection-debug.md](../research/opencode-zen-connection-debug.md)

## Retry tracker

- OpenCode Zen connection fix task: 0

## Implementation checkpoint (coder)

- Added composer-level model selection + per-provider browser persistence in `frontend/src/lib/components/AITravelChat.svelte` using the localStorage key `voyage_chat_model_prefs`.
- Added `chat.model_label` and `chat.model_placeholder` i18n keys in `frontend/src/locales/en.json`.
- Extended the `send_message` backend intake in `backend/server/chat/views.py` to read an optional `model` (empty → `None`) and pass it to streaming.
- Updated `backend/server/chat/llm_client.py` to:
  - switch the `opencode_zen` default model to `openai/gpt-5-nano`,
  - accept an optional `model` override in `stream_chat_completion(...)`,
  - apply a safe provider/model compatibility guard (skipping the strict prefix check for custom `api_base` gateways),
  - map known LiteLLM exception classes to sanitized user-safe error categories/messages,
  - include the `tools` / `tool_choice` kwargs only when tools are present.
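The exception-to-category mapping in the checkpoint can be sketched as a lookup from exception class to a fixed sanitized payload. The class names mirror exception types LiteLLM exposes, but they are stubbed here so the sketch runs standalone; the category and message strings are illustrative, not the production values:

```python
# Sketch of _safe_error_payload-style mapping: never expose raw
# exception text to the user; map known exception classes to fixed,
# sanitized categories. The classes are stubs standing in for
# LiteLLM's exception types.
class RateLimitError(Exception): ...
class AuthenticationError(Exception): ...
class APIConnectionError(Exception): ...

_SAFE_MESSAGES = {
    RateLimitError: ("rate_limit", "The provider is rate-limiting requests. Try again shortly."),
    AuthenticationError: ("auth_error", "The provider rejected the configured API key."),
    APIConnectionError: ("connection_error", "Could not reach the provider."),
}

def safe_error_payload(exc: Exception) -> dict:
    for exc_type, (category, message) in _SAFE_MESSAGES.items():
        if isinstance(exc, exc_type):
            return {"error": message, "error_category": category}
    # Unknown exceptions collapse to a generic message; the raw
    # exception text is intentionally never included in the payload.
    return {"error": "An error occurred while processing your request",
            "error_category": "unknown_error"}
```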
|
||||
See related analysis in [research notes](../research/opencode-zen-connection-debug.md#model-selection-implementation-map).
|
||||
|
||||
---
|
||||
|
||||
## Explorer findings (model selection)
|
||||
|
||||
**Date**: 2026-03-08
|
||||
**Full detail**: [research/opencode-zen-connection-debug.md — Model selection section](../research/opencode-zen-connection-debug.md#model-selection-implementation-map)
|
||||
|
||||
### Persistence decision: `localStorage` (no migration)
|
||||
|
||||
**Recommended**: store `{ [provider_id]: model_string }` in `localStorage` key `voyage_chat_model_prefs`.
|
||||
|
||||
Rationale:
|
||||
- No existing per-user model preference field anywhere in DB/API
|
||||
- Adding a DB column to `CustomUser` requires a migration + serializer + API change → 4+ files
|
||||
- `UserAPIKey` stores only encrypted API keys (not preferences)
|
||||
- Model preference is UI-volatile (the model catalog changes; stale DB entries require cleanup)
|
||||
- `localStorage` is already used elsewhere in the frontend for similar ephemeral UI state
|
||||
- Model preference is not sensitive; persisting client-side is consistent with how the provider selector already works (no backend persistence either)
|
||||
- **No migration required** for localStorage approach
|
||||
|
||||
### File-by-file edit plan (exact symbols)
|
||||
|
||||
#### Backend: `backend/server/chat/llm_client.py`
|
||||
- `stream_chat_completion(user, messages, provider, tools=None)` → add `model: str | None = None` parameter
|
||||
- Line 226: `"model": provider_config["default_model"]` → `"model": model or provider_config["default_model"]`
|
||||
- Add validation: if `model` is not `None`, check it starts with a valid LiteLLM provider prefix (or matches a known-safe pattern); reject bare model strings that don't include provider prefix
|
||||
|
||||
#### Backend: `backend/server/chat/views.py`
|
||||
- `send_message()` (line 104): extract `model = (request.data.get("model") or "").strip() or None`
|
||||
- Pass `model=model` to `stream_chat_completion()` call (line 144)
|
||||
- Add validation: if `model` is provided, confirm it belongs to the same provider family (prefix check); return 400 if mismatch
|
||||
|
||||
#### Frontend: `frontend/src/lib/types.ts`
|
||||
- No change needed — `ChatProviderCatalogEntry.default_model` already exists
|
||||
|
||||
#### Frontend: `frontend/src/lib/components/AITravelChat.svelte`

- Add `let selectedModel: string = ''` (reset when the provider changes)
- Add reactive: `$: selectedProviderEntry = chatProviders.find(p => p.id === selectedProvider) ?? null`
- Add reactive: `$: { if (selectedProviderEntry) { selectedModel = loadModelPref(selectedProvider) || selectedProviderEntry.default_model || ''; } }`
- `sendMessage()` line 121: body `{ message: msgText, provider: selectedProvider }` → `{ message: msgText, provider: selectedProvider, model: selectedModel }`
- Add a model input field in the composer toolbar (near the provider `<select>`, lines 290-299): `<input type="text" class="input input-bordered input-sm" bind:value={selectedModel} placeholder={selectedProviderEntry?.default_model ?? ''} />`
- Add `loadModelPref(provider)` / `saveModelPref(provider, model)` functions using `localStorage` key `voyage_chat_model_prefs`
- Add `$: saveModelPref(selectedProvider, selectedModel)` reactive to persist on change

#### Frontend: `frontend/src/locales/en.json`

- Add `"chat.model_label"`: `"Model"` (label for the model input)
- Add `"chat.model_placeholder"`: `"Default model"` (placeholder when empty)

### Validation constraints / risks

1. **Model-provider prefix mismatch**: `stream_chat_completion` uses the `provider_config["default_model"]` prefix to route via LiteLLM. If a user passes `openai/gpt-5-nano` for the `anthropic` provider, LiteLLM will try to call OpenAI with Anthropic credentials. The backend must validate that the supplied model string starts with the expected provider prefix, or reject it.
2. **Free-text model field**: There is no enumeration from the backend; the user can type any string. Validation (the prefix check) is the only guard.
3. **localStorage staleness**: If a provider removes a model, the stored preference produces a LiteLLM error — the error-surfacing fix (Fix #2 in the existing plan) makes this diagnosable.
4. **Empty string vs null**: The frontend should send `model: selectedModel || undefined` (omitting the key when empty) to preserve backend default behavior.

### No migration required

All backend changes are parameter additions to existing function signatures plus optional request-field parsing. No DB schema changes.

---

## Explorer findings

**Date**: 2026-03-08
**Detail**: Full trace in [research/opencode-zen-connection-debug.md](../research/opencode-zen-connection-debug.md)

### End-to-end path (summary)

```
AITravelChat.svelte:sendMessage()
POST /api/chat/conversations/<id>/send_message/ { message, provider:"opencode_zen" }
→ +server.ts:handleRequest() [CSRF refresh + proxy, SSE passthrough lines 94-98]
→ views.py:ChatViewSet.send_message() [validates provider, saves user msg]
→ llm_client.py:stream_chat_completion() [builds kwargs, calls litellm.acompletion]
→ litellm.acompletion(model="openai/gpt-4o-mini", api_base="https://opencode.ai/zen/v1")
→ POST https://opencode.ai/zen/v1/chat/completions ← FAILS: model not on Zen
→ except Exception at line 274 → data:{"error":"An error occurred..."}
← frontend shows error string inline (or "Connection error." on network failure)
```

### Ranked root causes confirmed by code trace

1. **[CRITICAL] Wrong default model** (`openai/gpt-4o-mini` is not a Zen model)
   - `backend/server/chat/llm_client.py:62`
   - Fix: change to `"openai/gpt-5-nano"` (free, confirmed OpenAI-compatible via `/chat/completions`)

2. **[SIGNIFICANT] Generic exception handler masks provider errors**
   - `backend/server/chat/llm_client.py:274-276`
   - Bare `except Exception:` swallows LiteLLM's structured exceptions (NotFoundError, AuthenticationError, etc.)
   - Fix: extract `exc.status_code` / `exc.message` and forward to the SSE error payload

3. **[SIGNIFICANT] WSGI + per-request event loop for async LiteLLM**
   - The backend runs **Gunicorn WSGI** (`supervisord.conf:11`); no ASGI entry point exists
   - `views.py:66-76` `_async_to_sync_generator` creates `asyncio.new_event_loop()` per request
   - LiteLLM's httpx sessions may not be compatible with a new loop per call → potential connection errors on the second and later tool iterations
   - Fix: wrap via `asyncio.run()` or migrate to ASGI (uvicorn)

4. **[MINOR] `tool_choice: None` / `tools: None` passed as kwargs when unused**
   - `backend/server/chat/llm_client.py:227-229`
   - Fix: conditionally include the keys only when tools are present

5. **[MINOR] Synchronous ORM call inside async generator**
   - `backend/server/chat/llm_client.py:217` — `get_llm_api_key()` calls `UserAPIKey.objects.get()` synchronously
   - Fine under WSGI with a fresh event loop, but technically incorrect in an async context
   - Fix: wrap with `sync_to_async` or move the key lookup before entering the async boundary

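Root cause 3 can be illustrated with a minimal sketch. The function names below are placeholders, not the Voyage view code; the point is the contrast between a manually managed per-request loop and `asyncio.run()`, which handles loop creation, shutdown, and async-generator cleanup itself.

```python
# Minimal illustration of root cause 3; names are illustrative placeholders.
import asyncio

async def fake_stream() -> str:
    """Stand-in for an async LiteLLM streaming call."""
    return "chunk"

def per_request_loop_call() -> str:
    # Current pattern: a fresh loop created and torn down by hand on every request.
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(fake_stream())
    finally:
        loop.close()

def preferred_call() -> str:
    # Suggested fix: asyncio.run() manages the full loop lifecycle, including
    # shutdown of async generators and the default executor.
    return asyncio.run(fake_stream())
```

Both variants return the same result here; the difference shows up when client libraries cache connections against a loop that is later discarded.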
### Minimal edit points for a fix

| Priority | File | Location | Change |
|---|---|---|---|
| 1 (required) | `backend/server/chat/llm_client.py` | line 62 | `"default_model": "openai/gpt-5-nano"` |
| 2 (recommended) | `backend/server/chat/llm_client.py` | lines 274-276 | Extract `exc.status_code`/`exc.message` for user-facing error |
| 3 (recommended) | `backend/server/chat/llm_client.py` | lines 225-234 | Only include `tools`/`tool_choice` keys when tools are provided |

---

## Critic gate

**VERDICT**: APPROVED
**Date**: 2026-03-08
**Reviewer**: critic agent

### Rationale

The plan is well scoped, targets a verified root cause with clear code references, and all three changes sit in a single file (`llm_client.py`) within the same request path. This is a single coherent bug fix, not a multi-feature plan — no decomposition required.

### Assumption challenges

1. **`gpt-5-nano` validity on Zen** — The researcher claims this model is confirmed via GitHub usage patterns, but there is no live API verification. The risk is mitigated by Fix #2 (error surfacing), which would make any remaining model mismatch immediately diagnosable. **Accepted with guardrail**: the coder must add a code comment noting the model was chosen based on research, and the tester must verify the error path produces a meaningful message if the model is still wrong.

2. **`@mdi/js` build failure is NOT a baseline issue** — `@mdi/js` is a declared dependency in `package.json:44`, but `node_modules/` is absent in this worktree. Running `bun install` will resolve it. **Guardrail**: the coder must run `bun install` before the validation pipeline; do not treat this as a known/accepted failure.

3. **Error surfacing may leak sensitive info** — Forwarding raw `exc.message` from LiteLLM exceptions could expose `api_base` URLs, internal config, or partial request data. The prior security review (decisions.md:103) already flagged `api_base` leakage as unnecessary. **Guardrail**: the error-surfacing fix must sanitize exception messages — use only `exc.status_code` and a generic category (e.g., "authentication error", "model not found", "rate limit exceeded"), NOT raw `exc.message`. Map known LiteLLM exception types to safe user-facing descriptions.

### Scope guardrails for implementation

1. **In scope**: Fixes #1, #2, #3 from the plan table (model name, error surfacing, tool_choice cleanup) — all in `backend/server/chat/llm_client.py`.
2. **Out of scope**: Fix #3 from the Explorer findings (WSGI→ASGI migration) and Fix #5 (sync_to_async ORM). These are structural improvements, not root-cause fixes.
3. **No frontend changes** unless the error message format requires corresponding updates to `AITravelChat.svelte` parsing — verify and include only if needed.
4. **Error surfacing must sanitize**: Map LiteLLM exception classes (`NotFoundError`, `AuthenticationError`, `RateLimitError`, `BadRequestError`) to safe user-facing categories. Do NOT forward raw `exc.message` or `str(exc)`.
5. **Validation**: Run `bun install` first, then the full pre-commit checklist (`format`, `lint`, `check`, `build`). Backend `manage.py check` must pass. If possible, manually test the chat SSE error path with a deliberately bad model name to confirm error surfacing works.
6. **No new dependencies, no migrations, no schema changes** — none expected and none permitted for this fix.

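The sanitization guardrail can be sketched as a class-name dispatch to hardcoded strings. This is a hedged illustration of the approach, not the actual `_safe_error_payload()`: the message wording and the `SAFE_MESSAGES` mapping are assumptions, and real LiteLLM exception classes would be matched by `isinstance` rather than name.

```python
# Illustrative sketch of class-based error sanitization; mapping and wording
# are assumptions, not the real _safe_error_payload() implementation.
SAFE_MESSAGES = {
    "AuthenticationError": "Authentication with the AI provider failed. Check your API key.",
    "NotFoundError": "The requested model was not found on this provider.",
    "RateLimitError": "The AI provider rate limit was exceeded. Try again shortly.",
    "BadRequestError": "The AI provider rejected the request.",
}

def safe_error_payload(exc: Exception) -> dict:
    """Map an exception to a hardcoded user-safe payload; never forward str(exc)."""
    name = type(exc).__name__
    known = name in SAFE_MESSAGES
    return {
        "error": SAFE_MESSAGES.get(name, "An error occurred while contacting the AI provider."),
        "error_category": name if known else "unknown",
    }
```

The raw exception text never enters the payload, so even an `exc.message` containing an `api_base` URL or key fragment stays server-side (where it can still be logged via `logger.exception`).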
---

## Reviewer security verdict

**VERDICT**: APPROVED
**LENS**: Security
**REVIEW_SCORE**: 3
**Date**: 2026-03-08

### Security goals evaluated

| Goal | Status | Evidence |
|---|---|---|
| 1. Error handling doesn't leak secrets/api_base/raw internals | ✅ PASS | `_safe_error_payload()` maps exception classes to hardcoded user-safe strings; no `str(exc)`, `exc.message`, or `exc.args` forwarded. `logger.exception` at line 366 is server-side only. Critic guardrail (decisions.md:189) fully satisfied. |
| 2. Model override input can't bypass provider constraints dangerously | ✅ PASS | Model string used only as a JSON field in `litellm.acompletion()` kwargs. No SQL, no shell, no eval, no path traversal. `_is_model_override_compatible()` validates the prefix for standard providers. Gateway providers (`api_base` set) skip the prefix check — correct by design; worst case, the provider returns an error caught by the sanitized handler. |
| 3. No auth/permission regressions in send_message | ✅ PASS | `IsAuthenticated` + `get_queryset(user=self.request.user)` unchanged. New `model` param is additive-only and doesn't bypass existing validation. Tool execution scopes all DB queries to `user=user`. |
| 4. localStorage stores no sensitive values | ✅ PASS | Key `voyage_chat_model_prefs` stores `{provider_id: model_string}` only. SSR-safe guards present. Try/catch on JSON parse/write. |

### Findings

**CRITICAL**: (none)

**WARNINGS**:

- `[llm_client.py:194,225]` `api_base` field exposed in the provider catalog response to the frontend — pre-existing from the prior consolidated review (decisions.md:103), not newly introduced. Server-defined constants only (not user-controllable), no SSRF. The frontend type includes the field but never renders or uses it. (confidence: MEDIUM)

**SUGGESTIONS**:

1. Consider adding a `max_length` check on the `model` parameter in `views.py:114` (e.g., reject if >200 chars) as defense-in-depth against pathological inputs, though Django's request size limits provide a baseline guard.
2. Consider omitting `api_base` from the provider catalog response since the frontend never uses this value (pre-existing — tracked since the prior security review).

### Prior findings cross-check

- **Critic guardrail** (decisions.md:119-123 — "Error surfacing must NOT forward raw exc.message"): **CONFIRMED** — the implementation uses class-based dispatch to hardcoded strings.
- **Prior security review** (decisions.md:98-115 — api_base exposure, provider validation, IDOR checks): **CONFIRMED** — all findings still valid, no regressions.
- **Explorer model-provider prefix mismatch warning** (plan lines 108-109): **CONFIRMED** — `_is_model_override_compatible()` implements the recommended validation.

### Tracker states

- [x] Security goal 1: sanitized error handling (PASS)
- [x] Security goal 2: model override safety (PASS)
- [x] Security goal 3: auth/permission integrity (PASS)
- [x] Security goal 4: localStorage safety (PASS)

---

## Reviewer correctness verdict

**VERDICT**: APPROVED
**LENS**: Correctness
**REVIEW_SCORE**: 0
**Date**: 2026-03-08

### Requirements verification

| Requirement | Status | Evidence |
|---|---|---|
| Chat composer model selection | ✅ PASS | `AITravelChat.svelte:346-353` — text input bound to `selectedModel`, placed in the composer header next to the provider selector. Disabled when no providers are available. |
| Per-provider browser persistence | ✅ PASS | `loadModelPref`/`saveModelPref` (lines 60-92) use `localStorage` key `voyage_chat_model_prefs`. Provider change loads the saved preference via the `initializedModelProvider` sentinel (lines 94-98). User edits auto-save via a reactive block (lines 100-102). JSON parse errors are caught. SSR guards present. |
| Optional model passed to backend | ✅ PASS | Frontend sends `model: selectedModel.trim() \|\| undefined` (line 173). Backend extracts `model = (request.data.get("model") or "").strip() or None` (views.py:114). Passed as `model=model` to `stream_chat_completion` (views.py:150). |
| Model used as override in backend | ✅ PASS | `completion_kwargs["model"] = model or provider_config["default_model"]` (llm_client.py:316). Null/empty correctly falls back to the provider default. |
| No regressions in provider selection/send flow | ✅ PASS | Provider selection, validation, and SSE streaming all unchanged except the additive `model` param. Error field format is compatible with existing frontend parsing (`parsed.error` at line 210). |
| Error category mapping coherent with frontend | ✅ PASS | Backend `_safe_error_payload` returns `{"error": "...", "error_category": "..."}`. Frontend checks `parsed.error` (human-readable string) and displays it. `error_category` is available for future programmatic use. HTTP 400 errors also use the `err.error` pattern (lines 177-183). |

### Correctness checklist

- **Off-by-one**: N/A — no index arithmetic in the changes.
- **Null/undefined dereference**: `selectedProviderEntry?.default_model ?? ''` and `|| $t(...)` — null-safe. Backend `model or provider_config["default_model"]` — None-safe.
- **Ignored errors**: `try/catch` in `loadModelPref`/`saveModelPref` returns safe defaults. Backend exception handler maps to user-facing messages.
- **Boolean logic**: Reactive guard `initializedModelProvider !== selectedProvider` correctly gates initialization vs save paths.
- **Async/await**: No new async code in the frontend. Backend `model` param is a synchronous extraction before the async boundary.
- **Race conditions**: None introduced — `selectedModel` is single-threaded Svelte state.
- **Resource leaks**: None — localStorage access is synchronous and stateless.
- **Unsafe defaults**: Model defaults to the provider's `default_model` when empty — safe.
- **Dead/unreachable branches**: Pre-existing `tool_iterations` (views.py:139-141, never incremented) — not introduced by this change.
- **Contract violations**: Function signature `stream_chat_completion(user, messages, provider, tools=None, model=None)` matches all call sites. `_is_model_override_compatible` returns bool, used correctly in a conditional.
- **Reactive loop risk**: Verified — the `initializedModelProvider` sentinel prevents re-entry between Block 1 (load) and Block 2 (save). `saveModelPref` has no state mutations → no cascading reactivity.

### Findings

**CRITICAL**: (none)

**WARNINGS**: (none)

**SUGGESTIONS**:

1. `[AITravelChat.svelte:100-102]` The save-on-every-keystroke reactive block calls `saveModelPref` on each character typed. Consider debouncing, or saving on blur/submit, to reduce localStorage churn.
2. `[llm_client.py:107]` `getattr(exceptions, "NotFoundError", tuple())` — `isinstance(exc, ())` is always False by design (a graceful fallback). A brief inline comment would clarify the intent for future readers.

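Suggestion 2 turns on a Python detail worth making explicit: `isinstance` accepts a tuple of classes, and an empty tuple matches nothing. A small self-contained illustration (the `matches_optional_exception` helper and the `SimpleNamespace` stand-in are hypothetical, not Voyage code):

```python
# Illustration of the getattr(..., tuple()) fallback: when the optional
# exception class is absent, isinstance against () is simply always False.
import types

def matches_optional_exception(exc: Exception, module: object, class_name: str) -> bool:
    """Return True only when the optional exception class exists and matches."""
    exc_class = getattr(module, class_name, tuple())  # empty tuple if absent
    return isinstance(exc, exc_class)

# A stand-in "module" that lacks NotFoundError — the branch never matches.
fake_litellm = types.SimpleNamespace()
```

This is why the bare `tuple()` default degrades gracefully instead of raising `AttributeError` when an older LiteLLM version lacks a given exception class.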
### Prior findings cross-check

- **Critic gate guardrails** (decisions.md:117-124): All 3 guardrails confirmed followed (sanitized errors, `bun install` prerequisite, WSGI migration out of scope).
- **`opencode_zen` default model**: Changed from `openai/gpt-4o-mini` → `openai/gpt-5-nano` as prescribed by the researcher findings.
- **`api_base` catalog exposure** (decisions.md:103): Pre-existing, unchanged by this change.
- **`tool_iterations` dead guard** (decisions.md:91): Pre-existing, not affected by this change.

### Tracker states

- [x] Correctness goal 1: model selection end-to-end (PASS)
- [x] Correctness goal 2: per-provider persistence (PASS)
- [x] Correctness goal 3: model override to backend (PASS)
- [x] Correctness goal 4: no provider/send regressions (PASS)
- [x] Correctness goal 5: error mapping coherence (PASS)

---

## Tester verdict (standard + adversarial)

**STATUS**: PASS
**PASS**: Both (Standard + Adversarial)
**Date**: 2026-03-08

### Commands run

| Command | Result |
|---|---|
| `docker compose exec server python3 manage.py check` | PASS — 0 issues (1 silenced, expected) |
| `bun run check` (frontend) | PASS — 0 errors, 6 warnings (all pre-existing in `CollectionRecommendationView.svelte` + `RegionCard.svelte`, not in changed files) |
| `docker compose exec server python3 manage.py test --keepdb` | 30 tests found; pre-existing failures: 2 user tests (email field key error) + 4 geocoding tests (Google API mock) = 6 failures (matches documented "2/3 fail" baseline). No regressions. |
| Chat module static path validation (Django context) | PASS — all 5 targeted checks |
| `bun run build` | Vite compilation PASS (534 modules SSR, 728 client). EACCES error on `build/` dir is a pre-existing Docker worktree permission issue, not a compilation failure. |

### Targeted checks verified

- [x] `opencode_zen` default model is `openai/gpt-5-nano` — **CONFIRMED**
- [x] `stream_chat_completion` accepts a `model: str | None = None` parameter — **CONFIRMED**
- [x] Empty/whitespace/falsy `model` values in `views.py` produce `None` (falls back to the provider default) — **CONFIRMED**
- [x] `_safe_error_payload` does NOT leak raw exception text, `api_base`, or sensitive data — **CONFIRMED** (all 6 LiteLLM exception classes mapped to sanitized hardcoded strings)
- [x] `_is_model_override_compatible` skips the prefix check for `api_base` gateways — **CONFIRMED**
- [x] Standard providers reject cross-provider model prefixes — **CONFIRMED**
- [x] `is_chat_provider_available` rejects null, empty, and adversarial provider IDs — **CONFIRMED**
- [x] i18n keys `chat.model_label` and `chat.model_placeholder` present in `en.json` — **CONFIRMED**
- [x] `tools`/`tool_choice` kwargs excluded from `completion_kwargs` when `tools` is falsy — **CONFIRMED**

### Adversarial attempts

| Hypothesis | Test | Expected failure signal | Observed result |
|---|---|---|---|
| 1. Pathological model strings (long/unicode/injection/null-byte) crash `_is_model_override_compatible` | 500-char model, unicode model, SQL injection model, null-byte model | Exception or incorrect behavior | PASS — no crashes, all return True/False correctly |
| 2. LiteLLM exception classes with sensitive data in the `message` field leak via `_safe_error_payload` | All 6 LiteLLM exception classes instantiated with a sensitive marker string | Sensitive data in SSE payload | PASS — all 6 classes return sanitized hardcoded payloads |
| 3. Empty/whitespace/falsy model string bypasses `None` conversion in `views.py` | `""`, `" "`, `None`, `False`, `0` passed to the views.py extraction | Model sent as empty string to LiteLLM | PASS — all produce `None`, triggering the default fallback |
| 4. All CHAT_PROVIDER_CONFIG providers have `default_model=None` (would cause `model=None` to LiteLLM) | Check each provider's `default_model` value | At least one None | PASS — all 9 providers have a non-null `default_model` |
| 5. Unknown provider without a slash in `default_model` causes unintended prefix extraction | Provider not in `PROVIDER_MODEL_PREFIX` + bare `default_model` | Cross-prefix model rejected | PASS — no expected_prefix extracted from the bare default → pass-through |
| 6. Adversarial provider IDs (`__proto__`, null-byte, SQL injection, path traversal) bypass the availability check | Injected strings to `is_chat_provider_available` | Available=True for injected ID | PASS — all rejected. Note: `openai\n` returns True because `strip()` normalizes it to `openai` (correct, consistent with views.py normalization). |
| 7. `_merge_tool_call_delta` with `None`, empty list, missing `index` key | Edge-case inputs | Crash or wrong accumulator state | PASS — None/empty are no-ops; missing index defaults to 0 |
| 7b. Large index (9999) to `_merge_tool_call_delta` causes DoS via huge list allocation | `index=9999` | Memory spike | NOTE (pre-existing, not in scope) — creates a 10000-entry accumulator; pre-existing behavior |
| 8. Model fallback uses `and` instead of `or` | Verify `model or default`, not `model and default` | Wrong model when set | PASS — `model or default` correctly preserves the explicit model |
| 9. `tools=None` causes None kwargs to LiteLLM | Verify conditional exclusion | `tool_choice=None` in kwargs | PASS — the `if tools:` guard correctly excludes both kwargs when None |

### Mutation checks

| Mutation | Critical logic | Detected by tests? |
|---|---|---|
| `_is_model_override_compatible`: `not model OR api_base` → `not model AND api_base` | Gateway bypass | DETECTED — test covers the api_base-set + model-set case |
| `_merge_tool_call_delta`: `len(acc) <= idx` → `len(acc) < idx` | Off-by-one in accumulator growth | DETECTED — index=0 on an empty list tested |
| `completion_kwargs["model"]`: `model or default` → `model and default` | Model fallback | DETECTED — both None and set-model cases tested |
| `is_chat_provider_available` negation | Provider validation gate | DETECTED — True and False cases both verified |
| `_safe_error_payload` exception dispatch order | Error sanitization | DETECTED — LiteLLM exception MRO verified, no problematic inheritance |

**MUTATION_ESCAPES: 0/5**

### Findings

**CRITICAL**: (none)

**WARNINGS** (pre-existing, not introduced by this change):

- `_merge_tool_call_delta` large index: no upper bound on accumulator size (pre-existing DoS surface; not in scope per the critic gate)
- `tool_iterations` never incremented (pre-existing dead guard; not in scope)

**SUGGESTIONS** (carried forward from the reviewer):

1. Debounce `saveModelPref` on the model input (every keystroke currently writes to localStorage)
2. Add a clarifying comment on the `getattr(exceptions, "NotFoundError", tuple())` fallback pattern

### Task tracker update

- [x] Standard validation run and targeted chat-path checks (Agent: tester) — PASS
- [x] Documentation and knowledge sync for provider troubleshooting notes (Agent: librarian) — COMPLETE

---

## Librarian coverage verdict

**STATUS**: COMPLETE
**Date**: 2026-03-08

### Files updated

| File | Changes | Reason |
|---|---|---|
| `README.md` | Added model selection, error handling, and the `gpt-5-nano` default to the AI Chat section | User-facing docs now reflect the model override and error surfacing features |
| `docs/docs/usage/usage.md` | Added model override and error messaging to the AI Travel Chat section | Usage guide now covers the model input and error behavior |
| `.memory/knowledge.md` | Added 3 new sections: Chat Model Override Pattern, Sanitized LLM Error Mapping, OpenCode Zen Provider. Updated the AI Chat section with model override + error mapping refs. Updated the known issues baseline (0 errors/6 warnings, 6/30 test failures). | Canonical project knowledge now covers all new patterns for future sessions |
| `AGENTS.md` | Added model override + error surfacing to the AI chat description and Key Patterns. Updated the known issues baseline. | OpenCode instruction file synced |
| `CLAUDE.md` | Same changes as AGENTS.md (AI chat description, key patterns, known issues) | Claude Code instruction file synced |
| `.github/copilot-instructions.md` | Added model override + error surfacing to the AI Chat description. Updated known issues + command output baselines. | Copilot instruction file synced |
| `.cursorrules` | Updated the known issues baseline. Added chat model override + error surfacing conventions. | Cursor instruction file synced |

### Knowledge propagation

- **Inward merge**: No new knowledge found in instruction files that wasn't already in `.memory/`. All instruction files were behind `.memory/` state.
- **Outward sync**: All 4 instruction files updated with: (1) the model override pattern, (2) sanitized error mapping, (3) the `opencode_zen` default model `openai/gpt-5-nano`, (4) the corrected known issues baseline.
- **Cross-references**: knowledge.md links to the plan file for model selection details and to decisions.md for the critic gate guardrail. New sections cross-reference each other (error mapping → decisions.md, model override → plan).

### Not updated (out of scope)

- `docs/architecture.md` — Stub file; model override is an implementation detail, not architectural. The chat app entry already exists.
- `docs/docs/guides/travel_agent.md` — MCP endpoint docs; unrelated to in-app chat model selection.
- `docs/docs/configuration/advanced_configuration.md` — Chat uses per-user API keys (no server-side env vars); no config changes to document.

### Task tracker

- [x] Documentation and knowledge sync for provider troubleshooting notes (Agent: librarian)
36 .memory/plans/pre-release-and-memory-migration.md Normal file
@@ -0,0 +1,36 @@

# Plan: Pre-release policy + .memory migration

## Scope

- Update project instruction files to treat Voyage as pre-release (no production compatibility constraints yet).
- Migrate `.memory/` to the standardized structure defined in the AGENTS guidance.

## Tasks

- [x] Add pre-release policy guidance in instruction files (`AGENTS.md` + synced counterparts).
  - **Acceptance**: Explicit statement that architecture-level changes (including replacing LiteLLM) are allowed in pre-release, with a preference for correctness over backward compatibility.
  - **Agent**: librarian
  - **Note**: Added an identical "Pre-Release Policy" section to all 4 instruction files (AGENTS.md, CLAUDE.md, .cursorrules, .github/copilot-instructions.md). Also updated the `.memory Files` section in AGENTS.md, CLAUDE.md, and .cursorrules to reference the new nested structure.

- [x] Migrate `.memory/` to the standard structure.
  - **Acceptance**: standardized directories/files exist (`manifest.yaml`, `system.md`, `knowledge/*`, `plans/`, `research/`, `gates/`, `sessions/`), prior knowledge is preserved/mapped, and manifest entries are updated.
  - **Agent**: librarian
  - **Note**: Decomposed `knowledge.md` (578 lines) into 7 nested files. The old `knowledge.md` is marked DEPRECATED with pointers. Manifest updated with all new entries. Created `gates/` and `sessions/continuity.md`.

- [x] Validate migration quality.
  - **Acceptance**: no broken references in migrated memory docs; a concise migration note is included in the plan.
  - **Agent**: librarian
  - **Note**: Cross-references updated in decisions.md (knowledge.md -> knowledge/overview.md). All new files cross-link to decisions.md, plans/, and each other.

## Migration Map (old -> new)

| Old location | New location | Content |
|---|---|---|
| `knowledge.md` §Project Overview | `system.md` | One-paragraph project overview |
| `knowledge.md` §Architecture, §Services, §Auth, §Key File Locations | `knowledge/overview.md` | Architecture, API proxy, AI chat, services, auth, file locations |
| `knowledge.md` §Dev Commands, §Pre-Commit, §Environment, §Known Issues | `knowledge/tech-stack.md` | Stack, commands, env vars, known issues |
| `knowledge.md` §Key Patterns | `knowledge/conventions.md` | Frontend/backend coding patterns, workflow conventions |
| `knowledge.md` §Chat Model Override, §Error Mapping, §OpenCode Zen, §Agent Tools, §Backend Chat Endpoints, §WS4, §Context Derivation | `knowledge/patterns/chat-and-llm.md` | All chat/LLM implementation patterns |
| `knowledge.md` §Collection Sharing, §Itinerary, §User Preferences | `knowledge/domain/collections-and-sharing.md` | Collections domain knowledge |
| `knowledge.md` §WS1 Config, §Frontend Gaps | `knowledge/domain/ai-configuration.md` | AI configuration domain |
| (new) | `sessions/continuity.md` | Session continuity notes |
| (new) | `gates/.gitkeep` | Quality gates directory placeholder |
| `knowledge.md` | `knowledge.md` (DEPRECATED) | Deprecation notice with pointers to new locations |
675 .memory/plans/travel-agent-context-and-models.md Normal file
@@ -0,0 +1,675 @@

# Plan: Travel Agent Context + Models Follow-up

## Scope

Address three follow-up issues in the collection-level AI Travel Assistant:

1. The provider model dropdown only shows one option.
2. Chat context appears location-centric instead of full-trip/collection-centric.
3. Suggested prompts still assume a single location instead of itinerary-wide planning.

## Tasks

- [x] **F1 — Expand model options for OpenCode Zen provider**
  - **Acceptance criteria**:
    - The model dropdown offers multiple valid options for `opencode_zen` (not just one hardcoded value).
    - Options are sourced in a maintainable way (backend-side).
    - Selecting an option is sent through the existing `model` override path.
  - **Agent**: explorer → coder → reviewer → tester
  - **Dependencies**: discovery of current `/api/chat/providers/{id}/models/` behavior.
  - **Workstream**: `main` (follow-up bugfix set)
  - **Implementation note (2026-03-09)**: Updated `ChatProviderCatalogViewSet.models()` in `backend/server/chat/views/__init__.py` to return a curated multi-model list for `opencode_zen` (OpenAI + Anthropic options), excluding `openai/o1-preview` and `openai/o1-mini` per critic guardrail.

- [x] **F2 — Correct chat context to reflect full trip/collection**
  - **Acceptance criteria**:
    - Assistant guidance/prompt context emphasizes the full collection itinerary and date window.
    - Tool calls for planning are grounded in trip-level context (not only one location label).
    - No regression in existing collection-context fields.
  - **Agent**: explorer → coder → reviewer → tester
  - **Dependencies**: discovery of system prompt + tool context assembly.
  - **Workstream**: `main`
  - **Implementation note (2026-03-09)**: Updated the frontend `deriveCollectionDestination()` to summarize unique itinerary stops (city/country-first with fallback names, compact cap), enriched the backend `send_message()` trip context with collection-derived multi-stop itinerary data from `collection.locations`, and added explicit system prompt guidance to treat collection chats as trip-level and call `get_trip_details` before location search when additional context is needed.

- [x] **F3 — Make suggested prompts itinerary-centric**
|
||||
- **Acceptance criteria**:
|
||||
- Quick-action prompts no longer require/assume a single destination.
|
||||
- Prompts read naturally for multi-city/multi-country collections.
|
||||
- **Agent**: explorer → coder → reviewer → tester
|
||||
- **Dependencies**: discovery of prompt rendering logic in `AITravelChat.svelte`.
|
||||
- **Workstream**: `main`
|
||||
- **Implementation note (2026-03-09)**: Updated `AITravelChat.svelte` quick-action guard to use `collectionName || destination` context and itinerary-focused wording for Restaurants/Activities prompts; fixed `search_places` tool result parsing by changing `.places` reads to backend-aligned `.results` in both `hasPlaceResults()` and `getPlaceResults()`, restoring place-card rendering and Add-to-Itinerary actions.

## Notes

- User-provided trace in `agent-interaction.txt` indicates location-heavy responses and a `{"error":"location is required"}` tool failure during the itinerary add flow.

---

## Discovery Findings

### F1 — Model dropdown shows only one option

**Root cause**: `backend/server/chat/views/__init__.py` lines 417–418, `ChatProviderCatalogViewSet.models()`:

```python
if provider in ["opencode_zen"]:
    return Response({"models": ["openai/gpt-5-nano"]})
```

The `opencode_zen` branch returns a single-element list. All other non-matched providers fall through to `return Response({"models": []})` (line 420).

**Frontend loading path** (`AITravelChat.svelte` lines 115–142, `loadModelsForProvider()`):

- `GET /api/chat/providers/{provider}/models/` → sets `availableModels = data.models`.
- When the list has exactly one item, the dropdown shows only that item (correct DaisyUI `<select>`, lines 599–613).
- `availableModels.length === 0` → shows a single "Default" option (line 607), so both the zero-model and one-model paths surface as a one-option dropdown.

**Also**: The `models` endpoint (lines 339–426) requires an API key and returns HTTP 403 if absent; the frontend silently sets `availableModels = []` on any non-OK response (lines 136–138) — so users without a key see "Default" only, regardless of provider.

**Edit points**:

- `backend/server/chat/views/__init__.py` lines 417–418: expand the `opencode_zen` model list to include Zen-compatible models (e.g., `openai/gpt-5-nano`, `openai/gpt-4o-mini`, `openai/gpt-4o`, `anthropic/claude-3-5-haiku-20241022`).
- Optionally: `AITravelChat.svelte` `loadModelsForProvider()` — handle non-OK responses more gracefully (log a distinct error instead of silently falling back to an empty list).
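The second bullet could be sketched as below — a hypothetical rework of the loader, not the component's actual code. The endpoint path comes from the findings above; the return shape, messages, and the injectable `fetchFn` parameter are assumptions for illustration.

```typescript
// Hypothetical sketch: distinguish "no API key" (403) from other failures
// instead of silently treating every non-OK response as "no models".
type ModelsResponse = { models: string[] };

async function loadModelsForProvider(
  provider: string,
  fetchFn: typeof fetch = fetch
): Promise<{ models: string[]; error?: string }> {
  try {
    const res = await fetchFn(`/api/chat/providers/${provider}/models/`);
    if (res.status === 403) {
      // API key missing — surface a distinct message rather than an empty list.
      return { models: [], error: 'API key required to list models' };
    }
    if (!res.ok) {
      return { models: [], error: `Model list failed (HTTP ${res.status})` };
    }
    const data = (await res.json()) as ModelsResponse;
    return { models: Array.isArray(data.models) ? data.models : [] };
  } catch {
    return { models: [], error: 'Network error while loading models' };
  }
}
```

The caller can then keep the "Default" option for genuinely empty lists while showing the error text for failures.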

---

### F2 — Context appears location-centric, not trip-centric

**Root cause — `destination` prop is a single derived location string**:

`frontend/src/routes/collections/[id]/+page.svelte` lines 259–278, `deriveCollectionDestination()`:

```ts
const firstLocation = current.locations.find(...)
return `${cityName}, ${countryName}` // first location only
```

Only the **first** location in `collection.locations` is used. Multi-city trips surface as a single city/country string.

**How it propagates** (`+page.svelte` lines 1287–1294):

```svelte
<AITravelChat
  destination={collectionDestination} // ← single-location string
  ...
/>
```

**Backend trip context** (`backend/server/chat/views/__init__.py` lines 144–168, `send_message`):

```python
context_parts = []
if collection_name: context_parts.append(f"Trip: {collection_name}")
if destination: context_parts.append(f"Destination: {destination}")  # ← single string
if start_date and end_date: context_parts.append(f"Dates: ...")
system_prompt += "\n\n## Trip Context\n" + "\n".join(context_parts)
```

The `Destination:` line is a single string from the frontend — no multi-stop awareness. The `collection` object IS fetched from the DB (lines 152–164) and passed to `get_system_prompt(user, collection)`, but `get_system_prompt` (`llm_client.py` lines 310–358) only uses `collection` to decide single-user vs. party preferences — it never reads collection locations, itinerary, or dates from the collection model itself.

**Edit points**:

1. `frontend/src/routes/collections/[id]/+page.svelte` `deriveCollectionDestination()` (lines 259–278): derive a multi-location string (e.g., a comma-joined list of unique city/country pairs, capped at 4–5) rather than first-only. Or rename it to make clear it is itinerary-wide and return `undefined` when the collection has many diverse destinations.
2. `backend/server/chat/views/__init__.py` `send_message()` (lines 144–168): since `collection` is already fetched, enrich `context_parts` directly from `collection.locations` (unique cities/countries) rather than relying solely on the single-string `destination` param.
3. Optionally, `backend/server/chat/llm_client.py` `get_system_prompt()` (lines 310–358): when `collection` is not None, add a collection-derived section to the base prompt listing all itinerary destinations and dates from the collection object.
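Edit point 1 can be sketched as follows. This is an illustrative shape only — the simplified `Loc` interface and the function name are assumptions, not the actual Voyage types; the later F2 review/test sections document what was actually shipped (cap of 4, semicolon-joined, `+N more` overflow).

```typescript
// Illustrative multi-stop derivation: unique "City, Country" labels with
// fallback to the raw location name, capped with a "+N more" overflow suffix.
interface Loc {
  city?: { name?: string | null } | null;
  country?: { name?: string | null } | null;
  name?: string | null;
}

function deriveItineraryDestination(locations: Loc[], maxStops = 4): string | undefined {
  const seen = new Set<string>();
  const stops: string[] = [];
  for (const loc of locations) {
    const city = (loc.city?.name ?? '').trim();
    const country = (loc.country?.name ?? '').trim();
    // Prefer "City, Country"; fall back to whichever part exists, then the raw name.
    const label =
      city && country ? `${city}, ${country}` : city || country || (loc.name ?? '').trim();
    if (!label || seen.has(label)) continue;
    seen.add(label);
    stops.push(label);
  }
  if (stops.length === 0) return undefined;
  const overflow = stops.length > maxStops ? ` +${stops.length - maxStops} more` : '';
  return stops.slice(0, maxStops).join('; ') + overflow;
}
```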

---

### F3 — Quick-action prompts assume a single destination

**Root cause — all destination-dependent prompts are gated on the `destination` prop** (`AITravelChat.svelte` lines 766–804):

```svelte
{#if destination}
  <button>🍽️ Restaurants in {destination}</button>
  <button>🎯 Activities in {destination}</button>
{/if}
{#if startDate && endDate}
  <button>🎒 Packing tips for {startDate} to {endDate}</button>
{/if}
<button>📅 Itinerary help</button> <!-- always shown, generic -->
```

The "Restaurants" and "Activities" buttons are hidden when no `destination` is derived (a multi-city trip with no single dominant location), and their prompt strings hard-code `${destination}` — a single-city reference. They also don't reference the collection name or the multi-stop nature of the trip.

**Edit points** (`AITravelChat.svelte` lines 766–804):

1. Replace the `{#if destination}` guard for the restaurant/activity buttons with a `{#if collectionName || destination}` guard.
2. Change the prompt strings to use `collectionName` as primary context, falling back to `destination`:
   - `What are the best restaurants for my trip to ${collectionName || destination}?`
   - `What activities are there across my ${collectionName} itinerary?`
3. Add a "Budget" or "Transport" quick action that references the collection dates + itinerary scope (doesn't need `destination`).
4. The "📅 Itinerary help" button (lines 797–804) sends `'Can you help me plan a day-by-day itinerary for this trip?'` — already collection-neutral; no change needed.
5. The packing-tips prompt (lines 788–795) already uses `startDate`/`endDate` without `destination` — already correct.
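Edit points 1–2 reduce to a fallback chain plus itinerary-centric templates. A minimal sketch in plain TypeScript (in the component this would be a reactive statement, roughly `$: promptTripContext = collectionName || destination || ''`); the helper names are illustrative:

```typescript
// Collection name wins; single derived destination is the fallback;
// empty string hides the guarded buttons entirely.
function promptTripContext(collectionName?: string, destination?: string): string {
  return collectionName || destination || '';
}

// Itinerary-centric wording — no "in {city}" single-location phrasing.
function restaurantPrompt(ctx: string): string {
  return `What are the best restaurants across my ${ctx} itinerary?`;
}

function activityPrompt(ctx: string): string {
  return `What activities are there across my ${ctx} itinerary?`;
}
```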

---

### Cross-cutting risk: `destination` prop semantics are overloaded

The `destination` prop in `AITravelChat.svelte` is used for:

- Header subtitle display (line 582: removed in current code — subtitle block gone)
- Quick-action prompt strings (lines 771, 779)
- `send_message` payload (line 268: `destination`)

Changing `deriveCollectionDestination()` to return a multi-location string affects all three uses. The header display is currently suppressed (no `{destination}` in the HTML header block after the WS4-F4 changes), so that's safe. The `send_message` backend receives it as the `Destination:` context line, which is acceptable for a multi-city string.

### No regression surface from `loadModelsForProvider` reactive trigger

The `$: if (selectedProvider) { void loadModelsForProvider(); }` reactive statement (lines 190–192) fires whenever `selectedProvider` changes. Expanding the `opencode_zen` model list won't affect other providers. The `loadModelPref`/`saveModelPref` localStorage path is independent of model list size.
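For reference, the localStorage persistence path could look like the sketch below. Only the key name (`voyage_chat_model_prefs`) comes from the docs above; the per-provider record shape, the helper signatures, and the injectable `Store` abstraction are assumptions:

```typescript
// Assumed record shape: { [provider]: modelId } serialized as JSON under one key.
const PREFS_KEY = 'voyage_chat_model_prefs';

type Store = { getItem(k: string): string | null; setItem(k: string, v: string): void };

function saveModelPref(store: Store, provider: string, model: string): void {
  const prefs: Record<string, string> = JSON.parse(store.getItem(PREFS_KEY) ?? '{}');
  prefs[provider] = model;
  store.setItem(PREFS_KEY, JSON.stringify(prefs));
}

function loadModelPref(store: Store, provider: string): string | undefined {
  try {
    const prefs: Record<string, string> = JSON.parse(store.getItem(PREFS_KEY) ?? '{}');
    return prefs[provider];
  } catch {
    return undefined; // corrupted JSON — fall back to the provider default
  }
}
```

Because it is keyed per provider, the stored value never depends on how many models the dropdown offers — consistent with the "independent of model list size" note above.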

### `search_places` tool `location` required error (from Notes)

The `search_places` tool (`agent_tools.py`) requires a `location` string param. When the LLM calls it with no location (because the context only mentions a trip name, not a geocodable string), the tool returns `{"error": "location is required"}`. This is downstream of F2 — fixing the context so the LLM receives actual geocodable location strings will reduce these errors, but the tool itself should also be documented as requiring a geocodable string.

---

## Deep-Dive Findings (explorer pass 2 — 2026-03-09)

### F1: Exact line for single-model fix

`backend/server/chat/views/__init__.py` **lines 417–418**:

```python
if provider in ["opencode_zen"]:
    return Response({"models": ["openai/gpt-5-nano"]})
```

A single-entry hard-coded list; no Zen API call is made. Expand it to all Zen-compatible models.

**Recommended minimal list** (OpenAI-compatible pass-through documented for Zen):

```python
return Response({"models": [
    "openai/gpt-5-nano",
    "openai/gpt-4o-mini",
    "openai/gpt-4o",
    "openai/o1-preview",
    "openai/o1-mini",
    "anthropic/claude-sonnet-4-20250514",
    "anthropic/claude-3-5-haiku-20241022",
]})
```

---

### F2: System prompt never injects collection locations into context

`backend/server/chat/views/__init__.py` lines **144–168** (`send_message`): `collection` is fetched from the DB but only passed to `get_system_prompt()` for preference aggregation — its `.locations` queryset is never read to enrich context.

`backend/server/chat/llm_client.py` lines **310–358** (`get_system_prompt`): the `collection` param is only used for the `shared_with` preference branch. Zero use of `collection.locations`, `.start_date`, `.end_date`, or `.itinerary_items`.

**Minimal fix — inject into `context_parts` in `send_message`**:

After line 164 (`collection = requested_collection`), add:

```python
if collection:
    loc_names = list(collection.locations.values_list("name", flat=True)[:8])
    if loc_names:
        context_parts.append(f"Locations in this trip: {', '.join(loc_names)}")
```

Also strengthen the base system prompt in `llm_client.py` to instruct the model to call `get_trip_details` when operating in collection context before calling `search_places`.

---

### F3a: Frontend `hasPlaceResults` / `getPlaceResults` use wrong key `.places` — cards never render

**Critical bug** — `AITravelChat.svelte`:

- **Line 377**: checks `(result.result as { places?: unknown[] }).places` — should be `results`
- **Line 386**: returns `(result.result as { places: any[] }).places` — should be `results`

Backend `search_places` (`agent_tools.py` lines 188–192) returns:

```python
return {"location": location_name, "category": category, "results": results}
```

The key is `results`, not `places`. Because `hasPlaceResults` always returns `false`, the "Add to Itinerary" button on place cards is **never rendered** for any real tool output; the `<pre>` JSON fallback block shows instead.

**Minimal fix**: change both `.places` references → `.results` in `AITravelChat.svelte` lines 377 and 386.
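A sketch of the corrected helpers — the `ToolResult` interface is simplified from the component's actual types for illustration:

```typescript
// Simplified tool-result shape; the component's real type has more fields.
interface ToolResult {
  name: string;
  result: unknown;
}

function hasPlaceResults(tr: ToolResult): boolean {
  if (tr.name !== 'search_places') return false;
  if (typeof tr.result !== 'object' || tr.result === null) return false;
  // Backend returns {"location", "category", "results"} — the key is `results`, not `places`.
  return Array.isArray((tr.result as { results?: unknown[] }).results);
}

function getPlaceResults(tr: ToolResult): unknown[] {
  if (!hasPlaceResults(tr)) return [];
  return (tr.result as { results: unknown[] }).results;
}
```

With the key aligned, real `search_places` output renders as place cards instead of falling through to the raw JSON block.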

---

### F3b: `{"error": "location is required"}` origin

`backend/server/chat/agent_tools.py` **line 128**:

```python
if not location_name:
    return {"error": "location is required"}
```

Triggered when the LLM calls `search_places({})` with no `location` argument — which happens when the system prompt only contains a non-geocodable trip name (e.g., `Destination: Rome Trip 2025`) without actual city/place strings.

This error surfaces in the SSE stream → rendered as a tool result card with `{"error": "..."}` text.

**Fix**: Resolved by F2 (richer context); also improve the guard message to be user-safe: `"Please provide a location or city name to search near."`.

---

### Summary of edit points

| Issue | File | Lines | Change |
|---|---|---|---|
| F1: expand opencode_zen models | `backend/server/chat/views/__init__.py` | 417–418 | Replace 1-item list with 7-item list |
| F2: inject collection locations | `backend/server/chat/views/__init__.py` | 144–168 | Add `loc_names` context_parts after line 164 |
| F2: reinforce system prompt | `backend/server/chat/llm_client.py` | 314–332 | Add guidance to use `get_trip_details` in collection context |
| F3a: fix `.places` → `.results` | `frontend/src/lib/components/AITravelChat.svelte` | 377, 386 | Key rename (`places` → `results`) |
| F3b: improve error guard | `backend/server/chat/agent_tools.py` | 128 | Better user-safe message (optional) |

---

## Critic Gate

- **Verdict**: APPROVED
- **Date**: 2026-03-09
- **Reviewer**: critic agent

### Assumption Challenges

1. **F2 `values_list("name")` may not produce geocodable strings** — `Location.name` can be opaque (e.g., "Eiffel Tower"). Mitigated: the plan already proposes system prompt guidance to call `get_trip_details` first. Enhancement: use `city__name`/`country__name` in addition to `name` for the injected context.
2. **F3a `.places` vs `.results` key mismatch** — confirmed real bug. `agent_tools.py` returns the `results` key; the frontend checks `places`. Place cards never render. Key-rename fix validated.

### Execution Guardrails

1. **Sequencing**: F1 (independent) → F2 (context enrichment) → F3 (prompts + `.places` fix). F3 depends on F2's `deriveCollectionDestination` changes.
2. **F1 model list**: Exclude `openai/o1-preview` and `openai/o1-mini` — reasoning models may not support tool use in streaming chat. Verify compatibility before including.
3. **F2 context injection**: Use `select_related('city', 'country')` or `values_list('name', 'city__name', 'country__name')` — bare `name` alone is insufficient for geocoding context.
4. **F3a is atomic**: The `.places` → `.results` fix is a standalone bug, separate from prompt wording changes. Can bundle in F3's review cycle.
5. **Quality pipeline**: Each fix gets a reviewer + tester pass. No batch validation.
6. **Functional verification required**: (a) model dropdown shows multiple options, (b) chat context includes multi-city info, (c) quick-action prompts render for multi-location collections, (d) search result place cards actually render (F3a).
7. **Decomposition**: Single workstream appropriate — tightly coupled bugfixes in the same component/view pair, not independent services.

---

## F1 Review

- **Verdict**: APPROVED (score 0)
- **Lens**: Correctness
- **Date**: 2026-03-09
- **Reviewer**: reviewer agent

**Scope**: `backend/server/chat/views/__init__.py` lines 417–428 — `opencode_zen` model list expanded from 1 to 5 entries.

**Findings**: No CRITICAL or WARNING issues. Change is minimal and correctly scoped.

**Verified**:

- Critic guardrail followed: `o1-preview` and `o1-mini` excluded (reasoning models, no streaming tool use).
- All 5 model IDs use valid LiteLLM `provider/model` format; `anthropic/*` IDs match exact entries in the Anthropic branch.
- `_is_model_override_compatible()` bypasses the prefix check for `api_base` gateways — all IDs pass validation.
- No regression in other provider branches (openai, anthropic, gemini, groq, ollama) — all untouched.
- Frontend `loadModelsForProvider()` handles multi-item arrays correctly; the dropdown will show all 5 options.
- localStorage model persistence unaffected by list size change.

**Suggestion**: Add an inline comment on why o1-preview/o1-mini are excluded, to prevent future re-addition.

**Reference**: See [Critic Gate](#critic-gate), [decisions.md](../decisions.md#critic-gate-travel-agent-context--models-follow-up)

---

## F1 Test

- **Verdict**: PASS (Standard + Adversarial)
- **Date**: 2026-03-09
- **Tester**: tester agent

### Commands run

| # | Command | Exit code | Output |
|---|---|---|---|
| 1 | `docker compose exec server python3 -m py_compile /code/chat/views/__init__.py` | 0 | (no output — syntax OK) |
| 2 | Inline `python3 -c` assertion of `opencode_zen` branch | 0 | count: 5, all 5 model IDs confirmed present, PASS |
| 3 | Adversarial: branch isolation for 8 non-`opencode_zen` providers | 0 | All return `[]`, ADVERSARIAL PASS |
| 4 | Adversarial: critic guardrail + LiteLLM format check | 0 | `o1-preview` / `o1-mini` absent; all IDs in `provider/model` format, PASS |
| 5 | `docker compose exec server python3 -c "import chat.views; ..."` | 0 | Module import OK, `ChatProviderCatalogViewSet.models` action present |
| 6 | `docker compose exec server python3 manage.py test --verbosity=1 --keepdb` | 1 (pre-existing) | 30 tests: 24 pass, 1 fail, 5 errors — identical to known baseline (2 user email key + 4 geocoding mock). **Zero new failures.** |

### Key findings

- `opencode_zen` branch now returns exactly 5 models: `openai/gpt-5-nano`, `openai/gpt-4o-mini`, `openai/gpt-4o`, `anthropic/claude-sonnet-4-20250514`, `anthropic/claude-3-5-haiku-20241022`.
- Critic guardrail respected: `openai/o1-preview` and `openai/o1-mini` absent from the list.
- All model IDs use valid `provider/model` format compatible with LiteLLM routing.
- No other provider branches affected.
- No regression in the full Django test suite beyond the pre-existing baseline.

### Adversarial attempts

- **Case-insensitive match (`OPENCODE_ZEN`)**: does not match branch → returns `[]` (correct; exact case match required).
- **Partial match (`opencode_zen_extra`)**: does not match → returns `[]` (correct; no prefix leakage).
- **Empty string provider `""`**: returns `[]` (correct).
- **`openai/o1-preview` inclusion check**: absent from list (critic guardrail upheld).
- **`openai/o1-mini` inclusion check**: absent from list (critic guardrail upheld).

### MUTATION_ESCAPES: 0/4

All critical branch mutations checked: wrong provider name, case variation, extra-suffix variation, empty string — all correctly return `[]`. The 5-model list is hard-coded, so count drift would be immediately caught by assertion.

### LESSON_CHECKS

- Pre-existing test failures (2 user + 4 geocoding) — **confirmed**, baseline unchanged.

---

## F2 Review

- **Verdict**: APPROVED (score 0)
- **Lens**: Correctness
- **Date**: 2026-03-09
- **Reviewer**: reviewer agent

**Scope**: F2 — Correct chat context to reflect full trip/collection. Three files changed:

- `frontend/src/routes/collections/[id]/+page.svelte` (lines 259–300): `deriveCollectionDestination()` rewritten from first-location-only to multi-stop itinerary summary.
- `backend/server/chat/views/__init__.py` (lines 166–199): `send_message()` enriched with collection-derived `Itinerary stops:` context from `collection.locations`.
- `backend/server/chat/llm_client.py` (lines 333–336): system prompt updated with trip-level reasoning guidance and a `get_trip_details`-first instruction.

**Acceptance criteria verified**:

1. ✅ Frontend derives a multi-stop destination string (unique city/country pairs, capped at 4, semicolon-joined, `+N more` overflow).
2. ✅ Backend enriches the system prompt with `Itinerary stops:` from collection locations (up to 8, `select_related('city', 'country')` for efficiency).
3. ✅ System prompt instructs trip-level reasoning and `get_trip_details`-first behavior (tool confirmed to exist in `agent_tools.py`).
4. ✅ No regression: non-collection chats, single-location collections, and empty-location collections all handled correctly via guard conditions.

**Findings**: No CRITICAL or WARNING issues. Two minor suggestions (dead guard on line 274 of `+page.svelte`; undocumented cap constant in `views/__init__.py` line 195).

**Prior guidance**: Critic gate recommendation to use `select_related('city', 'country')` and city/country names — confirmed followed.

**Reference**: See [Critic Gate](#critic-gate), [F1 Review](#f1-review)

---

## F2 Test

- **Verdict**: PASS (Standard + Adversarial)
- **Date**: 2026-03-09
- **Tester**: tester agent

### Commands run

| # | Command | Exit code | Output summary |
|---|---|---|---|
| 1 | `bun run check` (frontend) | 0 | 0 errors, 6 warnings — all 6 are pre-existing in `CollectionRecommendationView.svelte` + `RegionCard.svelte`; no new issues from F2 changes |
| 2 | `docker compose exec server python3 -m py_compile /code/chat/views/__init__.py` | 0 | Syntax OK |
| 3 | `docker compose exec server python3 -m py_compile /code/chat/llm_client.py` | 0 | Syntax OK |
| 4 | Backend functional enrichment test (mock collection, 6 inputs → 5 unique stops) | 0 | `Itinerary stops: Rome, Italy; Florence, Italy; Venice, Italy; Switzerland; Eiffel Tower` — multi-stop line confirmed |
| 5 | Adversarial backend: 7 cases (cap-8, empty, all-blank, whitespace, unicode, dedup-12, None city) | 0 | All 7 PASS |
| 6 | Frontend JS adversarial: 7 cases (multi-stop, single, null, empty, overflow +N, fallback, all-blank) | 0 | All 7 PASS |
| 7 | System prompt phrase check | 0 | `itinerary-wide` + `get_trip_details` + `Treat context as itinerary-wide` all confirmed present |
| 8 | `docker compose exec server python3 manage.py test --verbosity=1 --keepdb` | 1 (pre-existing) | 30 tests: 24 pass, 1 fail, 5 errors — **identical to known baseline**; zero new failures |

### Acceptance criteria verdict

| Criterion | Result | Evidence |
|---|---|---|
| Multi-stop destination string derived in frontend | ✅ PASS | JS test: 3-city collection → `Rome, Italy; Florence, Italy; Venice, Italy`; 6-city → `A, X; B, X; C, X; D, X; +2 more` |
| Backend injects `Itinerary stops:` from `collection.locations` | ✅ PASS | Python test: 6 inputs → 5 unique stops joined with `; `, correctly prefixed `Itinerary stops:` |
| System prompt has trip-level + `get_trip_details`-first guidance | ✅ PASS | `get_system_prompt()` output contains `itinerary-wide`, `get_trip_details first`, `Treat context as itinerary-wide` |
| No regression in existing fields | ✅ PASS | Django test suite unchanged at baseline (24 pass, 6 pre-existing fail/error) |

### Adversarial attempts

| Hypothesis | Test | Expected failure signal | Observed |
|---|---|---|---|
| 12-city collection exceeds cap | Supply 12 unique cities | >8 stops returned | Capped at exactly 8 ✅ |
| Empty `locations` list | Pass `locations=[]` | Crash or non-empty result | Returns `undefined`/`[]` cleanly ✅ |
| All-blank location entries | All city/country/name empty or whitespace | Non-empty or crash | All skipped, returns `undefined`/`[]` ✅ |
| Whitespace-only city/country | `city.name=' '` with valid fallback | Whitespace treated as valid | Strip applied, fallback used ✅ |
| Unicode city names | `東京`, `Zürich`, `São Paulo` | Encoding corruption or skip | All 3 preserved correctly ✅ |
| 12 duplicate identical entries | Same city×12 | Multiple copies in output | Deduped to exactly 1 ✅ |
| `city.name = None` (DB null) | `None` city name, valid country | `AttributeError` or crash | Handled via `or ''` guard, country used ✅ |
| `null` collection passed to frontend func | `deriveCollectionDestination(null)` | Crash | Returns `undefined` cleanly ✅ |
| Overflow suffix formatting | 6 unique stops, maxStops=4 | Wrong suffix or missing | `+2 more` suffix correct ✅ |
| Fallback name path | No city/country, `location='Eiffel Tower'` | Missing or wrong label | `Eiffel Tower` used ✅ |

### MUTATION_ESCAPES: 0/6

Mutation checks applied:

1. `>= 8` cap mutated to `> 8` → A1 test (12-city produces 8, not 9) would catch.
2. `seen_stops` dedup check mutated to always-false → A6 test (12-dupes) would catch.
3. `or ''` null-guard on `city.name` removed → A7 test would catch `AttributeError`.
4. `if not fallback_name: continue` removed → A3 test (all-blank) would catch spurious entries.
5. `stops.slice(0, maxStops).join('; ')` separator mutated to `', '` → multi-stop tests check for `'; '` as separator.
6. `return undefined` on empty guard mutated to `return ''` → A4 empty-locations test checks `=== undefined`.

All 6 mutations would be caught by existing test cases.

### LESSON_CHECKS

- Pre-existing test failures (2 user email key + 4 geocoding mock) — **confirmed**, baseline unchanged.
- F2 context enrichment using `select_related('city', 'country')` per critic guardrail — **confirmed** (lines 169–171 of views/__init__.py).
- Fallback to `location`/`name` fields when geo data absent — **confirmed** working via A4/A5 tests.

**Reference**: See [F2 Review](#f2-review), [Critic Gate](#critic-gate)

---

## F3 Review

- **Verdict**: APPROVED (score 0)
- **Lens**: Correctness
- **Date**: 2026-03-09
- **Reviewer**: reviewer agent

**Scope**: Targeted re-review of two F3 findings in `frontend/src/lib/components/AITravelChat.svelte`:

1. `.places` → `.results` key mismatch in `hasPlaceResults()` / `getPlaceResults()`
2. Quick-action prompt guard and wording — location-centric → itinerary-centric

**Finding 1 — `.places` → `.results` (RESOLVED)**:

- `hasPlaceResults()` (line 378): checks `(result.result as { results?: unknown[] }).results` ✅
- `getPlaceResults()` (line 387): returns `(result.result as { results: any[] }).results` ✅
- Cross-verified against backend `agent_tools.py:188-191`: `return {"location": ..., "category": ..., "results": results}` — keys match.

**Finding 2 — Itinerary-centric prompts (RESOLVED)**:

- New reactive `promptTripContext` (line 72): `collectionName || destination || ''` — prefers collection name over single destination.
- Guard changed from `{#if destination}` → `{#if promptTripContext}` (line 768) — buttons now visible for named collections even without a single derived destination.
- Prompt strings use `across my ${promptTripContext} itinerary?` wording (lines 773, 783) — no longer implies a single location.
- No impact on packing tips (still `startDate && endDate` gated) or itinerary help (always shown).

**No introduced issues**: `promptTripContext` always resolves to a string; template interpolation safe; existing tool result rendering and `sendMessage()` logic unchanged beyond the key rename.

**SUGGESTIONS**: Minor indentation inconsistency between the `{#if promptTripContext}` block (lines 768–789) and the adjacent `{#if startDate}` block (lines 790–801) — cosmetic, `bun run format` should normalize.

**Reference**: See [Critic Gate](#critic-gate), [F2 Review](#f2-review), [decisions.md](../decisions.md#critic-gate-travel-agent-context--models-follow-up)

---

## F3 Test

- **Verdict**: PASS (Standard + Adversarial)
- **Date**: 2026-03-09
- **Tester**: tester agent

### Commands run

| # | Command | Exit code | Output summary |
|---|---|---|---|
| 1 | `bun run check` (frontend) | 0 | 0 errors, 6 warnings — all 6 pre-existing in `CollectionRecommendationView.svelte` + `RegionCard.svelte`; zero new issues from F3 changes |
| 2 | `bun run f3_test.mjs` (functional simulation) | 0 | 20 assertions: S1–S6 standard + A1–A6 adversarial + PTC1–PTC4 promptTripContext + prompt wording — ALL PASSED |

### Acceptance criteria verdict

| Criterion | Result | Evidence |
|---|---|---|
| `.places` → `.results` key fix in `hasPlaceResults()` | ✅ PASS | S1: `{results:[...]}` → true; S2: `{places:[...]}` → false (old key correctly rejected) |
| `.places` → `.results` key fix in `getPlaceResults()` | ✅ PASS | S1: returns 2-item array from `.results`; S2: returns `[]` on `.places` key |
| Old `.places` key no longer triggers card rendering | ✅ PASS | S2 regression guard: `hasPlaceResults({places:[...]})` → false |
| `promptTripContext` = `collectionName \|\| destination \|\| ''` | ✅ PASS | PTC1–PTC4: collectionName wins; falls back to destination; empty string when both absent |
| Quick-action guard is `{#if promptTripContext}` | ✅ PASS | Source inspection confirmed line 768 uses `promptTripContext` |
| Prompt wording is itinerary-centric | ✅ PASS | Both prompts contain `itinerary`; neither uses single-location "in X" wording |

### Adversarial attempts

| Hypothesis | Test design | Expected failure signal | Observed |
|---|---|---|---|
| `results` is a string, not array | `result: { results: 'not-array' }` | `Array.isArray` fails → false | false ✅ |
| `results` is null | `result: { results: null }` | `Array.isArray(null)` false | false ✅ |
| `result.result` is a number | `result: 42` | typeof guard rejects | false ✅ |
| `result.result` is a string | `result: 'str'` | typeof guard rejects | false ✅ |
| Both `.places` and `.results` present | both keys in result | Must use `.results` | `getPlaceResults` returns `.results` item ✅ |
| `results` is an object `{foo:'bar'}` | not an array | `Array.isArray` false | false ✅ |
| `promptTripContext` with empty collectionName string | `'' \|\| 'London' \|\| ''` | Should fall through to destination | 'London' ✅ |

### MUTATION_ESCAPES: 0/5

Mutation checks applied:

1. `result.result !== null` guard removed → S5 (null result) would crash `Array.isArray(null.results)` and be caught.
2. `Array.isArray(...)` replaced with truthy check → A1 (string results) test would catch.
3. `result.name === 'search_places'` removed → S4 (wrong tool name) would catch.
4. `.results` key swapped back to `.places` → S1 (standard payload) would return empty array, caught.
5. `collectionName || destination` order swapped → PTC1 test would return wrong value, caught.

All 5 mutations would be caught by existing assertions.

### LESSON_CHECKS

- `.places` vs `.results` key mismatch (F3a critical bug from discovery) — **confirmed fixed**: S1 passes with `.results`; S2 regression guard confirms `.places` no longer triggers card rendering.
- Pre-existing 6 svelte-check warnings — **confirmed**, no new warnings introduced.

---

## Completion Summary

- **Status**: ALL COMPLETE (F1 + F2 + F3)
- **Date**: 2026-03-09
- **All tasks**: Implemented, reviewed (APPROVED score 0), and tested (PASS standard + adversarial)
- **Zero regressions**: Frontend 0 errors / 6 pre-existing warnings; backend 24/30 pass (6 pre-existing failures)
- **Files changed**:
  - `backend/server/chat/views/__init__.py` — F1 (model list expansion) + F2 (itinerary stops context injection)
  - `backend/server/chat/llm_client.py` — F2 (system prompt trip-level guidance)
  - `frontend/src/routes/collections/[id]/+page.svelte` — F2 (multi-stop `deriveCollectionDestination`)
  - `frontend/src/lib/components/AITravelChat.svelte` — F3 (itinerary-centric prompts + `.results` key fix)
- **Knowledge recorded**: [knowledge.md](../knowledge.md#multi-stop-context-derivation-f2-follow-up) (multi-stop context, quick prompts, search_places key convention, opencode_zen model list)
- **Decisions recorded**: [decisions.md](../decisions.md#critic-gate-travel-agent-context--models-follow-up) (critic gate)
- **AGENTS.md updated**: Chat model override pattern (dropdown) + chat context pattern added

---
|
||||
|
||||
## Discovery: runtime failures (2026-03-09)

Explorer investigation of three user-trace errors against the complete scoped file set.

### Error 1 — "The model provider rate limit was reached"

**Exact origin**: `backend/server/chat/llm_client.py` **lines 128–132** (`_safe_error_payload`):

```python
if isinstance(exc, rate_limit_cls):
    return {
        "error": "The model provider rate limit was reached. Please wait and try again.",
        "error_category": "rate_limited",
    }
```

The user-trace text `"model provider rate limit was reached"` is a substring of this exact message. This is **not a bug** — it is the intended sanitized error surface for `litellm.exceptions.RateLimitError`. The error is raised by LiteLLM when the upstream provider (OpenAI, Anthropic, etc.) returns HTTP 429, and `_safe_error_payload()` converts it to this user-safe string. The SSE error payload is then propagated through `stream_chat_completion` (line 457) → `event_stream()` in `send_message` (line 256: `if data.get("error"): encountered_error = True; break`) → yielded to frontend → frontend SSE loop sets `assistantMsg.content = parsed.error` (line 307 of `AITravelChat.svelte`).

**Root cause of rate limiting itself**: Most likely `openai/gpt-5-nano` as the `opencode_zen` default model, or the user's provider hitting quota. No code fix required — this is provider-side throttling surfaced correctly. However, if the `opencode_zen` provider is being mistakenly routed to OpenAI's public endpoint instead of `https://opencode.ai/zen/v1`, it would exhaust a real OpenAI key rather than Zen. See Risk 1 below.

**No auth/session issue involved** — the error path reaches LiteLLM, meaning auth already succeeded up to the LLM call.

---

### Error 2 — `{"error":"location is required"}`

**Exact origin**: `backend/server/chat/agent_tools.py` **line 128**:

```python
if not location_name:
    return {"error": "location is required"}
```

Triggered when the LLM calls `search_places({})` or `search_places({"category": "food"})` with no `location` argument. This happens when the system prompt's trip context does not give the model a geocodable string — the model knows a "trip name" but not a city/country, so it calls `search_places` without a location.

**Current state (post-F2)**: The F2 fix injects `"Itinerary stops: Rome, Italy; ..."` into the system prompt from `collection.locations` **only when `collection_id` is supplied and resolves to an authorized collection**. If `collection_id` is missing from the frontend payload OR if the collection has locations with no `city`/`country` FK and no `location`/`name` fallback, the context_parts will still have only the `destination` string.

**Residual trigger path** (still reachable after F2):

- `collection_id` not sent in `send_message` payload → collection never fetched → `context_parts` has only `Destination: <multi-stop string>` → LLM picks a trip-name string like "Italy 2025" as its location arg → `search_places(location="Italy 2025")` succeeds (geocoding finds "Italy") OR model sends `search_places({})` → error returned.
- OR: `collection_id` IS sent, all locations have no `city`/`country` AND `location` field is blank AND `name` is not geocodable (e.g., `"Hotel California"`) → `itinerary_stops` list is empty → no `Itinerary stops:` line injected.

**Second remaining trigger**: `get_trip_details` fails (Collection.DoesNotExist or exception) → returns `{"error": "An unexpected error occurred while fetching trip details"}` → model falls back to calling `search_places` without a location derived from context.

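The fallback chain described above (city/country FK → free-text `location` → `name`) can be sketched as a plain-Python helper. This is a sketch over dict-shaped rows, not the project's actual implementation, which works with Django model instances:

```python
def derive_itinerary_stops(locations):
    """Build the 'Itinerary stops:' context line from collection locations.

    Mirrors the F2 fallback order: city/country pair first, then the
    free-text location field, then the location name.
    """
    stops = []
    for loc in locations:
        if loc.get("city") and loc.get("country"):
            stops.append(f"{loc['city']}, {loc['country']}")
        elif loc.get("location"):
            stops.append(loc["location"])
        elif loc.get("name"):
            stops.append(loc["name"])
    # No usable data at all -> no context line is injected
    return f"Itinerary stops: {'; '.join(stops)}" if stops else ""
```

A name-only stop still produces a context line here, but (per the trigger path above) may fail later at geocoding time.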
---

### Error 3 — `{"error":"An unexpected error occurred while fetching trip details"}`

**Exact origin**: `backend/server/chat/agent_tools.py` **lines 394–396** (`get_trip_details`):

```python
except Exception:
    logger.exception("get_trip_details failed")
    return {"error": "An unexpected error occurred while fetching trip details"}
```

**Root cause — `get_trip_details` uses owner-only filter**: `agent_tools.py` **line 317**:

```python
collection = (
    Collection.objects.filter(user=user)
    ...
    .get(id=collection_id)
)
```

This uses `filter(user=user)` — **shared collections are excluded**. If the logged-in user is a shared member (not the owner) of the collection, `Collection.DoesNotExist` is raised, falls to the outer `except Exception`, and returns the generic error. However, `Collection.DoesNotExist` is caught specifically on **line 392** and returns `{"error": "Trip not found"}`, not the generic message. So the generic error can only come from a genuine Python exception inside the try block — most likely:

1. **`item.item` AttributeError** — `CollectionItineraryItem` uses a `GenericForeignKey`; if the referenced object has been deleted, `item.item` returns `None` and `getattr(None, "name", "")` would return `""` (safe, not an error) — so this is not the cause.
2. **`collection.itinerary_items` reverse relation** — if the `related_name="itinerary_items"` is not defined on `CollectionItineraryItem.collection` FK, the queryset call raises `AttributeError`. Checking `adventures/models.py` line 716: `related_name="itinerary_items"` is present — so this is not the cause.
3. **`collection.transportation_set` / `collection.lodging_set`** — if `Transportation` or `Lodging` doesn't have `related_name` defaulting to `transportation_set`/`lodging_set`, these would fail. This is the **most likely cause** — Django only auto-creates `_set` accessors with the model name in lowercase; `transportation_set` requires that the FK `related_name` is either set or left as default `transportation_set`. Need to verify model definition.
4. **`collection.start_date.isoformat()` on None** — guarded by `if collection.start_date` (line 347) — safe.

**Verified**: `Transportation.collection` (`models.py:332`) and `Lodging.collection` (`models.py:570`) are both ForeignKeys with **no `related_name`**, so Django auto-assigns `transportation_set` and `lodging_set` — the accessors used in `get_trip_details` lines 375/382 are correct. These do NOT cause the error.

**Actual culprit**: The `except Exception` at line 394 catches everything. Any unhandled exception inside the try block (e.g., a `prefetch_related("itinerary_items__content_type")` failure if a content_type row is missing, or a `date` field deserialization error on a malformed DB record) results in the generic error. Most commonly, the issue is the **shared-user access gap**: `Collection.objects.filter(user=user).get(id=...)` raises `Collection.DoesNotExist` for shared users, but that is caught by the specific handler at line 392 as `{"error": "Trip not found"}`, NOT the generic message. The generic message therefore indicates a true runtime Python exception somewhere inside the try body.

**Additionally**: the shared-collection access gap means `get_trip_details` returns `{"error": "Trip not found"}` (not the generic error) for shared users — this is a separate functional bug where shared users cannot use the AI tool on their shared trips.

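The proposed remedy is to widen the ORM filter to `Collection.objects.filter(Q(user=user) | Q(shared_with=user)).distinct()`. The access rule that change encodes reduces to owner-or-member; a minimal sketch over plain dicts (field names here are illustrative stand-ins for the model's `user`/`shared_with` relations):

```python
def user_can_access_collection(collection, user_id):
    """Owner-or-shared-member rule that the Q(shared_with=user) fix encodes."""
    return (
        collection["owner_id"] == user_id          # owner path: filter(user=user)
        or user_id in collection.get("shared_with", [])  # member path: Q(shared_with=user)
    )
```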
---

### Authentication / CSRF in Chat Calls

**Verdict: Auth is working correctly for the SSE path. No auth failure in the reported errors.**

Evidence:

1. **Proxy path** (`frontend/src/routes/api/[...path]/+server.ts`):
   - `POST` to `send_message` goes through `handleRequest()` (line 16) with `requreTrailingSlash=true`.
   - On every proxied request: proxy deletes old `csrftoken` cookie, calls `fetchCSRFToken()` to get a fresh token from `GET /csrf/`, then sets `X-CSRFToken` header and reconstructs the `Cookie` header with `csrftoken=<new>; sessionid=<from-browser>` (lines 57–75).
   - SSE streaming: `content-type: text/event-stream` is detected (line 94) and the response body is streamed directly without buffering.
2. **Session**: `sessionid` cookie is extracted from browser cookies (line 66) and forwarded. `SESSION_COOKIE_SAMESITE=Lax` allows this.
3. **Rate-limit error is downstream of auth** — LiteLLM only fires if the Django view already authenticated the user and reached `stream_chat_completion`. A CSRF or session failure would return HTTP 403/401 before the SSE stream starts, and the frontend would hit the `if (!res.ok)` branch (line 273), not the SSE error path.

**One auth-adjacent gap**: `loadConversations()` (line 196) and `createConversation()` (line 203) do NOT include `credentials: 'include'` — but these go through the SvelteKit proxy which handles session injection server-side, so this is not a real failure point. The `send_message` fetch (line 258) also lacks explicit `credentials`, but again routes through the proxy.

**Potential auth issue — missing trailing slash for models endpoint (ruled out)**:

`loadModelsForProvider()` fetches `/api/chat/providers/${selectedProvider}/models/` (line 124) — this ends with `/`, which is correct for the proxy's `requreTrailingSlash` logic. The proxy only adds a trailing slash for non-GET requests (it's applied to POST/PATCH/PUT/DELETE but not GET), but since `models/` already ends with a slash in the URL, this is fine.

---

### Ranked Fixes by Impact

| Rank | Error | File | Line(s) | Fix |
|---|---|---|---|---|
| 1 (HIGH) | `get_trip_details` generic error | `backend/server/chat/agent_tools.py` | 316–325 | Add `\| Q(shared_with=user)` to collection filter so shared users can call the tool; also add specific catches for known exception types before the bare `except Exception` |
| 2 (HIGH) | `{"error":"location is required"}` residual | `backend/server/chat/views/__init__.py` | 152–164 | Ensure `collection_id` auth check also grants access for shared users (currently `shared_with.filter(id=request.user.id).exists()` IS present — ✅ already correct); verify `collection_id` is actually being sent from frontend on every `sendMessage` call |
| 2b (MEDIUM) | `search_places` called without location | `backend/server/chat/agent_tools.py` | 127–128 | Improve error message to be user-instructional: `"Please provide a city or location name to search near."` — already noted in prior plan; also add `location` as a `required` field in the JSON schema so LLM is more likely to provide it |
| 3 (MEDIUM) | `transportation_set`/`lodging_set` crash | `backend/server/chat/agent_tools.py` | 370–387 | Verify FK `related_name` values on Transportation/Lodging models; if wrong, correct the accessor names in `get_trip_details` |
| 4 (LOW) | Rate limiting | Provider config | N/A | No code fix — operational issue. Document that `opencode_zen` uses `https://opencode.ai/zen/v1` as `api_base` (already set in `CHAT_PROVIDER_CONFIG`) — ensure users aren't accidentally using a real OpenAI key with `opencode_zen` provider |

---

### Risks

1. **`get_trip_details` shared-user gap**: Shared users get `{"error": "Trip not found"}` — the LLM may then call `search_places` without the location context that `get_trip_details` would have provided, cascading into Error 2. Fix: add `| Q(shared_with=user)` to the collection filter at `agent_tools.py:317`.
2. **`transportation_set`/`lodging_set` reverse accessor names confirmed safe**: Django auto-generates `transportation_set` and `lodging_set` for the FKs (no `related_name` on `Transportation.collection` at `models.py:332` or `Lodging.collection` at `models.py:570`). These accessors work correctly. The generic error in `get_trip_details` must be from another exception path (e.g., malformed DB records, missing ContentType rows for deleted itinerary items, or the `prefetch_related` interaction on orphaned GFK references).
3. **`collection_id` not forwarded on all sends**: If `AITravelChat.svelte` is embedded without `collectionId` prop (e.g., standalone chat page), `collection_id` is `undefined` in the payload, the backend never fetches the collection, and no `Itinerary stops:` context is injected. The LLM then has no geocodable location data → calls `search_places` without `location`.
4. **`search_places` JSON schema marks `location` as required but `execute_tool` uses `filtered_kwargs`**: The tool schema (`agent_tools.py:103`) sets `"required": True` on `location`. However, `execute_tool` (line 619) passes only `filtered_kwargs` from the JSON-parsed `arguments` dict. If LLM sends `{}` (empty), `location=None` is the function default, not a schema-enforcement error. There is no server-side validation of required tool arguments — the required flag is only advisory to the LLM.

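Risk 4 suggests a server-side guard: validate required arguments against the tool's schema before dispatch, and short-circuit with a user-visible error instead of silently defaulting to `None`. A sketch, assuming the OpenAI-style schema shape with a `parameters.required` list:

```python
def validate_required_args(tool_schema, args):
    """Return an error payload if required tool arguments are missing, else None."""
    required = tool_schema.get("parameters", {}).get("required", [])
    # Treat absent, None, and empty-string values as missing
    missing = [name for name in required if args.get(name) in (None, "")]
    if missing:
        return {"error": f"Missing required argument(s): {', '.join(missing)}"}
    return None
```

`execute_tool` would call this before invoking the tool function and return the payload directly when it is non-None.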
**See [decisions.md](../decisions.md) for critic gate context.**

---

## Research: Provider Strategy (2026-03-09)

**Full findings**: [research/provider-strategy.md](../research/provider-strategy.md)

### Verdict: Keep LiteLLM, Harden It

Replacing LiteLLM is not warranted. Every Voyage issue is in the integration layer (no retries, no capability checks, hardcoded models), not in LiteLLM itself. OpenCode's Python-equivalent IS LiteLLM — OpenCode uses Vercel AI SDK with ~20 bundled `@ai-sdk/*` provider packages, which is the TypeScript analogue.

### Architecture Options

| Option | Effort | Risk | Recommended? |
|---|---|---|---|
| **A. Keep LiteLLM, harden** (retry, tool-guard, metadata) | Low (1-2 sessions) | Low | ✅ YES |
| B. Hybrid: direct SDK for some providers | High (1-2 weeks) | High | No |
| C. Replace LiteLLM entirely | Very High (3-4 weeks) | Very High | No |
| D. LiteLLM Proxy sidecar | Medium (2-3 days) | Medium | Not yet — future multi-user |

### Immediate Code Fixes (4 items)

| # | Fix | File | Line(s) | Impact |
|---|---|---|---|---|
| 1 | Add `num_retries=2, request_timeout=60` to `litellm.acompletion()` | `llm_client.py` | 418 | Retry on rate-limit/timeout — biggest gap |
| 2 | Add `litellm.supports_function_calling(model=)` guard before passing tools | `llm_client.py` | ~397 | Prevents tool-call errors on incapable models |
| 3 | Return model objects with `supports_tools` metadata instead of bare strings | `views/__init__.py` | `models()` action | Frontend can warn/adapt per model capability |
| 4 | Replace hardcoded `model="gpt-4o-mini"` with provider config default | `day_suggestions.py` | 194 | Respects user's configured provider |

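Fixes 1–2 reduce to retry-on-transient-error plus a capability gate. In the real integration these would be `litellm.acompletion(..., num_retries=2, timeout=60)` and a `litellm.supports_function_calling(model=...)` check; a provider-agnostic sketch of the retry half:

```python
import time

def call_with_retries(fn, num_retries=2, retryable=(TimeoutError,), backoff_s=0.5):
    """Retry fn on transient errors; re-raise after the final attempt."""
    for attempt in range(num_retries + 1):
        try:
            return fn()
        except retryable:
            if attempt == num_retries:
                raise
            time.sleep(backoff_s * (attempt + 1))  # linear backoff between attempts
```

LiteLLM implements the same policy internally when `num_retries` is set, so Voyage would not need this wrapper — it only illustrates the behavior being enabled.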
### Long-Term Recommendations

1. **Curated model registry** (YAML/JSON file like OpenCode's `models.dev`) with capabilities, costs, context limits — loaded at startup
2. **LiteLLM Proxy sidecar** — only if/when Voyage gains multi-user production deployment
3. **WSGI→ASGI migration** — long-term fix for event loop fragility (out of scope)

### Key Patterns Observed in Other Projects

- **No production project does universal runtime model discovery** — all use curated/admin-managed lists
- **Every production LiteLLM user has retry logic** — Voyage is the outlier with zero retries
- **Tool-call capability guards** are standard (`litellm.supports_function_calling()` used by PraisonAI, open-interpreter, mem0, ragbits, dspy)
- **Rate-limit resilience** ranges from simple `num_retries` to full `litellm.Router` with `RetryPolicy` and cross-model fallbacks

0
.memory/research/.gitkeep
Normal file
130
.memory/research/auto-learn-preference-signals.md
Normal file
@@ -0,0 +1,130 @@
# Research: Auto-Learn User Preference Signals

## Purpose

Map all existing user data that could be aggregated into an automatic preference profile, without requiring manual input.

## Signal Inventory

### 1. Location.category (FK → Category)

- **Model**: `adventures/models.py:Category` — per-user custom categories (name, display_name, icon)
- **Signal**: Top categories by count → dominant interest type (e.g. "hiking", "dining", "cultural")
- **Query**: `Location.objects.filter(user=user).values('category__name').annotate(cnt=Count('id')).order_by('-cnt')`
- **Strength**: HIGH — user-created categories are deliberate choices

### 2. Location.tags (ArrayField)

- **Model**: `adventures/models.py:Location.tags` — `ArrayField(CharField(max_length=100))`
- **Signal**: Most frequent tags across all user locations → interest keywords
- **Query**: `Location.objects.filter(user=user).values_list('tags', flat=True).distinct()` (used in `tags_view.py`)
- **Strength**: MEDIUM-HIGH — tags are free-text user input

### 3. Location.rating (FloatField)

- **Model**: `adventures/models.py:Location.rating`
- **Signal**: Average rating + high-rated locations → positive sentiment for place types; filtering for visited + high-rated → strong preferences
- **Query**: `Location.objects.filter(user=user).aggregate(avg_rating=Avg('rating'))` or breakdown by category
- **Strength**: HIGH for positive signals (≥4.0); weak if rarely filled in

### 4. Location.description / Visit.notes (TextField)

- **Model**: `adventures/models.py:Location.description`, `Visit.notes`
- **Signal**: Free-text content for NLP keyword extraction (budget, adventure, luxury, cuisine words)
- **Query**: `Location.objects.filter(user=user).values_list('description', flat=True)`
- **Strength**: LOW (requires NLP to extract structured signals; many fields blank)

### 5. Lodging.type (LODGING_TYPES enum)

- **Model**: `adventures/models.py:Lodging.type` — choices: hotel, hostel, resort, bnb, campground, cabin, apartment, house, villa, motel
- **Signal**: Most frequently used lodging type → travel style indicator (e.g. "hostel" → budget; "resort/villa" → luxury; "campground/cabin" → outdoor)
- **Query**: `Lodging.objects.filter(user=user).values('type').annotate(cnt=Count('id')).order_by('-cnt')`
- **Strength**: HIGH — directly maps to trip_style field

### 6. Lodging.rating (FloatField)

- **Signal**: Combined with lodging type, identifies preferred accommodation standards
- **Strength**: MEDIUM

### 7. Transportation.type (TRANSPORTATION_TYPES enum)

- **Model**: `adventures/models.py:Transportation.type` — choices: car, plane, train, bus, boat, bike, walking
- **Signal**: Primary transport mode → mobility preference (e.g. mostly walking/bike → slow travel; lots of planes → frequent flyer)
- **Query**: `Transportation.objects.filter(user=user).values('type').annotate(cnt=Count('id')).order_by('-cnt')`
- **Strength**: MEDIUM

### 8. Activity.sport_type (SPORT_TYPE_CHOICES)

- **Model**: `adventures/models.py:Activity.sport_type` — 60+ choices mapped to 10 SPORT_CATEGORIES in `utils/sports_types.py`
- **Signal**: Activity categories user is active in → physical/adventure interests
- **Categories**: running, walking_hiking, cycling, water_sports, winter_sports, fitness_gym, racket_sports, climbing_adventure, team_sports
- **Query**: Already aggregated in `stats_view.py:_get_activity_stats_by_category()` — uses `Activity.objects.filter(user=user).values('sport_type').annotate(count=Count('id'))`
- **Strength**: HIGH — objective behavioral data from Strava/Wanderer imports

### 9. VisitedRegion / VisitedCity (worldtravel)

- **Model**: `worldtravel/models.py` — `VisitedRegion(user, region)` and `VisitedCity(user, city)` with country/subregion
- **Signal**: Countries/regions visited → geographic preferences (beach vs. mountain vs. city; EU vs. Asia etc.)
- **Query**: `VisitedRegion.objects.filter(user=user).select_related('region__country')` → country distribution
- **Strength**: MEDIUM-HIGH — "where has this user historically traveled?" informs destination type

### 10. Collection metadata

- **Model**: `adventures/models.py:Collection` — name, description, start/end dates
- **Signal**: Collection names/descriptions may contain destination/theme hints; trip duration (end_date − start_date) → travel pace; trip frequency (count, spacing) → travel cadence
- **Query**: `Collection.objects.filter(user=user).values('name', 'description', 'start_date', 'end_date')`
- **Strength**: LOW-MEDIUM (descriptions often blank; names are free-text)

### 11. Location.price / Lodging.price (MoneyField)

- **Signal**: Average spend across locations/lodging → budget tier
- **Query**: `Location.objects.filter(user=user).aggregate(avg_price=Avg('price'))` (requires djmoney amount field)
- **Strength**: MEDIUM — but many records may have no price set

### 12. Location geographic clustering (lat/lon)

- **Signal**: Country/region distribution of visited locations → geographic affinity
- **Already tracked**: `Location.country`, `Location.region`, `Location.city` (FK, auto-geocoded)
- **Query**: `Location.objects.filter(user=user).values('country__name').annotate(cnt=Count('id')).order_by('-cnt')`
- **Strength**: HIGH

### 13. UserAchievement types

- **Model**: `achievements/models.py:UserAchievement` — types: `adventure_count`, `country_count`
- **Signal**: Milestone count → engagement level (casual vs. power user); high `country_count` → variety-seeker
- **Strength**: LOW-MEDIUM (only 2 types currently)

### 14. ChatMessage content (user role)

- **Model**: `chat/models.py:ChatMessage` — `role`, `content`
- **Signal**: User messages in travel conversations → intent signals ("I love hiking", "looking for cheap food", "family-friendly")
- **Query**: `ChatMessage.objects.filter(conversation__user=user, role='user').values_list('content', flat=True)`
- **Strength**: MEDIUM — requires NLP; could be rich but noisy

## Aggregation Patterns Already in Codebase

| Pattern | Location | Reusability |
|---|---|---|
| Activity stats by category | `stats_view.py:_get_activity_stats_by_category()` | Direct reuse |
| All-tags union | `tags_view.py:ActivityTypesView.types()` | Direct reuse |
| VisitedRegion/City counts | `stats_view.py:counts()` | Direct reuse |
| Multi-user preference merge | `llm_client.py:get_aggregated_preferences()` | Partial reuse |
| Category-filtered location count | `serializers.py:location_count` | Pattern reference |
| Location queryset scoping | `location_view.py:get_queryset()` | Standard pattern |

## Proposed Auto-Profile Fields from Signals

| Target Field | Primary Signals | Secondary Signals |
|---|---|---|
| `cuisines` | Location.tags (cuisine words), Location.category (dining) | Location.description NLP |
| `interests` | Activity.sport_type categories, Location.category top-N | Location.tags frequency, VisitedRegion types |
| `trip_style` | Lodging.type top (luxury/budget/outdoor), Transportation.type, Activity sport categories | Location.rating Avg, price signals |
| `notes` | (not auto-derived — keep manual only) | — |

## Where to Implement

**New function target**: `integrations/views/recommendation_profile_view.py` or a new `integrations/utils/auto_profile.py`

**Suggested function signature**:

```python
def build_auto_preference_profile(user) -> dict:
    """
    Returns {cuisines, interests, trip_style} inferred from user's travel history.
    Fields are non-destructive suggestions, not overrides of manual input.
    """
```

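The aggregation behind that signature could combine the high-strength signals (top categories for `interests`, dominant lodging type for `trip_style`) roughly as follows. A sketch over pre-aggregated counts; the style mapping table and the `"balanced"` default are assumptions, not project code:

```python
# Assumed mapping from dominant lodging type to a trip_style label (signal 5)
LODGING_STYLE = {
    "hostel": "budget", "motel": "budget",
    "resort": "luxury", "villa": "luxury",
    "campground": "outdoor", "cabin": "outdoor",
}

def build_auto_profile_from_counts(category_counts, lodging_type_counts):
    """Infer {interests, trip_style} from per-user aggregate counts (signals 1 and 5)."""
    # Top-5 categories by usage count -> interests
    interests = [
        name for name, _ in
        sorted(category_counts.items(), key=lambda kv: -kv[1])[:5]
    ]
    trip_style = "balanced"
    if lodging_type_counts:
        top = max(lodging_type_counts, key=lodging_type_counts.get)
        trip_style = LODGING_STYLE.get(top, "balanced")
    return {"interests": interests, "trip_style": trip_style}
```

The real `build_auto_preference_profile(user)` would first run the ORM queries listed in the signal inventory to produce these count dicts.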
**New API endpoint target**: `POST /api/integrations/recommendation-preferences/auto-learn/`

**ViewSet action**: `@action(detail=False, methods=['post'], url_path='auto-learn')` on `UserRecommendationPreferenceProfileViewSet`

## Integration Point

`get_system_prompt()` in `chat/llm_client.py` already consumes `UserRecommendationPreferenceProfile` — auto-learned values flow directly into AI context with zero additional changes needed there.

See: [knowledge.md — User Recommendation Preference Profile](../knowledge.md#user-recommendation-preference-profile)
See: [plans/ai-travel-agent-redesign.md — WS2](../plans/ai-travel-agent-redesign.md#ws2-user-preference-learning)

35
.memory/research/litellm-zen-provider-catalog.md
Normal file
@@ -0,0 +1,35 @@
# Research: LiteLLM provider catalog and OpenCode Zen support

Date: 2026-03-08
Related plan: [AI travel agent in Collections Recommendations](../plans/ai-travel-agent-collections-integration.md)

## LiteLLM provider enumeration

- Runtime provider list is available via `litellm.provider_list` and currently returns 128 provider IDs in this environment.
- The enum source `LlmProviders` can be used for canonical provider identifiers.

## OpenCode Zen compatibility

- OpenCode Zen is **not** a native LiteLLM provider alias.
- Zen can be supported via LiteLLM's OpenAI-compatible routing using:
  - provider id in app: `opencode_zen`
  - model namespace: `openai/<zen-model>`
  - `api_base`: `https://opencode.ai/zen/v1`
- No new SDK dependency required.

## Recommended backend contract

- Add backend source-of-truth endpoint: `GET /api/chat/providers/`.
- Response fields:
  - `id`
  - `label`
  - `available_for_chat`
  - `needs_api_key`
  - `default_model`
  - `api_base`
- Return all LiteLLM runtime providers; mark non-mapped providers `available_for_chat=false` for display-only compliance.

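The contract above implies roughly this response assembly, merging LiteLLM's runtime list with custom entries such as `opencode_zen`. A sketch; the config dict shape is an assumption modeled on `CHAT_PROVIDER_CONFIG`, and the model name in the usage example is illustrative:

```python
def build_provider_catalog(runtime_providers, chat_config):
    """Merge the runtime provider list with the app's chat config.

    Providers without a chat mapping are returned display-only
    (available_for_chat=False), per the contract above.
    """
    # Runtime providers first, then custom config-only entries (e.g. opencode_zen)
    all_ids = list(runtime_providers) + [p for p in chat_config if p not in runtime_providers]
    catalog = []
    for pid in all_ids:
        cfg = chat_config.get(pid, {})
        catalog.append({
            "id": pid,
            "label": cfg.get("label", pid),
            "available_for_chat": pid in chat_config,
            "needs_api_key": cfg.get("needs_api_key", True),
            "default_model": cfg.get("default_model"),
            "api_base": cfg.get("api_base"),
        })
    return catalog
```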
## Data/storage compatibility notes

- Existing `UserAPIKey(provider)` model supports adding `opencode_zen` without migration.
- Consistent provider ID usage across serializer validation, key lookup, and chat request payload is required.

## Risks

- Zen model names may evolve; keep default model configurable in backend mapping.
- Full provider list is large; UI should communicate unavailable-for-chat providers clearly.

303
.memory/research/opencode-zen-connection-debug.md
Normal file
@@ -0,0 +1,303 @@
# OpenCode Zen Connection Debug — Research Findings

**Date**: 2026-03-08
**Researchers**: researcher agent (root cause), explorer agent (code path trace)
**Status**: Complete — root causes identified, fix proposed

## Summary

The OpenCode Zen provider configuration in `backend/server/chat/llm_client.py` has **two critical mismatches** that cause connection/API errors:

1. **Invalid model ID**: `gpt-4o-mini` does not exist on OpenCode Zen
2. **Wrong endpoint for GPT models**: GPT models on Zen use `/responses` endpoint, not `/chat/completions`

An additional structural risk is that the backend runs under **Gunicorn WSGI** (not ASGI/uvicorn), but `stream_chat_completion` is an `async def` generator that is driven via `_async_to_sync_generator` which creates a new event loop per call. This works but causes every tool iteration to open/close an event loop, which is inefficient and fragile under load.

## End-to-End Request Path

### 1. Frontend: `AITravelChat.svelte` → `sendMessage()`

- **File**: `frontend/src/lib/components/AITravelChat.svelte`, line 97
- POST body: `{ message: <text>, provider: selectedProvider }` (e.g. `"opencode_zen"`)
- Sends to: `POST /api/chat/conversations/<id>/send_message/`
- On `fetch` network failure: shows `$t('chat.connection_error')` = `"Connection error. Please try again."` (line 191)
- On HTTP error: tries `res.json()` → uses `err.error || $t('chat.connection_error')` (line 126)
- On SSE `parsed.error`: shows `parsed.error` inline in the chat (line 158)
- **Any exception from `litellm` is therefore masked as `"An error occurred while processing your request."` or `"Connection error. Please try again."`**

### 2. Proxy: `frontend/src/routes/api/[...path]/+server.ts` → `handleRequest()`

- Strips and re-generates CSRF token (line 57-60)
- POSTs to `http://server:8000/api/chat/conversations/<id>/send_message/`
- Detects `content-type: text/event-stream` and streams body directly through (lines 94-98) — **no buffering**
- On any fetch error: returns `{ error: 'Internal Server Error' }` (line 109)

### 3. Backend: `chat/views.py` → `ChatViewSet.send_message()`

- Validates provider via `is_chat_provider_available()` (line 114) — passes for `opencode_zen`
- Saves user message to DB (line 120)
- Builds LLM messages list (line 131)
- Wraps `async event_stream()` in `_async_to_sync_generator()` (line 269)
- Returns `StreamingHttpResponse` with `text/event-stream` content type (line 268)

### 4. Backend: `chat/llm_client.py` → `stream_chat_completion()`

- Normalizes provider (line 208)
- Looks up `CHAT_PROVIDER_CONFIG["opencode_zen"]` (line 209)
- Fetches API key from `UserAPIKey.objects.get(user=user, provider="opencode_zen")` (line 154)
- Decrypts it via Fernet using `FIELD_ENCRYPTION_KEY` (line 102)
- Calls `litellm.acompletion(model="openai/gpt-4o-mini", api_key=<key>, api_base="https://opencode.ai/zen/v1", stream=True, tools=AGENT_TOOLS, tool_choice="auto")` (line 237)
- On **any exception**: logs and yields `data: {"error": "An error occurred..."}` (lines 274-276)

## Root Cause Analysis

### #1 CRITICAL: Invalid default model `gpt-4o-mini`

- **Location**: `backend/server/chat/llm_client.py:62`
- `CHAT_PROVIDER_CONFIG["opencode_zen"]["default_model"] = "openai/gpt-4o-mini"`
- `gpt-4o-mini` is an OpenAI-hosted model. The OpenCode Zen gateway at `https://opencode.ai/zen/v1` does not offer `gpt-4o-mini`.
- LiteLLM sends: `POST https://opencode.ai/zen/v1/chat/completions` with `model: gpt-4o-mini`
- Zen API returns HTTP 4xx (model not found or not available)
- Exception is caught generically at line 274 → yields masked error SSE → frontend shows generic message

### #2 SIGNIFICANT: Generic exception handler masks real errors

- **Location**: `backend/server/chat/llm_client.py:274-276`
- Bare `except Exception:` with logger.exception and a generic user message
- LiteLLM exceptions carry structured information: `litellm.exceptions.NotFoundError`, `AuthenticationError`, `BadRequestError`, etc.
- All of these show up to the user as `"An error occurred while processing your request. Please try again."`
- Prevents diagnosis without checking Docker logs

### #3 SIGNIFICANT: WSGI + async event loop per request
|
||||
- **Location**: `backend/server/chat/views.py:66-76` (`_async_to_sync_generator`)
|
||||
- Backend runs **Gunicorn WSGI** (from `supervisord.conf:11`); there is **no ASGI entry point** (`asgi.py` doesn't exist)
|
||||
- `stream_chat_completion` is `async def` using `litellm.acompletion` (awaited)
|
||||
- `_async_to_sync_generator` creates a fresh event loop via `asyncio.new_event_loop()` for each request
|
||||
- For multi-tool-iteration responses this loop drives multiple sequential `await` calls
|
||||
- This works but is fragile: if `litellm.acompletion` internally uses a singleton HTTP client that belongs to a different event loop, it will raise `RuntimeError: This event loop is already running` or connection errors on subsequent calls
|
||||
- **httpx/aiohttp sessions in LiteLLM may not be compatible with per-call new event loops**
|
||||
|
||||
### #4 MINOR: `tool_choice: "auto"` sent unconditionally with tools

- **Location**: `backend/server/chat/llm_client.py:229`
- `"tool_choice": "auto" if tools else None` — None values in kwargs are passed to litellm
- Some OpenAI-compat endpoints (including potentially Zen models) reject `tool_choice: null` or unsupported parameters
- Fix: remove the key entirely instead of setting it to None

### #5 MINOR: API key lookup is synchronous in async context

- **Location**: `backend/server/chat/llm_client.py:217` and `views.py:144`
- `get_llm_api_key` calls `UserAPIKey.objects.get(...)` synchronously
- Called from within `async for chunk in stream_chat_completion(...)` in the async `event_stream()` generator
- Django ORM operations must use `sync_to_async` in async contexts; direct sync ORM calls can cause `SynchronousOnlyOperation` errors or deadlocks under ASGI
- Under the WSGI + new-event-loop approach this is less likely to fail, but it is technically incorrect
## Recommended Fix (Ranked by Impact)

### Fix #1 (Primary): Correct the default model

```python
# backend/server/chat/llm_client.py:59-64
"opencode_zen": {
    "label": "OpenCode Zen",
    "needs_api_key": True,
    "default_model": "openai/gpt-5-nano",  # Free; confirmed to work via /chat/completions
    "api_base": "https://opencode.ai/zen/v1",
},
```
Confirmed working models (OpenAI-compatible, served via `/chat/completions`):

- `openai/gpt-5-nano` (free)
- `openai/kimi-k2.5` (confirmed by GitHub usage)
- `openai/glm-5` (GLM family)
- `openai/big-pickle` (free)

GPT-family models route through the `/responses` endpoint on Zen, which LiteLLM's openai-compat mode does NOT use; only the "OpenAI-compatible" models above reliably work with LiteLLM's `openai/` prefix + `/chat/completions`.
### Fix #2 (Secondary): Structured error surfacing

```python
# backend/server/chat/llm_client.py:274-276
except Exception as exc:
    logger.exception("LLM streaming error")
    # Extract structured detail if available
    status_code = getattr(exc, "status_code", None)
    detail = getattr(exc, "message", None) or str(exc)
    if status_code:
        user_msg = f"Provider error ({status_code}): {detail}"
    else:
        user_msg = "An error occurred while processing your request. Please try again."
    yield f"data: {json.dumps({'error': user_msg})}\n\n"
```
### Fix #3 (Minor): Remove None from `tool_choice` kwarg

```python
# backend/server/chat/llm_client.py:225-234
completion_kwargs = {
    "model": provider_config["default_model"],
    "messages": messages,
    "stream": True,
    "api_key": api_key,
}
if tools:
    completion_kwargs["tools"] = tools
    completion_kwargs["tool_choice"] = "auto"
if provider_config["api_base"]:
    completion_kwargs["api_base"] = provider_config["api_base"]
```
## Error Flow Diagram

```
User sends message (opencode_zen)
  → AITravelChat.svelte:sendMessage()
  → POST /api/chat/conversations/<id>/send_message/
  → +server.ts:handleRequest() [proxy, no mutation]
  → POST http://server:8000/api/chat/conversations/<id>/send_message/
  → views.py:ChatViewSet.send_message()
  → llm_client.py:stream_chat_completion()
  → litellm.acompletion(model="openai/gpt-4o-mini",      ← FAILS HERE
                        api_base="https://opencode.ai/zen/v1")
  → except Exception → yield data:{"error":"An error occurred..."}
  ← SSE: data:{"error":"An error occurred..."}
  ← StreamingHttpResponse(text/event-stream)
  ← streamed through the proxy unchanged
  ← reader.read() → parsed.error set
  ← assistantMsg.content = "An error occurred..."        ← shown to user
```

If the network/DNS fails entirely (e.g. `https://opencode.ai` unreachable):

```
→ litellm.acompletion raises immediately
→ except Exception → yield data:{"error":"An error occurred..."}
— OR —
→ +server.ts fetch fails → json({error:"Internal Server Error"}, 500)
→ AITravelChat.svelte: res.ok is false → res.json() → err.error || $t('chat.connection_error')
→ shows "Connection error. Please try again."
```
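The frontend leg of this flow (`reader.read()` then `parsed.error`) boils down to splitting the SSE stream into `data:` events and JSON-parsing each payload. A hypothetical sketch; the function name and shape are assumptions, not Voyage's actual code:

```ts
// Parse one SSE chunk into its JSON payloads; events are separated by
// blank lines and each data line carries a body like {"error": "..."}.
function parseSseChunk(chunk: string): { error?: string; content?: string }[] {
    return chunk
        .split('\n\n')
        .map((event) => event.trim())
        .filter((event) => event.startsWith('data: '))
        .map((event) => JSON.parse(event.slice('data: '.length)));
}
```

Any event whose parsed object has an `error` field would be rendered inline instead of appended as assistant content.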
## File References

| File | Line(s) | Relevance |
|---|---|---|
| `backend/server/chat/llm_client.py` | 59-64 | `CHAT_PROVIDER_CONFIG["opencode_zen"]` — primary fix |
| `backend/server/chat/llm_client.py` | 150-157 | `get_llm_api_key()` — DB lookup for stored key |
| `backend/server/chat/llm_client.py` | 203-276 | `stream_chat_completion()` — full LiteLLM call + error handler |
| `backend/server/chat/llm_client.py` | 225-234 | `completion_kwargs` construction |
| `backend/server/chat/llm_client.py` | 274-276 | Generic `except Exception` (swallows all errors) |
| `backend/server/chat/views.py` | 103-274 | `send_message()` — SSE pipeline orchestration |
| `backend/server/chat/views.py` | 66-76 | `_async_to_sync_generator()` — WSGI/async bridge |
| `backend/server/integrations/models.py` | 78-112 | `UserAPIKey` — encrypted key storage |
| `frontend/src/lib/components/AITravelChat.svelte` | 97-195 | `sendMessage()` — SSE consumer + error display |
| `frontend/src/lib/components/AITravelChat.svelte` | 124-129 | HTTP error → `$t('chat.connection_error')` |
| `frontend/src/lib/components/AITravelChat.svelte` | 157-160 | SSE `parsed.error` → inline display |
| `frontend/src/lib/components/AITravelChat.svelte` | 190-192 | Outer catch → `$t('chat.connection_error')` |
| `frontend/src/routes/api/[...path]/+server.ts` | 34-110 | `handleRequest()` — proxy |
| `frontend/src/routes/api/[...path]/+server.ts` | 94-98 | SSE passthrough (no mutation) |
| `frontend/src/locales/en.json` | 46 | `chat.connection_error` = "Connection error. Please try again." |
| `backend/supervisord.conf` | 11 | Gunicorn WSGI startup (no ASGI) |

---
## Model Selection Implementation Map

**Date**: 2026-03-08

### Frontend Provider/Model Selection State (Current)

In `AITravelChat.svelte`:

- `selectedProvider` (line 29): `let selectedProvider = 'openai'` — bare string, no model tracking
- `providerCatalog` (line 30): `ChatProviderCatalogEntry[]` — already contains `default_model: string | null` per entry
- `chatProviders` (line 31): reactive filtered view of `providerCatalog` (available only)
- `loadProviderCatalog()` (line 37): populates the catalog from `GET /api/chat/providers/`
- `sendMessage()` (line 97): POST body at line 121 is `{ message: msgText, provider: selectedProvider }` — **no model field**
- Provider `<select>` (lines 290–298): in the top toolbar of the chat panel
### Request Payload Build Point

`AITravelChat.svelte`, lines 118–122:

```ts
const res = await fetch(`/api/chat/conversations/${conversation.id}/send_message/`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message: msgText, provider: selectedProvider }) // ← ADD model here
});
```
### Backend Request Intake Point

`chat/views.py`, `send_message()` (line 104):

- Line 113: `provider = (request.data.get("provider") or "openai").strip().lower()`
- Line 144: `stream_chat_completion(request.user, current_messages, provider, tools=AGENT_TOOLS)`
- **No model extraction**; the model comes only from `CHAT_PROVIDER_CONFIG[provider]["default_model"]`

### Backend Model Usage Point

`chat/llm_client.py`, `stream_chat_completion()` (line 203):

- Lines 225–226: `completion_kwargs = { "model": provider_config["default_model"], ... }`
- This is the **sole place the model is resolved** — no override capability exists yet
### Persistence Options Analysis

| Option | Files changed | Migration? | Risk |
|---|---|---|---|
| **`localStorage` (recommended)** | `AITravelChat.svelte` only for persistence | No | Lowest: no backend, no schema |
| `CustomUser` field (`chat_model_prefs` JSONField) | `users/models.py`, `users/serializers.py`, `users/views.py`, migration | **Yes** | Medium: schema change, serializer exposure |
| `UserAPIKey`-style new model prefs table | new `chat/models.py` + serializer + view + urls + migration | **Yes** | High: new endpoint, multi-file |
| `UserRecommendationPreferenceProfile` JSONField addition | `integrations/models.py`, serializer, migration | **Yes** | Medium: migration on integrations app |

**Selected**: `localStorage` — key `voyage_chat_model_prefs`, value `Record<provider_id, model_string>`.
### File-by-File Edit Plan

#### 1. `backend/server/chat/llm_client.py`

| Symbol | Change |
|---|---|
| `stream_chat_completion(user, messages, provider, tools=None)` | Add `model: str \| None = None` parameter |
| `completion_kwargs["model"]` (line 226) | Change to `model or provider_config["default_model"]` |
| (new) validation | If `model` provided: assert it starts with the expected LiteLLM prefix or raise an SSE error |

#### 2. `backend/server/chat/views.py`

| Symbol | Change |
|---|---|
| `send_message()` (line 104) | Extract `model = (request.data.get("model") or "").strip() or None` |
| `stream_chat_completion(...)` call (line 144) | Pass `model=model` |
| (optional validation) | Return 400 if the model prefix doesn't match the provider |

#### 3. `frontend/src/lib/components/AITravelChat.svelte`

| Symbol | Change |
|---|---|
| (new) `let selectedModel: string` | Initialize from `loadModelPref(selectedProvider)` or `default_model` |
| (new) `$: selectedProviderEntry` | Reactive lookup of the current provider's catalog entry |
| (new) `$: selectedModel` reset | Reset on provider change; persist with `saveModelPref` |
| `sendMessage()` body (line 121) | Add `model: selectedModel || undefined` to the JSON body |
| (new) model `<input>` in toolbar | Placed after the provider `<select>`, `bind:value={selectedModel}`, placeholder = `default_model` |
| (new) `loadModelPref(provider)` | Read from `localStorage.getItem('voyage_chat_model_prefs')` |
| (new) `saveModelPref(provider, model)` | Write via `localStorage.setItem('voyage_chat_model_prefs', ...)` |
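The two persistence helpers in the plan above could look roughly like this. The storage key comes from the source; the bodies are assumptions, and the Map-backed `storage` stand-in only exists so the sketch is self-contained (the component would use the browser's `localStorage` directly):

```ts
// Stand-in for the browser's localStorage, so the sketch runs anywhere.
const _store = new Map<string, string>();
const storage = {
    getItem: (k: string): string | null => _store.get(k) ?? null,
    setItem: (k: string, v: string): void => { _store.set(k, v); },
};

const PREFS_KEY = 'voyage_chat_model_prefs';

// Read the saved model override for a provider, if any.
function loadModelPref(provider: string): string | null {
    try {
        const prefs = JSON.parse(storage.getItem(PREFS_KEY) ?? '{}');
        return typeof prefs[provider] === 'string' ? prefs[provider] : null;
    } catch {
        return null; // corrupted JSON falls back to the provider default
    }
}

// Persist (or clear) the model override for a provider.
function saveModelPref(provider: string, model: string | null): void {
    let prefs: Record<string, string> = {};
    try {
        prefs = JSON.parse(storage.getItem(PREFS_KEY) ?? '{}');
    } catch { /* start fresh */ }
    if (model) prefs[provider] = model;
    else delete prefs[provider];
    storage.setItem(PREFS_KEY, JSON.stringify(prefs));
}
```

Storing a single JSON record per key (rather than one key per provider) keeps the preference map atomic and easy to clear.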
#### 4. `frontend/src/locales/en.json`

| Key | Value |
|---|---|
| `chat.model_label` | `"Model"` |
| `chat.model_placeholder` | `"Default model"` |
### Provider-Model Compatibility Validation

The critical constraint is **LiteLLM model-string routing**. LiteLLM uses the `provider/model-name` prefix to determine which SDK client to use:

- `openai/gpt-5-nano` → OpenAI client (with custom `api_base` for Zen)
- `anthropic/claude-sonnet-4-20250514` → Anthropic client
- `groq/llama-3.3-70b-versatile` → Groq client

If a user types `anthropic/claude-opus` for the `openai` provider, LiteLLM uses the Anthropic SDK with OpenAI credentials → guaranteed failure.
**Recommended backend guard** in `send_message()`:

```python
if model:
    expected_prefix = provider_config["default_model"].split("/")[0]
    if not model.startswith(expected_prefix + "/"):
        return Response(
            {"error": f"Model must use '{expected_prefix}/' prefix for provider '{provider}'."},
            status=status.HTTP_400_BAD_REQUEST,
        )
```

Exception: `opencode_zen` and `openrouter` accept any prefix (they're routing gateways). The guard should skip the prefix check when `api_base` is set (custom gateway).
### Migration Requirement

**NO migration required** for the recommended localStorage approach.

---

## Cross-references

- See [Plan: OpenCode Zen connection error](../plans/opencode-zen-connection-error.md)
- See [Research: LiteLLM provider catalog](litellm-zen-provider-catalog.md)
- See [Knowledge: AI Chat](../knowledge.md#ai-chat-collections--recommendations)
198 .memory/research/provider-strategy.md Normal file
@@ -0,0 +1,198 @@
# Research: Multi-Provider Strategy for Voyage AI Chat

**Date**: 2026-03-09
**Researcher**: researcher agent
**Status**: Complete

## Summary

Investigated how OpenCode, OpenClaw-like projects, and LiteLLM-based production systems handle multi-provider model discovery, auth, rate-limit resilience, and tool-calling compatibility. Assessed whether replacing LiteLLM is warranted for Voyage.

**Bottom line**: Keep LiteLLM, harden it. Replacing LiteLLM would be a multi-week migration with negligible user-facing benefit. LiteLLM already solves the hard problems (100+ provider SDKs, streaming, tool-call translation). Voyage's issues are in the **integration layer**, not in LiteLLM itself.

---
## 1. Pattern Analysis: How Projects Handle Multi-Provider

### 1a. Dynamic Model Discovery

| Project | Approach | Notes |
|---|---|---|
| **OpenCode** | Static registry from `models.dev` (JSON database), merged with user config, filtered by env/auth presence | No runtime API calls to providers for discovery; curated model metadata (capabilities, cost, limits) baked in |
| **Ragflow** | Hardcoded `SupportedLiteLLMProvider` enum + per-provider model lists | Similar to Voyage's current approach |
| **daily_stock_analysis** | `litellm.Router` model_list config + `fallback_models` list from config file | Runtime fallback, not runtime discovery |
| **Onyx** | `LLMProvider` DB model + admin UI for model configuration | DB-backed, admin-managed |
| **LiteLLM Proxy** | YAML config `model_list` with deployment-level params | Static config, hot-reloadable |
| **Voyage (current)** | `CHAT_PROVIDER_CONFIG` dict + hardcoded `models()` per provider + OpenAI API `client.models.list()` for OpenAI only | Mixed: one provider does live discovery, the rest are hardcoded |

**Key insight**: No production project does universal runtime model discovery across all providers. OpenCode — the most sophisticated — uses a curated static database (`models.dev`) with provider/model metadata including capability flags (`toolcall`, `reasoning`, `streaming`). This is the right pattern for Voyage.
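The curated-registry pattern is easy to picture: a static mapping with capability flags, filtered before agentic use. A hypothetical miniature for illustration (the entries are invented, not real `models.dev` data):

```python
# Hypothetical registry entries in the models.dev style: capability
# flags decide which models may be offered for tool-driven chat.
MODEL_REGISTRY = {
    "openai/gpt-5-nano": {"toolcall": True, "reasoning": False, "streaming": True},
    "example/reasoning-only": {"toolcall": False, "reasoning": True, "streaming": True},
}


def agentic_models(registry: dict) -> list[str]:
    # Keep only models whose metadata says they can drive tool calls.
    return [model_id for model_id, caps in registry.items() if caps.get("toolcall")]
```

The same filter generalizes to cost caps or context-window minimums by adding fields to each entry.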
### 1b. Provider Auth Handling

| Project | Approach |
|---|---|
| **OpenCode** | Multi-source: env vars → `Auth.get()` (stored credentials) → config file → plugin loaders; per-provider custom auth (AWS chains, Google ADC, OAuth) |
| **LiteLLM Router** | `api_key` per deployment in model_list; env var fallback |
| **Cognee** | Rate limiter context manager wrapping LiteLLM calls |
| **Voyage (current)** | Per-user encrypted `UserAPIKey` DB model + instance-level `VOYAGE_AI_API_KEY` env fallback; key fetched per request |

**Voyage's approach is sound.** Per-user DB-stored keys with an instance fallback matches the self-hosted deployment model. No change needed.
### 1c. Rate-Limit Fallback / Retry

| Project | Approach |
|---|---|
| **LiteLLM Router** | Built-in: `num_retries`, `fallbacks` (cross-model), `allowed_fails` + `cooldown_time`, `RetryPolicy` (per-exception-type retry counts), `AllowedFailsPolicy` |
| **daily_stock_analysis** | `litellm.Router` with `fallback_models` list + multi-key support (rotate API keys on rate limit) |
| **Cognee** | `tenacity` retry decorator with `wait_exponential_jitter` + LiteLLM rate limiter |
| **Suna** | LiteLLM exception mapping → structured error processor |
| **Voyage (current)** | Zero retries. Single attempt. `_safe_error_payload()` maps exceptions to user messages but does not retry. |

**This is Voyage's biggest gap.** Every other production system has retry logic. LiteLLM has this built in — Voyage just isn't using it.
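The shared retry pattern is simple enough to sketch with the stdlib. This mirrors the spirit of tenacity's exponential backoff with jitter (as used by Cognee) and is illustrative only, not a drop-in for LiteLLM's built-in `num_retries`:

```python
import random
import time


def call_with_retries(call, max_retries: int = 2, base_delay: float = 0.5):
    # Retry only retryable failures; back off exponentially with jitter
    # so concurrent clients don't hammer a rate-limited provider in sync.
    for attempt in range(max_retries + 1):
        try:
            return call()
        except (TimeoutError, ConnectionError):
            if attempt == max_retries:
                raise
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

Auth and bad-request errors are deliberately not retried; repeating them wastes quota and delays the user-visible error.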
### 1d. Tool-Calling Compatibility

| Project | Approach |
|---|---|
| **OpenCode** | `capabilities.toolcall` boolean per model in the `models.dev` database; models without tool support are filtered from agentic use |
| **LiteLLM** | `litellm.supports_function_calling(model=)` runtime check; `get_supported_openai_params(model=)` for param filtering |
| **PraisonAI** | `litellm.supports_function_calling()` guard before tool dispatch |
| **open-interpreter** | Same `litellm.supports_function_calling()` guard |
| **Voyage (current)** | No tool-call capability check. `AGENT_TOOLS` always passed. Reasoning models excluded from the `opencode_zen` list by the critic gate (manual). |

**Actionable gap.** `litellm.supports_function_calling(model=)` exists and should be used before passing the `tools` kwarg.
---

## 2. Architecture Options Comparison

| Option | Description | Effort | Risk | Benefit |
|---|---|---|---|---|
| **A. Keep LiteLLM, harden** | Add Router for retry/fallback, add `supports_function_calling` guard, curate model lists with capability metadata | **Low** (1-2 sessions) | **Low** — incremental changes to existing working code | Retry resilience, tool-call safety, zero migration |
| **B. Hybrid: direct SDK for some** | Use `@ai-sdk/*` packages (like OpenCode) for primary providers, LiteLLM for others | **High** (1-2 weeks) | **High** — new TS→Python SDK mismatch, dual streaming paths, test surface explosion | Finer control per provider; no real benefit for a Django backend |
| **C. Replace LiteLLM entirely** | Build a custom provider abstraction or adopt the Vercel AI SDK (TypeScript-only) | **Very High** (3-4 weeks) | **Very High** — rewrite streaming, tool-call translation, error mapping for each provider | Only makes sense if moving to full-stack TypeScript |
| **D. LiteLLM Proxy (sidecar)** | Run LiteLLM as a separate proxy service, call it via an OpenAI-compatible API | **Medium** (2-3 days) | **Medium** — new Docker service, config management, latency overhead | Centralized config, built-in admin UI, but overkill for single-user self-hosted |

---
## 3. Recommendation

### Immediate (this session / next session): Option A — Harden LiteLLM

**Specific code-level adaptations:**

#### 3a. Add `litellm.Router` for retry + fallback (highest impact)

Replace the bare `litellm.acompletion()` with `litellm.Router.acompletion()`:
```python
# llm_client.py — new module-level router
import litellm
from litellm.router import RetryPolicy

_router = None


def _get_router():
    global _router
    if _router is None:
        _router = litellm.Router(
            model_list=[],  # empty — we use the router for retry/timeout only
            num_retries=2,
            timeout=60,
            retry_policy=RetryPolicy(
                AuthenticationErrorRetries=0,
                RateLimitErrorRetries=2,
                TimeoutErrorRetries=1,
                BadRequestErrorRetries=0,
            ),
        )
    return _router
```
**However**: the LiteLLM Router requires models pre-registered in `model_list`. For Voyage's dynamic per-user-key model, the simpler approach is:

```python
# In stream_chat_completion, add retry params to acompletion:
response = await litellm.acompletion(
    **completion_kwargs,
    num_retries=2,
    request_timeout=60,
)
```

LiteLLM's `acompletion()` accepts `num_retries` directly — no Router needed.

**Files**: `backend/server/chat/llm_client.py` line 418 (add `num_retries=2, request_timeout=60`)
#### 3b. Add tool-call capability guard

```python
# In stream_chat_completion, before building completion_kwargs:
effective_model = model or provider_config["default_model"]
if tools and not litellm.supports_function_calling(model=effective_model):
    # Strip tools — model doesn't support them
    tools = None
    logger.warning("Model %s does not support function calling; tools stripped", effective_model)
```

**Files**: `backend/server/chat/llm_client.py` around line 397
#### 3c. Curate model lists with tool-call metadata in the `models()` endpoint

Instead of returning bare string lists, return objects with capability info:

```python
# In ChatProviderCatalogViewSet.models():
if provider in ["opencode_zen"]:
    return Response({"models": [
        {"id": "openai/gpt-5-nano", "supports_tools": True},
        {"id": "openai/gpt-4o-mini", "supports_tools": True},
        {"id": "openai/gpt-4o", "supports_tools": True},
        {"id": "anthropic/claude-sonnet-4-20250514", "supports_tools": True},
        {"id": "anthropic/claude-3-5-haiku-20241022", "supports_tools": True},
    ]})
```

**Files**: `backend/server/chat/views/__init__.py` — `models()` action. Frontend `loadModelsForProvider()` would need a minor update to handle objects.
#### 3d. Fix `day_suggestions.py` hardcoded model

Line 194 uses `model="gpt-4o-mini"`, which ignores both the provider config and the user's selection:

```python
# day_suggestions.py lines 193-194
response = litellm.completion(
    model="gpt-4o-mini",  # BUG: ignores provider config
```

It should use the provider-config default or the user-selected model.

**Files**: `backend/server/chat/views/day_suggestions.py` line 194
### Long-term (future sessions)

1. **Adopt a `models.dev`-style curated database**: OpenCode's approach of maintaining a JSON/YAML model registry with capabilities, costs, and limits is superior to hardcoded lists. Could be a YAML file in `backend/server/chat/models.yaml` loaded at startup.

2. **LiteLLM Proxy sidecar**: If Voyage gains multi-user production deployment, running LiteLLM as a proxy sidecar gives centralized rate limiting, key management, and an admin dashboard. Not warranted for the current self-hosted single/few-user deployment.

3. **WSGI→ASGI migration**: Already documented as out-of-scope, but remains the long-term fix for event loop fragility (see [opencode-zen-connection-debug.md](opencode-zen-connection-debug.md#3-significant-wsgi--async-event-loop-per-request)).

---
## 4. Why NOT Replace LiteLLM

| Concern | Reality |
|---|---|
| "LiteLLM is too heavy" | It's a pip dependency (~40MB installed). No runtime sidecar. Same weight as Django itself. |
| "We could use provider SDKs directly" | Each provider has different streaming formats, tool-call schemas, and error types. LiteLLM normalizes all of this. Reimplementing costs weeks per provider. |
| "OpenCode doesn't use LiteLLM" | OpenCode is TypeScript + Vercel AI SDK. It has ~20 bundled `@ai-sdk/*` provider packages. The Python equivalent IS LiteLLM. |
| "LiteLLM has bugs" | All of Voyage's issues are in our integration layer (no retries, no capability checks, hardcoded models), not in LiteLLM itself. |

---

## Cross-references

- See [Research: LiteLLM provider catalog](litellm-zen-provider-catalog.md)
- See [Research: OpenCode Zen connection debug](opencode-zen-connection-debug.md)
- See [Plan: Travel agent context + models](../plans/travel-agent-context-and-models.md)
- See [Decisions: Critic Gate](../decisions.md#critic-gate-travel-agent-context--models-follow-up)
21 .memory/sessions/continuity.md Normal file
@@ -0,0 +1,21 @@
# Session Continuity

## Last Session (2026-03-09)

- Completed the `chat-provider-fixes` change set with three workstreams:
  - `chat-loop-hardening`: invalid required-arg tool calls now terminate cleanly and are not replayed; assistant tool_call history is trimmed consistently
  - `default-ai-settings`: Settings page saves default provider/model via `UserAISettings`; DB defaults are authoritative over localStorage; backend fallback uses saved defaults
  - `suggestion-add-flow`: day suggestions use the resolved provider/model (not hardcoded OpenAI); modal normalizes suggestion payloads for add-to-itinerary
- All three workstreams passed reviewer + tester validation
- Documentation updated for all three workstreams

## Active Work

- `chat-provider-fixes` plan complete — all workstreams implemented, reviewed, tested, documented
- See [plans/](../plans/) for other active feature plans
- Pre-release policy established — architecture-level changes allowed (see AGENTS.md)

## Known Follow-up Items (from tester findings)

- No automated test coverage for `UserAISettings` CRUD + precedence logic
- No automated test coverage for the `send_message` streaming loop (tool error short-circuit, multi-tool partial success, `MAX_TOOL_ITERATIONS`)
- No automated test coverage for `DaySuggestionsView.post()`
- `get_weather` error `"dates must be a non-empty list"` does not trigger the tool-error short-circuit (mitigated by `MAX_TOOL_ITERATIONS`)
- LLM-generated name/location fields are not truncated to `max_length=200` before `LocationSerializer` (low risk)
1 .memory/system.md Normal file
@@ -0,0 +1 @@
Voyage is a self-hosted travel companion web app (fork of AdventureLog) built with a SvelteKit 2 (TypeScript) frontend, a Django REST Framework (Python) backend, PostgreSQL/PostGIS, Memcached, and Docker. It provides trip planning with collections/itineraries, AI-powered travel chat with multi-provider LLM support (via LiteLLM), location/lodging/transportation management, user preference learning, and collaborative trip sharing. The project is pre-release — architecture-level changes are allowed. See [knowledge/overview.md](knowledge/overview.md) for architecture and [decisions.md](decisions.md) for ADRs.
@@ -13,7 +13,7 @@ Voyage is **pre-release** — not yet in production use. During pre-release:

## Architecture Overview
- **API proxy pattern**: Frontend never calls Django directly. All API calls go through `frontend/src/routes/api/[...path]/+server.ts`, which proxies to `http://server:8000`, handles cookies, and injects CSRF behavior.
- **AI chat**: Embedded in Collections → Recommendations via `AITravelChat.svelte` component. No standalone `/chat` route. Provider list is dynamic from backend `GET /api/chat/providers/` (sourced from LiteLLM runtime + custom entries like `opencode_zen`). Chat conversations use SSE streaming via `/api/chat/conversations/`. Chat composer supports per-provider model override (persisted in browser `localStorage`). LiteLLM errors are mapped to sanitized user-safe messages via `_safe_error_payload()` (never exposes raw exception text).
- **AI chat**: Embedded in Collections → Recommendations via `AITravelChat.svelte` component. No standalone `/chat` route. Provider list is dynamic from backend `GET /api/chat/providers/` (sourced from LiteLLM runtime + custom entries like `opencode_zen`). Chat conversations use SSE streaming via `/api/chat/conversations/`. Default AI provider/model saved via `UserAISettings` in DB (authoritative over browser localStorage). LiteLLM errors are mapped to sanitized user-safe messages via `_safe_error_payload()` (never exposes raw exception text). Invalid tool calls (missing required args) are detected and short-circuited with a user-visible error — not replayed into history.
- **Service ports**:
  - `web` → `:8015`
  - `server` → `:8016`
@@ -69,6 +69,7 @@ Run in this order:
- Chat model override: dropdown selector fed by `GET /api/chat/providers/{provider}/models/`; persisted in `localStorage` key `voyage_chat_model_prefs`; backend accepts optional `model` param in `send_message`
- Chat context: collection chats inject multi-stop itinerary context; system prompt guides `get_trip_details`-first reasoning
- Chat error surfacing: `_safe_error_payload()` maps LiteLLM exceptions to sanitized user-safe categories (never forwards raw `exc.message`)
- Invalid tool calls (missing required args) are detected and short-circuited with a user-visible error — not replayed into history

## Conventions
- Do **not** attempt to fix known test/configuration issues as part of feature work.
@@ -15,7 +15,7 @@ Voyage is **pre-release** — not yet in production use. During pre-release:
- Use the API proxy pattern: never call Django directly from frontend components.
- Route all frontend API calls through `frontend/src/routes/api/[...path]/+server.ts`.
- Proxy target is `http://server:8000`; preserve session cookies and CSRF behavior.
- AI chat is embedded in Collections → Recommendations via `AITravelChat.svelte`. There is no standalone `/chat` route. Chat providers are loaded dynamically from `GET /api/chat/providers/` (backed by LiteLLM runtime providers + custom entries like `opencode_zen`). Chat conversations stream via SSE through `/api/chat/conversations/`. Chat composer supports per-provider model override (persisted in browser `localStorage`). LiteLLM errors are mapped to sanitized user-safe messages via `_safe_error_payload()` (never exposes raw exception text).
- AI chat is embedded in Collections → Recommendations via `AITravelChat.svelte`. There is no standalone `/chat` route. Chat providers are loaded dynamically from `GET /api/chat/providers/` (backed by LiteLLM runtime providers + custom entries like `opencode_zen`). Chat conversations stream via SSE through `/api/chat/conversations/`. Default AI provider/model saved via `UserAISettings` in DB (authoritative over browser localStorage). LiteLLM errors are mapped to sanitized user-safe messages via `_safe_error_payload()` (never exposes raw exception text). Invalid tool calls (missing required args) are detected and short-circuited with a user-visible error — not replayed into history.
- Service ports:
  - `web` → `:8015`
  - `server` → `:8016`
@@ -77,6 +77,7 @@ Run in this exact order:
- Chat model override: dropdown selector fed by `GET /api/chat/providers/{provider}/models/`; persisted in `localStorage` key `voyage_chat_model_prefs`; backend accepts optional `model` param in `send_message`
- Chat context: collection chats inject multi-stop itinerary context; system prompt guides `get_trip_details`-first reasoning
- Chat error surfacing: `_safe_error_payload()` maps LiteLLM exceptions to sanitized user-safe categories (never forwards raw `exc.message`)
- Invalid tool calls (missing required args) are detected and short-circuited with a user-visible error — not replayed into history

## Conventions
- Do **not** attempt to fix known test/configuration issues as part of feature work.

@@ -98,7 +98,11 @@ def _parse_address(tags):

@agent_tool(
    name="search_places",
    description="Search for places of interest near a location. Returns tourist attractions, restaurants, hotels, etc.",
    description=(
        "Search for places of interest near a location. "
        "Required: provide a non-empty 'location' string (city, neighborhood, or address). "
        "Returns tourist attractions, restaurants, hotels, etc."
    ),
    parameters={
        "location": {
            "type": "string",
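The sharpened description tells the model that `location` is mandatory; the runtime guard inside the tool body (which is outside this hunk) is assumed to follow the commit's `<param> is required` error convention that the viewset later detects. A hypothetical sketch of that guard:

```python
def search_places_guard(location):
    """Hypothetical required-arg guard mirroring the '<param> is required' convention."""
    if not str(location or "").strip():
        # Error strings in this exact shape are what the viewset's
        # _is_required_param_tool_error() classifier recognizes.
        return {"error": "location is required"}
    return None  # no error; proceed with the real search
```

The guard name and return shape are assumptions for illustration; only the error-string convention is taken from the diff.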
@@ -231,7 +235,11 @@ def list_trips(user):

@agent_tool(
    name="web_search",
    description="Search the web for current information about destinations, events, prices, weather, or any real-time travel information. Use this when you need up-to-date information that may not be in your training data.",
    description=(
        "Search the web for current travel information. "
        "Required: provide a non-empty 'query' string describing exactly what to look up. "
        "Use when you need up-to-date info that may not be in training data."
    ),
    parameters={
        "query": {
            "type": "string",

@@ -165,6 +165,18 @@ def _normalize_provider_id(provider_id):
    return lowered


def normalize_gateway_model(provider_id, model):
    normalized_provider = _normalize_provider_id(provider_id)
    normalized_model = str(model or "").strip()
    if not normalized_model:
        return None

    if normalized_provider == "opencode_zen" and "/" not in normalized_model:
        return f"openai/{normalized_model}"

    return normalized_model


def _default_provider_label(provider_id):
    return provider_id.replace("_", " ").title()

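A standalone sketch of `normalize_gateway_model`, assuming `_normalize_provider_id` simply lowercases and strips the id (its body sits outside this hunk; only its `return lowered` tail is shown):

```python
def normalize_gateway_model(provider_id, model):
    # Assumption: _normalize_provider_id lowercases/strips the provider id.
    normalized_provider = str(provider_id or "").strip().lower()
    normalized_model = str(model or "").strip()
    if not normalized_model:
        return None
    # opencode_zen is routed through an OpenAI-compatible gateway, so bare
    # model names get the "openai/" prefix LiteLLM expects for custom bases.
    if normalized_provider == "opencode_zen" and "/" not in normalized_model:
        return f"openai/{normalized_model}"
    return normalized_model
```

Returning `None` for an empty model lets callers fall back to the provider's configured default.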
@@ -405,6 +417,7 @@ async def stream_chat_completion(user, messages, provider, tools=None, model=Non
        )
        or provider_config["default_model"]
    )
    resolved_model = normalize_gateway_model(normalized_provider, resolved_model)

    if tools and not litellm.supports_function_calling(model=resolved_model):
        logger.warning(

@@ -1,10 +1,12 @@
import asyncio
import json
import logging
import re

from asgiref.sync import sync_to_async
from adventures.models import Collection
from django.http import StreamingHttpResponse
from integrations.models import UserAISettings
from rest_framework import status, viewsets
from rest_framework.decorators import action
from rest_framework.permissions import IsAuthenticated
@@ -53,19 +55,40 @@ class ChatViewSet(viewsets.ModelViewSet):
        return Response(serializer.data, status=status.HTTP_201_CREATED)

    def _build_llm_messages(self, conversation, user, system_prompt=None):
        ordered_messages = list(conversation.messages.all().order_by("created_at"))
        valid_tool_call_ids = {
            message.tool_call_id
            for message in ordered_messages
            if message.role == "tool"
            and message.tool_call_id
            and not self._is_required_param_tool_error_message_content(message.content)
        }

        messages = [
            {
                "role": "system",
                "content": system_prompt or get_system_prompt(user),
            }
        ]
        for message in conversation.messages.all().order_by("created_at"):
        for message in ordered_messages:
            if (
                message.role == "tool"
                and self._is_required_param_tool_error_message_content(message.content)
            ):
                continue

            payload = {
                "role": message.role,
                "content": message.content,
            }
            if message.role == "assistant" and message.tool_calls:
                payload["tool_calls"] = message.tool_calls
                filtered_tool_calls = [
                    tool_call
                    for tool_call in message.tool_calls
                    if (tool_call or {}).get("id") in valid_tool_call_ids
                ]
                if filtered_tool_calls:
                    payload["tool_calls"] = filtered_tool_calls
            if message.role == "tool":
                payload["tool_call_id"] = message.tool_call_id
                payload["name"] = message.name
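The effect of the new filtering: an assistant `tool_calls` entry is only replayed into LLM history if a non-error tool result with the same id exists, and error results themselves are dropped. A simplified sketch over plain dicts (the real method walks ORM message objects and JSON-parses tool content; here the error flag is a plain boolean for illustration):

```python
def rebuild_history(stored):
    # ids of tool results that were real results, not required-arg errors
    valid_ids = {
        m["tool_call_id"]
        for m in stored
        if m["role"] == "tool" and not m.get("is_required_param_error")
    }
    out = []
    for m in stored:
        if m["role"] == "tool" and m.get("is_required_param_error"):
            continue  # drop the error result entirely
        entry = {"role": m["role"], "content": m["content"]}
        if m["role"] == "assistant" and m.get("tool_calls"):
            kept = [c for c in m["tool_calls"] if c.get("id") in valid_ids]
            if kept:
                entry["tool_calls"] = kept  # orphaned calls are dropped
        if m["role"] == "tool":
            entry["tool_call_id"] = m["tool_call_id"]
        out.append(entry)
    return out
```

This keeps the replayed history valid for providers that reject assistant `tool_calls` without matching `tool` responses.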
@@ -109,6 +132,50 @@ class ChatViewSet(viewsets.ModelViewSet):
                if function_data.get("arguments"):
                    current["function"]["arguments"] += function_data.get("arguments")

    @staticmethod
    def _is_required_param_tool_error(result):
        if not isinstance(result, dict):
            return False

        error_text = result.get("error")
        if not isinstance(error_text, str):
            return False

        normalized_error = error_text.strip().lower()
        if normalized_error in {"location is required", "query is required"}:
            return True

        return bool(
            re.fullmatch(
                r"[a-z0-9_,\-\s]+ (is|are) required",
                normalized_error,
            )
        )

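The fullmatch pattern accepts any lowercase `<names> is/are required` string, while unrelated error text falls through. A quick check of what the classifier does and does not treat as a required-arg error:

```python
import re

REQUIRED_ARG_RE = re.compile(r"[a-z0-9_,\-\s]+ (is|are) required")

def is_required_param_error(error_text):
    # Same logic as the staticmethod, lifted out for a standalone check.
    normalized = error_text.strip().lower()
    if normalized in {"location is required", "query is required"}:
        return True
    return bool(REQUIRED_ARG_RE.fullmatch(normalized))
```

Because `fullmatch` is anchored, a message that merely contains the word "required" mid-sentence is not misclassified.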
    @classmethod
    def _is_required_param_tool_error_message_content(cls, content):
        if not isinstance(content, str):
            return False

        try:
            parsed = json.loads(content)
        except json.JSONDecodeError:
            return False

        return cls._is_required_param_tool_error(parsed)

    @staticmethod
    def _build_required_param_error_event(tool_name, result):
        tool_error = result.get("error") if isinstance(result, dict) else ""
        return {
            "error": (
                "The assistant attempted to call "
                f"'{tool_name}' without required arguments ({tool_error}). "
                "Please try your message again with more specific details."
            ),
            "error_category": "tool_validation_error",
        }

    @action(detail=True, methods=["post"])
    def send_message(self, request, pk=None):
        # Auto-learn preferences from user's travel history
@@ -128,8 +195,30 @@ class ChatViewSet(viewsets.ModelViewSet):
                status=status.HTTP_400_BAD_REQUEST,
            )

        provider = (request.data.get("provider") or "openai").strip().lower()
        model = (request.data.get("model") or "").strip() or None
        requested_provider = (request.data.get("provider") or "").strip().lower()
        requested_model = (request.data.get("model") or "").strip() or None
        ai_settings = UserAISettings.objects.filter(user=request.user).first()
        preferred_provider = (
            (ai_settings.preferred_provider or "").strip().lower()
            if ai_settings
            else ""
        )
        preferred_model = (
            (ai_settings.preferred_model or "").strip() if ai_settings else ""
        )

        provider = requested_provider
        if not provider and preferred_provider:
            if preferred_provider and is_chat_provider_available(preferred_provider):
                provider = preferred_provider

        if not provider:
            provider = "openai"

        model = requested_model
        if model is None and preferred_model and provider == preferred_provider:
            model = preferred_model

        collection_id = request.data.get("collection_id")
        collection_name = request.data.get("collection_name")
        start_date = request.data.get("start_date")
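The resolution order for chat sends, restated as a plain-function sketch with the availability check stubbed out as a parameter: an explicit request wins, then the saved `UserAISettings` default (only if that provider is still available), then `"openai"`, and a saved default model only carries over to its own provider.

```python
def resolve_chat_provider_model(requested_provider, requested_model,
                                preferred_provider, preferred_model,
                                is_available):
    provider = requested_provider
    if not provider and preferred_provider and is_available(preferred_provider):
        provider = preferred_provider  # saved default, only if still usable
    if not provider:
        provider = "openai"  # final fallback
    model = requested_model
    # a saved default model only applies when its provider was chosen
    if model is None and preferred_model and provider == preferred_provider:
        model = preferred_model
    return provider, model
```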
@@ -266,29 +355,16 @@ class ChatViewSet(viewsets.ModelViewSet):
                )

                if encountered_error:
                    yield "data: [DONE]\n\n"
                    break

                assistant_content = "".join(content_chunks)

                if tool_calls_accumulator:
                    assistant_with_tools = {
                        "role": "assistant",
                        "content": assistant_content,
                        "tool_calls": tool_calls_accumulator,
                    }
                    current_messages.append(assistant_with_tools)

                    await sync_to_async(
                        ChatMessage.objects.create, thread_sensitive=True
                    )(
                        conversation=conversation,
                        role="assistant",
                        content=assistant_content,
                        tool_calls=tool_calls_accumulator,
                    )
                    await sync_to_async(conversation.save, thread_sensitive=True)(
                        update_fields=["updated_at"]
                    )
                    tool_iterations += 1
                    successful_tool_calls = []
                    successful_tool_messages = []
                    successful_tool_chat_entries = []

                    for tool_call in tool_calls_accumulator:
                        function_payload = tool_call.get("function") or {}
@@ -309,9 +385,58 @@ class ChatViewSet(viewsets.ModelViewSet):
                            request.user,
                            **arguments,
                        )

                        if self._is_required_param_tool_error(result):
                            assistant_message_kwargs = {
                                "conversation": conversation,
                                "role": "assistant",
                                "content": assistant_content,
                            }
                            if successful_tool_calls:
                                assistant_message_kwargs["tool_calls"] = (
                                    successful_tool_calls
                                )

                            await sync_to_async(
                                ChatMessage.objects.create, thread_sensitive=True
                            )(**assistant_message_kwargs)

                            for tool_message in successful_tool_messages:
                                await sync_to_async(
                                    ChatMessage.objects.create,
                                    thread_sensitive=True,
                                )(**tool_message)

                            await sync_to_async(
                                conversation.save,
                                thread_sensitive=True,
                            )(update_fields=["updated_at"])

                            logger.info(
                                "Stopping chat tool loop due to required-arg tool validation error: %s (%s)",
                                function_name,
                                result.get("error"),
                            )
                            error_event = self._build_required_param_error_event(
                                function_name,
                                result,
                            )
                            yield f"data: {json.dumps(error_event)}\n\n"
                            yield "data: [DONE]\n\n"
                            return

                        result_content = serialize_tool_result(result)

                        current_messages.append(
                        successful_tool_calls.append(tool_call)
                        tool_message_payload = {
                            "conversation": conversation,
                            "role": "tool",
                            "content": result_content,
                            "tool_call_id": tool_call.get("id"),
                            "name": function_name,
                        }
                        successful_tool_messages.append(tool_message_payload)
                        successful_tool_chat_entries.append(
                            {
                                "role": "tool",
                                "tool_call_id": tool_call.get("id"),
@@ -320,19 +445,6 @@ class ChatViewSet(viewsets.ModelViewSet):
                            }
                        )

                        await sync_to_async(
                            ChatMessage.objects.create, thread_sensitive=True
                        )(
                            conversation=conversation,
                            role="tool",
                            content=result_content,
                            tool_call_id=tool_call.get("id"),
                            name=function_name,
                        )
                        await sync_to_async(conversation.save, thread_sensitive=True)(
                            update_fields=["updated_at"]
                        )

                        tool_event = {
                            "tool_result": {
                                "tool_call_id": tool_call.get("id"),
@@ -342,6 +454,32 @@ class ChatViewSet(viewsets.ModelViewSet):
                            }
                        yield f"data: {json.dumps(tool_event)}\n\n"

                    assistant_with_tools = {
                        "role": "assistant",
                        "content": assistant_content,
                        "tool_calls": successful_tool_calls,
                    }
                    current_messages.append(assistant_with_tools)
                    current_messages.extend(successful_tool_chat_entries)

                    await sync_to_async(
                        ChatMessage.objects.create, thread_sensitive=True
                    )(
                        conversation=conversation,
                        role="assistant",
                        content=assistant_content,
                        tool_calls=successful_tool_calls,
                    )
                    for tool_message in successful_tool_messages:
                        await sync_to_async(
                            ChatMessage.objects.create,
                            thread_sensitive=True,
                        )(**tool_message)

                    await sync_to_async(conversation.save, thread_sensitive=True)(
                        update_fields=["updated_at"]
                    )

                    continue

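The required-arg short-circuit earlier in this loop emits exactly one sanitized event before ending the stream. Its payload shape, reproduced as a runnable snippet framed the way the SSE generator yields it:

```python
import json

def build_required_param_error_event(tool_name, result):
    # Mirrors _build_required_param_error_event: user-safe text only,
    # tagged with a stable category the frontend can branch on.
    tool_error = result.get("error") if isinstance(result, dict) else ""
    return {
        "error": (
            "The assistant attempted to call "
            f"'{tool_name}' without required arguments ({tool_error}). "
            "Please try your message again with more specific details."
        ),
        "error_category": "tool_validation_error",
    }

event = build_required_param_error_event(
    "search_places", {"error": "location is required"}
)
sse_frame = f"data: {json.dumps(event)}\n\n"
```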
                await sync_to_async(ChatMessage.objects.create, thread_sensitive=True)(
@@ -355,6 +493,18 @@ class ChatViewSet(viewsets.ModelViewSet):
                yield "data: [DONE]\n\n"
                break

            if tool_iterations >= MAX_TOOL_ITERATIONS:
                logger.warning(
                    "Stopping chat tool loop after max iterations (%s)",
                    MAX_TOOL_ITERATIONS,
                )
                payload = {
                    "error": "The assistant stopped after too many tool calls. Please try again with a more specific request.",
                    "error_category": "tool_loop_limit",
                }
                yield f"data: {json.dumps(payload)}\n\n"
                yield "data: [DONE]\n\n"

        response = StreamingHttpResponse(
            streaming_content=self._async_to_sync_generator(event_stream()),
            content_type="text/event-stream",

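On the consuming side, each frame is a `data: <json>` line and the stream terminates with `data: [DONE]`. A minimal parser sketch; the frame format comes from the generator above, while the client behavior is an assumption (the real consumer is the Svelte component):

```python
import json

def parse_sse_events(raw):
    """Collect JSON payloads from 'data: ...' frames, stopping at [DONE]."""
    events = []
    for line in raw.splitlines():
        if not line.startswith("data: "):
            continue  # skip blank separator lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        events.append(json.loads(payload))
    return events
```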
@@ -1,7 +1,9 @@
import logging
import json
import re

import litellm
from django.conf import settings
from django.shortcuts import get_object_or_404
from rest_framework import status
from rest_framework.permissions import IsAuthenticated
@@ -11,10 +13,17 @@ from rest_framework.views import APIView
from adventures.models import Collection
from chat.agent_tools import search_places
from chat.llm_client import (
    CHAT_PROVIDER_CONFIG,
    _safe_error_payload,
    get_llm_api_key,
    get_system_prompt,
    is_chat_provider_available,
    normalize_gateway_model,
)
from integrations.models import UserAISettings


logger = logging.getLogger(__name__)

class DaySuggestionsView(APIView):
@@ -52,7 +61,7 @@ class DaySuggestionsView(APIView):

        location = location_context or self._get_collection_location(collection)
        system_prompt = get_system_prompt(request.user, collection)
        provider = "openai"
        provider, model = self._resolve_provider_and_model(request)

        if not is_chat_provider_available(provider):
            return Response(
@@ -78,12 +87,22 @@ class DaySuggestionsView(APIView):
                user_prompt=prompt,
                user=request.user,
                provider=provider,
                model=model,
            )
            return Response({"suggestions": suggestions}, status=status.HTTP_200_OK)
        except Exception:
        except Exception as exc:
            logger.exception("Failed to generate day suggestions")
            payload = _safe_error_payload(exc)
            status_code = {
                "model_not_found": status.HTTP_400_BAD_REQUEST,
                "authentication_failed": status.HTTP_401_UNAUTHORIZED,
                "rate_limited": status.HTTP_429_TOO_MANY_REQUESTS,
                "invalid_request": status.HTTP_400_BAD_REQUEST,
                "provider_unreachable": status.HTTP_503_SERVICE_UNAVAILABLE,
            }.get(payload.get("error_category"), status.HTTP_500_INTERNAL_SERVER_ERROR)
            return Response(
                {"error": "Failed to generate suggestions. Please try again."},
                status=status.HTTP_500_INTERNAL_SERVER_ERROR,
                payload,
                status=status_code,
            )

    def _get_collection_location(self, collection):
@@ -174,31 +193,98 @@ class DaySuggestionsView(APIView):
            category=tool_category_map.get(category, "tourism"),
            radius=8,
        )
        if not isinstance(result, dict):
            return ""
        if result.get("error"):
            return ""

        raw_results = result.get("results")
        if not isinstance(raw_results, list):
            return ""

        entries = []
        for place in result.get("results", [])[:5]:
        for place in raw_results[:5]:
            if not isinstance(place, dict):
                continue
            name = place.get("name")
            address = place.get("address") or ""
            if name:
                entries.append(f"{name} ({address})" if address else name)
        return "; ".join(entries)

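The hardened formatting logic above, as a standalone function over a plain result dict, showing how malformed results now degrade to an empty context string instead of raising:

```python
def format_places(result):
    # Any non-dict, error-bearing, or malformed result yields no context.
    if not isinstance(result, dict) or result.get("error"):
        return ""
    raw_results = result.get("results")
    if not isinstance(raw_results, list):
        return ""
    entries = []
    for place in raw_results[:5]:  # cap at five entries
        if not isinstance(place, dict):
            continue
        name = place.get("name")
        address = place.get("address") or ""
        if name:
            entries.append(f"{name} ({address})" if address else name)
    return "; ".join(entries)
```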
    def _get_suggestions_from_llm(self, system_prompt, user_prompt, user, provider):
    def _resolve_provider_and_model(self, request):
        request_provider = (request.data.get("provider") or "").strip().lower() or None
        request_model = (request.data.get("model") or "").strip() or None

        user_settings = UserAISettings.objects.filter(user=request.user).first()  # type: ignore[attr-defined]
        preferred_provider = (
            (user_settings.preferred_provider or "").strip().lower()
            if user_settings and user_settings.preferred_provider
            else None
        )
        preferred_model = (
            (user_settings.preferred_model or "").strip()
            if user_settings and user_settings.preferred_model
            else None
        )

        settings_provider = (settings.VOYAGE_AI_PROVIDER or "").strip().lower() or None

        provider = request_provider or preferred_provider or settings_provider
        if not provider or not is_chat_provider_available(provider):
            provider = (
                settings_provider
                if is_chat_provider_available(settings_provider)
                else None
            )
        if not provider or not is_chat_provider_available(provider):
            provider = "openai" if is_chat_provider_available("openai") else provider

        provider_config = CHAT_PROVIDER_CONFIG.get(provider or "", {})
        default_model = (
            (settings.VOYAGE_AI_MODEL or "").strip()
            if provider == settings_provider and settings.VOYAGE_AI_MODEL
            else None
        ) or provider_config.get("default_model")

        model_from_user_defaults = (
            preferred_model
            if preferred_provider and provider == preferred_provider
            else None
        )
        model = request_model or model_from_user_defaults or default_model
        return provider, model

    def _get_suggestions_from_llm(
        self, system_prompt, user_prompt, user, provider, model
    ):
        api_key = get_llm_api_key(user, provider)
        if not api_key:
            raise ValueError("No API key available")

        response = litellm.completion(
            model="gpt-4o-mini",
            messages=[
        provider_config = CHAT_PROVIDER_CONFIG.get(provider, {})
        resolved_model = normalize_gateway_model(
            provider,
            model or provider_config.get("default_model"),
        )
        if not resolved_model:
            raise ValueError("No model configured for provider")

        completion_kwargs = {
            "model": resolved_model,
            "messages": [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_prompt},
            ],
            api_key=api_key,
            temperature=0.7,
            max_tokens=1000,
            "api_key": api_key,
            "max_tokens": 1000,
        }

        if provider_config.get("api_base"):
            completion_kwargs["api_base"] = provider_config["api_base"]

        response = litellm.completion(
            **completion_kwargs,
        )

        content = (response.choices[0].message.content or "").strip()

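Why the kwargs-as-dict rewrite: `api_base` is only passed when the provider config defines one, so the call site stays a single `litellm.completion(**kwargs)` regardless of provider. The assembly step in isolation (function name and parameters are illustrative):

```python
def build_completion_kwargs(resolved_model, system_prompt, user_prompt,
                            api_key, api_base=None):
    kwargs = {
        "model": resolved_model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "api_key": api_key,
        "max_tokens": 1000,
    }
    if api_base:  # only custom gateways (e.g. opencode_zen) set this
        kwargs["api_base"] = api_base
    return kwargs
```

Note the hardcoded `gpt-4o-mini` and the `temperature=0.7` from the old call are gone; the model is resolved per provider and temperature falls back to the provider default.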
@@ -26,7 +26,11 @@ The term "Location" is now used instead of "Adventure" - the usage remains the s

The AI travel chat is embedded in the **Collections → Recommendations** view. Select a collection, switch to the Recommendations tab, and use the chat to brainstorm destinations, ask for travel advice, or get location suggestions. The chat supports multiple LLM providers — configure your API key in **Settings → API Keys** and pick a provider from the dropdown in the chat interface. The provider list is loaded dynamically from the backend, so any provider supported by LiteLLM (plus OpenCode Zen) is available.

You can also override the default model for any provider by typing a model name in the **Model** input next to the provider selector (e.g. `openai/gpt-5-nano`). Your model choice is saved per-provider in the browser. If the model field is left empty, the provider's default model is used. Provider errors (authentication, model not found, rate limits) are displayed as clear, actionable messages in the chat.
You can set a **default AI provider and model** in **Settings** (under "Default AI Settings"). These saved defaults are used automatically when you open a new chat or request day suggestions. The defaults are stored in the database and apply across all your devices — they take priority over any previous per-browser model selections. You can still override the provider and model in the chat interface for individual conversations.

Day suggestions (the AI-generated place recommendations for specific itinerary days) also respect your saved default provider and model. If no defaults are saved, the instance-level provider configured by the server admin is used.

Provider errors (authentication, model not found, rate limits, invalid tool calls) are displayed as clear, actionable messages in the chat. If the AI attempts to use a tool incorrectly (e.g., missing required parameters), the error is surfaced once and the conversation stops cleanly rather than looping.

#### Collections

@@ -36,6 +36,11 @@
  user_configured: boolean;
};

type UserAISettingsResponse = {
  preferred_provider: string | null;
  preferred_model: string | null;
};

export let embedded = false;
export let collectionId: string | undefined = undefined;
export let collectionName: string | undefined = undefined;
@@ -58,6 +63,10 @@
let chatProviders: ChatProviderCatalogConfiguredEntry[] = [];
let providerError = '';
let selectedProviderDefaultModel = '';
let savedDefaultProvider = '';
let savedDefaultModel = '';
let initialDefaultsApplied = false;
let loadedModelsForProvider = '';
let showDateSelector = false;
let selectedPlaceToAdd: PlaceResult | null = null;
let selectedDate = '';
@@ -68,13 +77,65 @@
}>();

const MODEL_PREFS_STORAGE_KEY = 'voyage_chat_model_prefs';
let initializedModelProvider = '';
$: promptTripContext = collectionName || destination || '';

onMount(async () => {
  await Promise.all([loadConversations(), loadProviderCatalog()]);
  await Promise.all([loadConversations(), loadProviderCatalog(), loadUserAISettings()]);
  await applyInitialDefaults();
});

async function loadUserAISettings(): Promise<void> {
  try {
    const res = await fetch('/api/integrations/ai-settings/', {
      credentials: 'include'
    });
    if (!res.ok) {
      return;
    }

    const settings = (await res.json()) as UserAISettingsResponse[];
    const first = settings[0];
    if (!first) {
      return;
    }

    savedDefaultProvider = (first.preferred_provider || '').trim().toLowerCase();
    savedDefaultModel = (first.preferred_model || '').trim();
  } catch (e) {
    console.error('Failed to load AI settings:', e);
  }
}

async function applyInitialDefaults(): Promise<void> {
  if (initialDefaultsApplied || chatProviders.length === 0) {
    return;
  }

  if (
    savedDefaultProvider &&
    chatProviders.some((provider) => provider.id === savedDefaultProvider)
  ) {
    selectedProvider = savedDefaultProvider;
  } else {
    const userConfigured = chatProviders.find((provider) => provider.user_configured);
    selectedProvider = (userConfigured || chatProviders[0]).id;
  }

  await loadModelsForProvider(selectedProvider);

  if (savedDefaultModel && selectedProvider === savedDefaultProvider) {
    selectedModel = availableModels.includes(savedDefaultModel)
      ? savedDefaultModel
      : selectedProviderDefaultModel || availableModels[0] || '';
  } else {
    selectedModel = selectedProviderDefaultModel || availableModels[0] || '';
  }

  saveModelPref(selectedProvider, selectedModel);
  loadedModelsForProvider = selectedProvider;
  initialDefaultsApplied = true;
}

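The model-selection branch in `applyInitialDefaults`, restated as a small Python sketch (the component itself is Svelte/TypeScript; this is just the precedence logic): the saved DB model wins only when it belongs to the selected provider and that provider still offers it, otherwise the provider default or first catalog entry is used.

```python
def pick_initial_model(saved_provider, saved_model, selected_provider,
                       available_models, provider_default):
    # the saved model only wins if it belongs to the selected provider
    # and is actually offered by that provider right now
    if (saved_model and selected_provider == saved_provider
            and saved_model in available_models):
        return saved_model
    return provider_default or (available_models[0] if available_models else "")
```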
async function loadProviderCatalog(): Promise<void> {
  try {
    const res = await fetch('/api/chat/providers/', {
@@ -98,9 +159,8 @@

    if (usable.length > 0) {
      providerError = '';
      if (!selectedProvider || !usable.some((provider) => provider.id === selectedProvider)) {
        const userConfigured = usable.find((provider) => provider.user_configured);
        selectedProvider = (userConfigured || usable[0]).id;
      if (selectedProvider && !usable.some((provider) => provider.id === selectedProvider)) {
        selectedProvider = '';
      }
    } else {
      selectedProvider = '';
@@ -113,24 +173,21 @@
  }
}

async function loadModelsForProvider() {
  if (!selectedProvider) {
async function loadModelsForProvider(providerId: string) {
  if (!providerId) {
    availableModels = [];
    return;
  }

  modelsLoading = true;
  try {
    const res = await fetch(`/api/chat/providers/${selectedProvider}/models/`, {
    const res = await fetch(`/api/chat/providers/${providerId}/models/`, {
      credentials: 'include'
    });
    const data = await res.json();

    if (data.models && data.models.length > 0) {
      availableModels = data.models;
      if (!selectedModel || !availableModels.includes(selectedModel)) {
        selectedModel = availableModels[0];
      }
    } else {
      availableModels = [];
    }
@@ -142,25 +199,6 @@
  }
}

function loadModelPref(provider: string): string {
  if (typeof window === 'undefined') {
    return '';
  }

  try {
    const raw = window.localStorage.getItem(MODEL_PREFS_STORAGE_KEY);
    if (!raw) {
      return '';
    }

    const prefs = JSON.parse(raw) as Record<string, string>;
    const value = prefs[provider];
    return typeof value === 'string' ? value : '';
  } catch {
    return '';
  }
}

function saveModelPref(provider: string, model: string) {
  if (typeof window === 'undefined') {
    return;
@@ -176,20 +214,26 @@
  }
}

$: if (selectedProvider && initializedModelProvider !== selectedProvider) {
  selectedModel = loadModelPref(selectedProvider) || selectedProviderDefaultModel || '';
  initializedModelProvider = selectedProvider;
}

$: if (selectedProvider && initializedModelProvider === selectedProvider) {
  saveModelPref(selectedProvider, selectedModel);
}

$: selectedProviderDefaultModel =
  chatProviders.find((provider) => provider.id === selectedProvider)?.default_model ?? '';

$: if (selectedProvider) {
  void loadModelsForProvider();
$: if (
  selectedProvider &&
  initialDefaultsApplied &&
  loadedModelsForProvider !== selectedProvider
) {
  loadedModelsForProvider = selectedProvider;
  void (async () => {
    await loadModelsForProvider(selectedProvider);
    if (!selectedModel || !availableModels.includes(selectedModel)) {
      selectedModel = selectedProviderDefaultModel || availableModels[0] || '';
    }
    saveModelPref(selectedProvider, selectedModel);
  })();
}

$: if (selectedProvider && initialDefaultsApplied) {
  saveModelPref(selectedProvider, selectedModel);
}

async function loadConversations() {

@@ -14,6 +14,7 @@
  name: string;
  description?: string;
  why_fits?: string;
  category?: string;
  location?: string;
  rating?: number | string | null;
  price_level?: string | null;
@@ -118,6 +119,94 @@
  return nextFilters;
}

function asRecord(value: unknown): Record<string, unknown> | null {
  if (!value || typeof value !== 'object' || Array.isArray(value)) {
    return null;
  }
  return value as Record<string, unknown>;
}

function normalizeText(value: unknown): string {
  if (typeof value !== 'string') return '';
  return value.trim();
}

function normalizeRating(value: unknown): number | null {
  if (typeof value === 'number' && Number.isFinite(value)) {
    return value;
  }

  if (typeof value === 'string') {
    const match = value.match(/\d+(\.\d+)?/);
    if (!match) return null;
    const parsed = Number(match[0]);
    return Number.isFinite(parsed) ? parsed : null;
  }

  return null;
}

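`normalizeRating` accepts finite numbers as-is and pulls the first decimal out of strings like `"4.5 stars"`; anything else becomes `null`. In Python terms (a sketch of the same behavior, not the component code):

```python
import re

def normalize_rating(value):
    # Numbers pass through; bool is excluded since it subclasses int.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return float(value)
    if isinstance(value, str):
        match = re.search(r"\d+(\.\d+)?", value)
        return float(match.group(0)) if match else None
    return None
```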
function normalizeSuggestionItem(value: unknown): SuggestionItem | null {
  const item = asRecord(value);
  if (!item) return null;

  const name =
    normalizeText(item.name) ||
    normalizeText(item.title) ||
    normalizeText(item.place_name) ||
    normalizeText(item.venue);
  const description =
    normalizeText(item.description) || normalizeText(item.summary) || normalizeText(item.details);
  const whyFits =
    normalizeText(item.why_fits) || normalizeText(item.whyFits) || normalizeText(item.reason);
  const location =
    normalizeText(item.location) ||
    normalizeText(item.address) ||
    normalizeText(item.neighborhood);
  const category = normalizeText(item.category);
  const priceLevel =
    normalizeText(item.price_level) ||
    normalizeText(item.priceLevel) ||
    normalizeText(item.price);
  const rating = normalizeRating(item.rating ?? item.score);

  const finalName = name || location;
  if (!finalName) return null;

  return {
    name: finalName,
    description: description || undefined,
    why_fits: whyFits || undefined,
    category: category || undefined,
    location: location || undefined,
    rating,
    price_level: priceLevel || null
  };
}

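The fetch handler later maps every raw item through `normalizeSuggestionItem` and drops anything that fails to normalize. The equivalent pipeline in Python, with the normalizer injected so the hardening shape is visible on its own:

```python
def harden_suggestions(data, normalize):
    # Non-dict payloads or a non-list "suggestions" field yield nothing.
    raw = data.get("suggestions") if isinstance(data, dict) else None
    if not isinstance(raw, list):
        return []
    normalized = (normalize(item) for item in raw)
    return [item for item in normalized if item is not None]
```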
function buildLocationPayload(suggestion: SuggestionItem) {
  const name =
    normalizeText(suggestion.name) || normalizeText(suggestion.location) || 'Suggestion';
  const locationText =
    normalizeText(suggestion.location) ||
    getCollectionLocation() ||
    normalizeText(suggestion.name);
  const description =
    normalizeText(suggestion.description) ||
    normalizeText(suggestion.why_fits) ||
    (suggestion.category ? `${suggestion.category} suggestion` : '');
  const rating = normalizeRating(suggestion.rating);

  return {
    name,
    description,
    location: locationText || name,
    rating,
    collections: [collection.id],
    is_public: false
  };
}

async function fetchSuggestions() {
  if (!selectedCategory) return;

@@ -144,7 +233,11 @@
    }

    const data = await response.json();
    suggestions = Array.isArray(data?.suggestions) ? data.suggestions : [];
    suggestions = Array.isArray(data?.suggestions)
      ? data.suggestions
          .map((item: unknown) => normalizeSuggestionItem(item))
          .filter((item: SuggestionItem | null): item is SuggestionItem => item !== null)
      : [];
  } catch (_err) {
    error = $t('suggestions.error');
    suggestions = [];
@@ -180,17 +273,12 @@
    error = '';

    try {
      const payload = buildLocationPayload(suggestion);
      const createLocationResponse = await fetch('/api/locations/', {
        method: 'POST',
        credentials: 'include',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          name: suggestion.name,
          description: suggestion.description || suggestion.why_fits || '',
          location: suggestion.location || getCollectionLocation() || suggestion.name,
          collections: [collection.id],
          is_public: false
        })
        body: JSON.stringify(payload)
      });

      if (!createLocationResponse.ok) {

@@ -587,6 +587,14 @@ export type UserRecommendationPreferenceProfile = {
 	updated_at: string;
 };

+export type UserAISettings = {
+	id: string;
+	preferred_provider: string | null;
+	preferred_model: string | null;
+	created_at: string;
+	updated_at: string;
+};
+
 export type CollectionItineraryDay = {
 	id: string;
 	collection: string; // UUID of the collection
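Per the commit message, the DB-backed `UserAISettings` default is authoritative over any browser `localStorage` preference. A resolution sketch under that assumption (the `resolveDefaultProvider` helper is hypothetical, not part of this diff):

```typescript
// Hypothetical resolver: the DB-saved UserAISettings value wins,
// falling back to the browser localStorage preference.
type UserAISettings = {
	id: string;
	preferred_provider: string | null;
	preferred_model: string | null;
	created_at: string;
	updated_at: string;
};

function resolveDefaultProvider(
	saved: UserAISettings | null,
	localPref: string | null
): string | null {
	return saved?.preferred_provider ?? localPref;
}

const fromDb: UserAISettings = {
	id: '1',
	preferred_provider: 'anthropic',
	preferred_model: null,
	created_at: '',
	updated_at: ''
};
const provider = resolveDefaultProvider(fromDb, 'openai'); // 'anthropic'
```

If the server has no saved settings (`null`), the browser preference is used, so existing `voyage_chat_model_prefs` behavior is preserved as a fallback.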
@@ -817,6 +817,13 @@
 	"travel_agent_help_body": "Open a collection and switch to Recommendations to interact with the travel agent for place suggestions.",
 	"travel_agent_help_open_collections": "Open Collections",
 	"travel_agent_help_setup_guide": "Travel agent setup guide",
+	"default_ai_settings_title": "Default AI Provider & Model",
+	"default_ai_settings_desc": "Choose the default AI provider and model used across chat experiences.",
+	"default_ai_no_providers": "No configured AI providers are available yet. Add an API key first.",
+	"default_ai_save": "Save default AI settings",
+	"default_ai_settings_saved": "Default AI settings saved.",
+	"default_ai_settings_error": "Unable to save default AI settings.",
+	"default_ai_provider_required": "Please select a provider.",
 	"travel_preferences": "Travel Preferences",
 	"travel_preferences_desc": "Customize your travel preferences for better AI recommendations",
 	"cuisines": "Favorite Cuisines",
@@ -1,7 +1,7 @@
 import { fail, redirect, type Actions } from '@sveltejs/kit';
 import type { PageServerLoad } from '../$types';
 const PUBLIC_SERVER_URL = process.env['PUBLIC_SERVER_URL'];
-import type { ImmichIntegration, User } from '$lib/types';
+import type { ImmichIntegration, User, UserAISettings } from '$lib/types';
 import { fetchCSRFToken } from '$lib/index.server';
 const endpoint = PUBLIC_SERVER_URL || 'http://localhost:8000';

@@ -95,6 +95,7 @@ export const load: PageServerLoad = async (event) => {

 	let apiKeys: UserAPIKey[] = [];
 	let apiKeysConfigError: string | null = null;
+	let aiSettings: UserAISettings | null = null;
 	let apiKeysFetch = await fetch(`${endpoint}/api/integrations/api-keys/`, {
 		headers: {
 			Cookie: `sessionid=${sessionId}`
@@ -108,6 +109,17 @@ export const load: PageServerLoad = async (event) => {
 		apiKeysConfigError = errorBody.detail ?? 'API key storage is currently unavailable.';
 	}

+	let aiSettingsFetch = await fetch(`${endpoint}/api/integrations/ai-settings/`, {
+		headers: {
+			Cookie: `sessionid=${sessionId}`
+		}
+	});
+
+	if (aiSettingsFetch.ok) {
+		const aiSettingsResponse = (await aiSettingsFetch.json()) as UserAISettings[];
+		aiSettings = aiSettingsResponse[0] ?? null;
+	}
+
 	let publicUrlFetch = await fetch(`${endpoint}/public-url/`);
 	let publicUrl = '';
 	if (!publicUrlFetch.ok) {
@@ -131,6 +143,7 @@ export const load: PageServerLoad = async (event) => {
 			stravaUserEnabled,
 			apiKeys,
 			apiKeysConfigError,
+			aiSettings,
 			wandererEnabled,
 			wandererExpired
 		}
@@ -47,6 +47,11 @@
 	let apiKeysConfigError: string | null = data.props.apiKeysConfigError ?? null;
 	let newApiKeyProvider = 'anthropic';
 	let providerCatalog: ChatProviderCatalogEntry[] = [];
+	let defaultAiProvider = (data.props.aiSettings?.preferred_provider ?? '').trim().toLowerCase();
+	let defaultAiModel = (data.props.aiSettings?.preferred_model ?? '').trim();
+	let defaultAiModels: string[] = [];
+	let defaultAiModelsLoading = false;
+	let isSavingDefaultAiSettings = false;
 	let newApiKeyValue = '';
 	let isSavingApiKey = false;
 	let deletingApiKeyId: string | null = null;
@@ -70,6 +75,104 @@
 		}
 	}

+	function getConfiguredChatProviders() {
+		return providerCatalog.filter(
+			(provider) =>
+				provider.available_for_chat && (provider.user_configured || provider.instance_configured)
+		);
+	}
+
+	async function loadDefaultAiModels(providerId: string) {
+		if (!providerId) {
+			defaultAiModels = [];
+			return;
+		}
+
+		defaultAiModelsLoading = true;
+		try {
+			const res = await fetch(`/api/chat/providers/${providerId}/models/`);
+			if (!res.ok) {
+				defaultAiModels = [];
+				return;
+			}
+
+			const payload = await res.json();
+			defaultAiModels = Array.isArray(payload.models) ? (payload.models as string[]) : [];
+		} catch {
+			defaultAiModels = [];
+		} finally {
+			defaultAiModelsLoading = false;
+		}
+	}
+
+	async function initializeDefaultAiSettings() {
+		const configuredProviders = getConfiguredChatProviders();
+		if (!configuredProviders.length) {
+			defaultAiProvider = '';
+			defaultAiModel = '';
+			defaultAiModels = [];
+			return;
+		}
+
+		if (
+			!defaultAiProvider ||
+			!configuredProviders.some((provider) => provider.id === defaultAiProvider)
+		) {
+			defaultAiProvider = configuredProviders[0].id;
+			defaultAiModel = '';
+		}
+
+		await loadDefaultAiModels(defaultAiProvider);
+		if (defaultAiModel && !defaultAiModels.includes(defaultAiModel)) {
+			defaultAiModel = '';
+		}
+	}
+
+	async function onDefaultAiProviderChange() {
+		defaultAiModel = '';
+		await loadDefaultAiModels(defaultAiProvider);
+	}
+
+	async function saveDefaultAiSettings(event: SubmitEvent) {
+		event.preventDefault();
+		if (!defaultAiProvider) {
+			addToast('error', $t('settings.default_ai_provider_required'));
+			return;
+		}
+
+		isSavingDefaultAiSettings = true;
+		try {
+			const res = await fetch('/api/integrations/ai-settings/', {
+				method: 'POST',
+				headers: {
+					'Content-Type': 'application/json'
+				},
+				body: JSON.stringify({
+					preferred_provider: defaultAiProvider,
+					preferred_model: defaultAiModel || null
+				})
+			});
+
+			if (!res.ok) {
+				addToast('error', $t('settings.default_ai_settings_error'));
+				return;
+			}
+
+			const saved = await res.json();
+			defaultAiProvider = (saved.preferred_provider ?? '').trim().toLowerCase();
+			defaultAiModel = (saved.preferred_model ?? '').trim();
+			await loadDefaultAiModels(defaultAiProvider);
+			if (defaultAiModel && !defaultAiModels.includes(defaultAiModel)) {
+				defaultAiModel = '';
+			}
+			addToast('success', $t('settings.default_ai_settings_saved'));
+		} catch {
+			addToast('error', $t('settings.default_ai_settings_error'));
+		} finally {
+			isSavingDefaultAiSettings = false;
+		}
+	}
+
 	function getApiKeyProviderLabel(provider: string): string {
 		const catalogProvider = providerCatalog.find((entry) => entry.id === provider);
 		if (catalogProvider) {
@@ -133,7 +236,8 @@
 	];

 	onMount(async () => {
-		void loadProviderCatalog();
+		await loadProviderCatalog();
+		await initializeDefaultAiSettings();

 		if (browser) {
 			const queryParams = new URLSearchParams($page.url.search);
@@ -1570,6 +1674,71 @@
 			</div>
 		</div>

+		<div class="p-6 bg-base-200 rounded-xl mb-6">
+			<h3 class="text-lg font-semibold mb-2">
+				{$t('settings.default_ai_settings_title')}
+			</h3>
+			<p class="text-sm text-base-content/70 mb-4">
+				{$t('settings.default_ai_settings_desc')}
+			</p>
+
+			{#if getConfiguredChatProviders().length === 0}
+				<div class="alert alert-warning">
+					<span>{$t('settings.default_ai_no_providers')}</span>
+				</div>
+			{:else}
+				<form class="space-y-4" on:submit={saveDefaultAiSettings}>
+					<div class="form-control">
+						<label class="label" for="default-ai-provider">
+							<span class="label-text font-medium">{$t('settings.provider')}</span>
+						</label>
+						<select
+							id="default-ai-provider"
+							class="select select-bordered select-primary w-full"
+							bind:value={defaultAiProvider}
+							on:change={onDefaultAiProviderChange}
+						>
+							{#each getConfiguredChatProviders() as provider}
+								<option value={provider.id}>{provider.label}</option>
+							{/each}
+						</select>
+					</div>
+
+					<div class="form-control">
+						<label class="label" for="default-ai-model">
+							<span class="label-text font-medium">{$t('chat.model_label')}</span>
+						</label>
+						<select
+							id="default-ai-model"
+							class="select select-bordered select-primary w-full"
+							bind:value={defaultAiModel}
+							disabled={defaultAiModelsLoading}
+						>
+							<option value="">{$t('chat.model_placeholder')}</option>
+							{#if defaultAiModelsLoading}
+								<option value="" disabled>Loading...</option>
+							{:else}
+								{#each defaultAiModels as model}
+									<option value={model}>{model}</option>
+								{/each}
+							{/if}
+						</select>
+					</div>
+
+					<button
+						class="btn btn-primary"
+						type="submit"
+						disabled={isSavingDefaultAiSettings}
+					>
+						{#if isSavingDefaultAiSettings}
+							<span class="loading loading-spinner loading-sm"></span>
+						{/if}
+						{$t('settings.default_ai_save')}
+					</button>
+				</form>
+			{/if}
+		</div>
+
 		<div class="p-6 bg-base-200 rounded-xl mb-6">
 			<h3 class="text-lg font-semibold mb-2">MCP Access Token</h3>
 			<p class="text-sm text-base-content/70 mb-4">