fix(chat): add saved AI defaults and harden suggestions
.memory/decisions.md (new file)

@@ -0,0 +1,298 @@
# Voyage — Decisions Log

## Fork from AdventureLog

- **Decision**: Fork AdventureLog and rebrand as Voyage
- **Rationale**: Build on a proven foundation while adding an itinerary UI, OSRM routing, an LLM travel agent, and lodging logic
- **Date**: Project inception

## Docker-Only Backend Development

- **Decision**: Backend development requires Docker; local Python pip install is not supported
- **Rationale**: Complex GDAL/PostGIS dependencies; pip install fails with network timeouts
- **Impact**: All backend commands run via `docker compose exec server`

## API Proxy Pattern

- **Decision**: Frontend proxies all API calls through SvelteKit server routes
- **Rationale**: Handles CSRF tokens and session cookies transparently; avoids CORS issues
- **Reference**: See [knowledge/overview.md](knowledge/overview.md#api-proxy-pattern)

## Package Manager: Bun (Frontend)

- **Decision**: Use Bun as frontend package manager (bun.lock present)
- **Note**: npm scripts still used for build/lint/check commands

## Tooling Preference: Bun + uv

- **Decision**: Prefer `bun` for frontend workflows and `uv` for Python workflows.
- **Rationale**: User preference for faster, consistent package/runtime tooling.
- **Operational rule**:
  - Frontend: use `bun install` and `bun run <script>` by default.
  - Python: use `uv` for local Python dependency/tooling commands when applicable.
  - Docker-managed backend runtime commands (e.g. `docker compose exec server python3 manage.py ...`) remain unchanged unless project tooling is explicitly migrated.
- **Date**: 2026-03-08

## Workflow Preference: Commit + Merge When Done

- **Decision**: Once a requested branch workstream is complete and validated, commit and merge promptly (do not leave completed branches unmerged).
- **Rationale**: User preference for immediate integration and reduced branch drift.
- **Operational rule**:
  - Ensure quality checks pass for the completed change set.
  - Commit feature branch changes.
  - Merge into the target branch without delay.
  - Clean up merged worktrees/branches after merge.
- **Date**: 2026-03-08

## WS1-F2 Review: Remove standalone /chat route

- **Verdict**: APPROVED (score 0)
- **Scope**: Deletion of `frontend/src/routes/chat/+page.svelte`, removal of `/chat` nav item and `mdiRobotOutline` import from Navbar.svelte.
- **Findings**: No broken imports, navigation links, or route references remain. All `/chat` string matches in the codebase are `/api/chat/conversations/` backend API proxy calls (correct). Orphaned `navbar.chat` i18n key noted as a minor cleanup suggestion.
- **Reference**: See [Plan: AI travel agent](plans/ai-travel-agent-collections-integration.md#task-ws1-f2)
- **Date**: 2026-03-08

## WS1 Tester Validation: collections-ai-agent worktree

- **Status**: PASS (Both Standard + Adversarial passes)
- **Build**: `bun run build` artifacts validated via `.svelte-kit/adapter-node` and `build/` output. No `/chat` route in compiled manifest. `AITravelChat` SSR-inlined into collections page at `currentView === 'recommendations'` with `embedded: true`.
- **Key findings**:
  - All 12 i18n keys used in `AITravelChat.svelte` confirmed present in `en.json`.
  - No `mdiRobotOutline`, `/chat` href, or chat nav references in any source `.svelte` files.
  - Navbar.svelte contains zero chat or robot icon references.
  - `CollectionRecommendationView` still renders after `AITravelChat` in recommendations view.
  - Build output is current: adapter-node manifest has 29 nodes (0-28) with no `/chat` page route.
- **Adversarial**: 3 hypotheses tested (broken i18n keys, orphaned chat imports, missing embedded prop); all negative.
- **MUTATION_ESCAPES**: 1/5 (minor: `embedded` prop boolean default not type-enforced; runtime safe).
- **Reference**: See [Plan: AI travel agent](plans/ai-travel-agent-collections-integration.md)
- **Date**: 2026-03-08

## Consolidated Review: AI Travel Agent + Collections + Provider Catalog

- **Verdict**: APPROVED (score 0)
- **Scope**: Full consolidated implementation in `collections-ai-agent` worktree — backend provider catalog endpoint (`GET /api/chat/providers/`), `CHAT_PROVIDER_CONFIG` with OpenCode Zen, dynamic provider selectors in `AITravelChat.svelte` and `settings/+page.svelte`, `ChatProviderCatalogEntry` type, chat embedding in Collections Recommendations, `/chat` route removal.
- **Acceptance verification**:
  - AI chat embedded in Collections Recommendations: `collections/[id]/+page.svelte:1264` renders `<AITravelChat embedded={true} />` inside `currentView === 'recommendations'`.
  - No `/chat` route: `frontend/src/routes/chat/` directory absent, no Navbar chat/robot references.
  - All LiteLLM providers listed: `get_provider_catalog()` iterates `litellm.provider_list` (128 providers) + appends custom `CHAT_PROVIDER_CONFIG` entries.
  - OpenCode Zen supported: `opencode_zen` in `CHAT_PROVIDER_CONFIG` with `api_base=https://opencode.ai/zen/v1`, `default_model=openai/gpt-4o-mini`.
- **Security**: `IsAuthenticated` on all chat endpoints, `get_queryset` scoped to `user=self.request.user`, no IDOR risk, API keys never exposed in catalog response, provider IDs validated before use.
- **Prior findings confirmed**: WS1-F2 removal review, WS1 tester validation, LiteLLM provider research — all still valid and matching implementation.
- **Reference**: See [Plan: AI travel agent](plans/ai-travel-agent-collections-integration.md)
- **Date**: 2026-03-08

## Consolidated Tester Validation: collections-ai-agent worktree (Full Consolidation)

- **Status**: PASS (Both Standard + Adversarial passes)
- **Pipeline inputs validated**: Frontend build (bun run format+lint+check+build → PASS, 0 errors, 6 expected warnings); Backend system check (manage.py check → PASS, 0 issues).
- **Key findings**:
  - All 12 i18n keys in `AITravelChat.svelte` confirmed present in `en.json`.
  - No `/chat` route file, no Navbar `/chat` href or `mdiRobotOutline` in any `.svelte` source.
  - Only `/chat` references are API proxy calls (`/api/chat/...`) — correct.
  - `ChatProviderCatalogEntry` type defined in `types.ts`; used correctly in both `AITravelChat.svelte` and `settings/+page.svelte`.
  - `opencode_zen` in `CHAT_PROVIDER_CONFIG` with `api_base`, appended by the second loop in `get_provider_catalog()` since it is not in `litellm.provider_list`.
  - Provider validation in `send_message` view uses `is_chat_provider_available()` → 400 on invalid providers.
  - All agent tool functions scope DB queries to `user=user`.
  - `AITravelChat embedded={true}` correctly placed at `collections/[id]/+page.svelte:1264`.
- **Adversarial**: 5 hypotheses tested:
  1. `None`/empty provider_id → `_normalize_provider_id` returns `""` → `is_chat_provider_available` returns `False` → 400 (safe).
  2. Provider not in `CHAT_PROVIDER_CONFIG` → rejected at `send_message` level → 400 (correct).
  3. `opencode_zen` not in `litellm.provider_list` → catalog second loop covers it (correct).
  4. `tool_iterations` never incremented → `MAX_TOOL_ITERATIONS` guard is dead code; infinite tool loop theoretically possible — **pre-existing bug**, same pattern in `main` branch, not introduced by this change.
  5. `api_base` exposed in catalog response — pre-existing non-exploitable information leakage noted in prior security review.
- **MUTATION_ESCAPES**: 2/6 (tool_iterations dead guard; `embedded` boolean default not type-enforced — both pre-existing, runtime safe).
- **Lesson checks**: All prior WS1 + security review findings confirmed; no contradictions.
- **Reference**: See [Plan: AI travel agent](plans/ai-travel-agent-collections-integration.md)
- **Date**: 2026-03-08

## Consolidated Security Review: collections-ai-agent worktree

- **Verdict**: APPROVED (score 3)
- **Lens**: Correctness + Security
- **Scope**: Provider validation, API key handling, api_base/SSRF risks, auth/permission on provider catalog, /chat removal regressions.
- **Findings**:
  - WARNING: `api_base` field exposed in provider catalog response to frontend despite frontend never using it (`llm_client.py:112,141`). Non-exploitable (server-defined constants), but unnecessary information leakage. (confidence: MEDIUM)
  - No CRITICAL issues found.
- **Security verified**:
  - Provider IDs validated against `CHAT_PROVIDER_CONFIG` whitelist before any LLM call.
  - API keys Fernet-encrypted at rest, scoped to authenticated user, never returned in responses.
  - `api_base` is server-hardcoded only (no user input path) — no SSRF.
  - Provider catalog endpoint requires `IsAuthenticated`; returns same static catalog for all users.
  - Tool execution uses whitelist dispatch + allowed-kwargs filtering; all data queries scoped to `user=user`.
  - No IDOR: conversations filtered by user in queryset; tool operations filter/get by user.
- **Prior reviews confirmed**: WS1-F2 APPROVED and WS1 tester PASS findings remain consistent in consolidated branch.
- **Safe to proceed to testing**: Yes.
- **Reference**: See [Plan: AI travel agent](plans/ai-travel-agent-collections-integration.md)
- **Date**: 2026-03-08

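The whitelist dispatch + allowed-kwargs filtering verified above can be sketched as follows. This is a minimal illustration, not the project's actual dispatcher: the tool name, registry shape, and function signature are hypothetical stand-ins for what `agent_tools.py` does.

```python
# Hypothetical tool for illustration; real tools live in agent_tools.py
# and scope every DB query to user=user.
def search_locations(user, query: str, limit: int = 5):
    return f"results for {query!r} scoped to {user}"

# Whitelist: only registered names are callable, each with an
# explicit set of kwargs the model is allowed to pass.
TOOL_REGISTRY = {
    "search_locations": (search_locations, {"query", "limit"}),
}

def dispatch_tool(user, name: str, kwargs: dict) -> str:
    entry = TOOL_REGISTRY.get(name)
    if entry is None:
        # Unknown names are rejected; never getattr/eval on model output.
        return f"Unknown tool: {name}"
    func, allowed = entry
    # Drop any kwargs the model invented that the tool does not accept.
    filtered = {k: v for k, v in kwargs.items() if k in allowed}
    return func(user, **filtered)
```

The key property is that both the callable and its parameter surface come from a server-side table, so model output can only select among pre-approved (user-scoped) operations.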
## Critic Gate: OpenCode Zen Connection Error Fix

- **Verdict**: APPROVED
- **Scope**: Change default model from `openai/gpt-4o-mini` to `openai/gpt-5-nano`, improve error surfacing with sanitized messages, clean up `tool_choice`/`tools` kwargs — all in `backend/server/chat/llm_client.py`.
- **Key guardrails**: (1) Error surfacing must NOT forward raw `exc.message` — map LiteLLM exception types to safe user-facing categories. (2) `@mdi/js` build failure is a missing `bun install`, not a baseline issue — must run `bun install` before validation. (3) WSGI→ASGI migration and `sync_to_async` ORM fixes are explicitly out of scope.
- **Challenges accepted**: `gpt-5-nano` validity is research-based, not live-verified; mitigated by the error surfacing fix making any remaining mismatch diagnosable.
- **Files evaluated**: `backend/server/chat/llm_client.py:59-64,225-234,274-276`, `frontend/src/lib/components/AITravelChat.svelte:4`, `frontend/package.json:44`
- **Reference**: See [Plan: OpenCode Zen connection error](plans/opencode-zen-connection-error.md#critic-gate)
- **Date**: 2026-03-08

## Security Review: OpenCode Zen Connection Error + Model Selection

- **Verdict**: APPROVED (score 3)
- **Lens**: Security
- **Scope**: Sanitized error handling, model override input validation, auth/permission integrity on send_message, localStorage usage for model preferences.
- **Files reviewed**: `backend/server/chat/views.py`, `backend/server/chat/llm_client.py`, `frontend/src/lib/components/AITravelChat.svelte`, `backend/server/chat/agent_tools.py`, `backend/server/integrations/models.py`, `frontend/src/lib/types.ts`
- **Findings**: No CRITICAL issues. 1 WARNING: pre-existing `api_base` exposure in provider catalog response (carried forward from prior review, decisions.md:103). Error surfacing uses class-based dispatch to hardcoded safe strings — critic guardrail confirmed satisfied. Model input used only as JSON field to `litellm.acompletion()` — no injection surface. Auth/IDOR protections unchanged. localStorage stores only `{provider_id: model_string}` — no secrets.
- **Prior findings**: All confirmed consistent (api_base exposure, provider validation, IDOR scoping, error sanitization guardrail).
- **Reference**: See [Plan: OpenCode Zen connection error](plans/opencode-zen-connection-error.md#reviewer-security-verdict)
- **Date**: 2026-03-08

## Tester Validation: OpenCode Zen Model Selection + Error Surfacing

- **Status**: PASS (Both Standard + Adversarial passes)
- **Pipeline inputs validated**: `manage.py check` (PASS, 0 issues), `bun run check` (PASS, 0 errors, 6 pre-existing warnings), `bun run build` (Vite compilation PASS; EACCES on `build/` dir is a pre-existing Docker permission issue), backend 30 tests (6 pre-existing failures matching documented baseline).
- **Key targeted verifications**:
  - `opencode_zen` default model confirmed as `openai/gpt-5-nano` (changed from `gpt-4o-mini`).
  - `stream_chat_completion` accepts `model=None` with correct `None or default` fallback logic.
  - All empty/falsy model values (`""`, `" "`, `None`, `False`, `0`) produce `None` in views.py — default fallback engaged.
  - All 6 LiteLLM exception classes (`NotFoundError`, `AuthenticationError`, `RateLimitError`, `BadRequestError`, `Timeout`, `APIConnectionError`) produce sanitized hardcoded payloads — no raw exception text, `api_base`, or sensitive data leaked.
  - `_is_model_override_compatible` correctly bypasses the prefix check for `api_base` gateways (opencode_zen) and enforces the prefix for standard providers.
  - `tools`/`tool_choice` conditionally excluded from LiteLLM kwargs when `tools` is falsy.
  - i18n keys `chat.model_label` and `chat.model_placeholder` confirmed in `en.json`.
- **Adversarial**: 9 hypotheses tested; all negative (no failures). Notable: `openai\n` normalizes to `openai` via `strip()` — correct and consistent with views.py.
- **MUTATION_ESCAPES**: 0/5 — all 5 mutation checks detected by test cases.
- **Pre-existing issues** (not introduced): `_merge_tool_call_delta` has no upper bound on index (large-index DoS); `tool_iterations` never-incremented dead guard.
- **Reference**: See [Plan: OpenCode Zen connection error](plans/opencode-zen-connection-error.md#tester-verdict-standard--adversarial)
- **Date**: 2026-03-08

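The sanitized-payload behavior validated above can be sketched as a name-keyed dispatch table. This is an illustrative reconstruction, not the project's exact code: the real `_safe_error_payload` dispatches on LiteLLM exception classes, and the message strings here are stand-ins.

```python
# Hardcoded, user-safe messages keyed by exception class name.
# Class names match the six LiteLLM exception types listed above;
# the message wording is illustrative.
SAFE_MESSAGES = {
    "NotFoundError": "The selected model was not found for this provider.",
    "AuthenticationError": "Authentication with the provider failed. Check your API key.",
    "RateLimitError": "The provider is rate-limiting requests. Try again shortly.",
    "BadRequestError": "The provider rejected the request.",
    "Timeout": "The request to the provider timed out.",
    "APIConnectionError": "Could not connect to the provider.",
}

def safe_error_payload(exc: Exception) -> dict:
    """Map an exception to a hardcoded, user-safe payload.

    Never forwards str(exc) or exc.message, so raw provider text,
    api_base values, and keys can never leak to the client.
    """
    message = SAFE_MESSAGES.get(type(exc).__name__, "An unexpected error occurred.")
    return {"type": "error", "message": message}
```

Because every branch returns a constant string, no path exists by which provider error bodies reach the frontend — which is the critic guardrail this section confirms.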
## Correctness Review: chat-loop-hardening

- **Verdict**: APPROVED (score 6)
- **Lens**: Correctness
- **Scope**: Required-argument tool-error short-circuit in `send_message()` streaming loop, historical replay filtering in `_build_llm_messages()`, tool description improvements in `agent_tools.py`, and `tool_iterations` increment fix.
- **Files reviewed**: `backend/server/chat/views/__init__.py`, `backend/server/chat/agent_tools.py`, `backend/server/chat/llm_client.py` (no hardening changes — confirmed stable)
- **Acceptance criteria verification**:
  - AC1 (no repeated invalid-arg loops): ✓ — `_is_required_param_tool_error()` detects patterns via hardcoded set + regex. `return` exits the generator after the error event + `[DONE]`.
  - AC2 (error payloads not replayed): ✓ — short-circuit skips persistence; `_build_llm_messages()` filters historical tool-error messages.
  - AC3 (stream terminates coherently): ✓ — all 4 exit paths yield `[DONE]`.
  - AC4 (successful tool flows preserved): ✓ — new check is pass-through for non-error results.
- **Findings**:
  - WARNING: [views/__init__.py:389-401] Multi-tool-call orphan state. When the model returns N tool calls and call K (K>1) fails required-param validation, calls 1..K-1 are already persisted but the assistant message references all N tool_call IDs. The missing tool result causes LLM API errors on the next conversation turn (caught by `_safe_error_payload`). (confidence: HIGH)
  - WARNING: [views/__init__.py:64-69] `_build_llm_messages` filters tool-role error messages but does not trim the corresponding assistant `tool_calls` array, creating the same orphan for historical messages. (confidence: HIGH)
- **Suggestions**: `get_weather` error `"dates must be a non-empty list"` (agent_tools.py:601) does not match the `is/are required` regex. Mitigated by the `MAX_TOOL_ITERATIONS` guard.
- **Prior findings**: `tool_iterations` never-incremented bug (decisions.md:91,149) now fixed — line 349 increments correctly. Confirmed resolved.
- **Reference**: See [Plan: chat-provider-fixes](plans/chat-provider-fixes.md#follow-up-fixes)
- **Date**: 2026-03-09

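The "hardcoded set + regex" detection behind AC1 can be sketched as below. The pattern strings and function shape are illustrative assumptions; the real `_is_required_param_tool_error()` lives in the streaming loop's module.

```python
import re

# Illustrative substring patterns for required-argument tool failures.
_REQUIRED_PARAM_PATTERNS = {
    "missing required argument",
    "required positional argument",
}
# Catches messages of the form "<name> is required" / "<names> are required".
_REQUIRED_RE = re.compile(r"\b[\w.]+ (?:is|are) required\b", re.IGNORECASE)

def is_required_param_tool_error(tool_result: str) -> bool:
    """Return True when a tool result describes a missing required argument,
    so the streaming loop can emit one error event and stop retrying."""
    lowered = tool_result.lower()
    if any(p in lowered for p in _REQUIRED_PARAM_PATTERNS):
        return True
    return bool(_REQUIRED_RE.search(tool_result))
```

Note how a message like `"dates must be a non-empty list"` matches neither the set nor the regex — the same gap the suggestion above records for `get_weather`, mitigated by the iteration cap.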
## Correctness Review: OpenCode Zen Model Selection + Error Surfacing

- **Verdict**: APPROVED (score 0)
- **Lens**: Correctness
- **Scope**: Model selection in chat composer, per-provider browser persistence, optional model override to backend, error category mapping, and OpenCode Zen default model fix across 4 files.
- **Files reviewed**: `frontend/src/lib/components/AITravelChat.svelte`, `frontend/src/locales/en.json`, `backend/server/chat/views.py`, `backend/server/chat/llm_client.py`, `frontend/src/lib/types.ts`
- **Findings**: No CRITICAL or WARNING issues. Two optional SUGGESTIONS (debounce localStorage writes on model input; add a clarifying comment on the `getattr` fallback pattern in `_safe_error_payload`).
- **Verified paths**:
  - Model override end-to-end: frontend `trim() || undefined` → backend `strip() or None` → `stream_chat_completion(model=model)` → `completion_kwargs["model"] = model or default` — null/empty falls back correctly.
  - Per-provider persistence: `loadModelPref`/`saveModelPref` via `localStorage` with JSON parse error handling and SSR guards. Reactive blocks verified no infinite loop via `initializedModelProvider` sentinel.
  - Model-provider compatibility: `_is_model_override_compatible` skips validation for `api_base` gateways (OpenCode Zen), validates prefix for standard providers, allows bare model names.
  - Error surfacing: 6 LiteLLM exception types mapped to sanitized messages; no raw `exc.message` exposure; critic guardrail satisfied.
  - Tools/tool_choice: conditionally included only when `tools` is truthy; no `None` kwargs to LiteLLM.
  - i18n: `chat.model_label` and `chat.model_placeholder` confirmed in `en.json`.
  - Type safety: `ChatProviderCatalogEntry.default_model: string | null` handled with null-safe operators throughout.
- **Prior findings**: Critic gate guardrails (decisions.md:117-124) all confirmed followed. `api_base` catalog exposure (decisions.md:103) unchanged/pre-existing. `tool_iterations` never-incremented bug (decisions.md:91) pre-existing, not affected.
- **Reference**: See [Plan: OpenCode Zen connection error](plans/opencode-zen-connection-error.md#reviewer-correctness-verdict)
- **Date**: 2026-03-08

## Critic Gate: Travel Agent Context + Models Follow-up

- **Verdict**: APPROVED
- **Scope**: Three follow-up fixes — F1 (expand opencode_zen model dropdown), F2 (collection-level context injection instead of single-location), F3 (itinerary-centric quick-action prompts + `.places`→`.results` bug fix).
- **Key findings**: All source-level edit points verified current. F3a `.places`/`.results` key mismatch confirmed as a critical rendering bug (place cards never display). F2 `values_list("name")` alone is insufficient — `city__name`/`country__name` are needed for geocodable context. F1 model list should exclude reasoning models (`o1-preview`, `o1-mini`) pending tool-use compatibility verification.
- **Execution order**: F1 → F2 → F3 (F3 depends on F2's `deriveCollectionDestination` changes).
- **Files evaluated**: `backend/server/chat/views/__init__.py:144-168,417-418`, `backend/server/chat/llm_client.py:310-358`, `backend/server/chat/agent_tools.py:128,311-391`, `frontend/src/lib/components/AITravelChat.svelte:44,268,372-386,767-804`, `frontend/src/routes/collections/[id]/+page.svelte:259-280,1287-1294`, `backend/server/adventures/models.py:153-170,275-307`
- **Reference**: See [Plan: Travel agent context + models](plans/travel-agent-context-and-models.md#critic-gate)
- **Date**: 2026-03-09

## WS1 Configuration Infrastructure Backend Review

- **Verdict**: CHANGES-REQUESTED (score 6)
- **Lens**: Correctness + Security
- **Scope**: WS1 backend implementation — `settings.py` env vars, `llm_client.py` fallback chain + catalog enhancement, `UserAISettings` model/serializer/ViewSet/migration, provider catalog user passthrough in `chat/views.py`.
- **Findings**:
  - WARNING: Redundant instance-key fallback in `stream_chat_completion()` at `llm_client.py:328-331`. `get_llm_api_key()` (lines 262-266) already implements identical fallback logic. The duplicate creates divergence risk. (confidence: HIGH)
  - WARNING: `VOYAGE_AI_MODEL` env var defined at `settings.py:408` but never consumed by any code. Instance admins who set it will see no effect — model selection uses `CHAT_PROVIDER_CONFIG[provider]["default_model"]` or the user override. This false promise creates a support burden. (confidence: HIGH)
- **Security verified**:
  - Instance API key (`VOYAGE_AI_API_KEY`) only returned when provider matches `VOYAGE_AI_PROVIDER` — no cross-provider key leakage.
  - `UserAISettings` endpoint requires `IsAuthenticated`; queryset scoped to `request.user`; no IDOR.
  - Catalog `instance_configured`/`user_configured` booleans expose only presence (not key values) — safe.
  - N+1 avoided: single `values_list()` prefetch for user API keys in `get_provider_catalog()`.
  - Migration correctly depends on `0007_userapikey_userrecommendationpreferenceprofile` + swappable `AUTH_USER_MODEL`.
  - ViewSet follows the exact pattern of the existing `UserRecommendationPreferenceProfileViewSet` (singleton upsert via `_upserted_instance`).
- **Suggestions**: (1) `ModelViewSet` exposes unneeded DELETE/PUT/PATCH — could restrict to Create+List mixins. (2) `preferred_model` max_length=100 may be tight for future model names.
- **Next**: Remove redundant fallback lines 328-331 in `llm_client.py`. Wire `VOYAGE_AI_MODEL` into model resolution or remove it from settings.
- **Prior findings**: `api_base` catalog exposure (decisions.md:103) still pre-existing. `_upserted_instance` thread-safety pattern consistent with existing code — pre-existing, not new.
- **Reference**: See [Plan: AI travel agent redesign](plans/ai-travel-agent-redesign.md#ws1-configuration-infrastructure)
- **Date**: 2026-03-08

## Correctness Review: suggestion-add-flow

- **Verdict**: APPROVED (score 3)
- **Lens**: Correctness
- **Scope**: Day suggestions provider/model resolution, suggestion normalization, add-item flow creating location + itinerary entry.
- **Files reviewed**: `backend/server/chat/views/day_suggestions.py`, `frontend/src/lib/components/collections/ItinerarySuggestionModal.svelte`, plus cross-referenced `llm_client.py`, `location_view.py`, `models.py`, `serializers.py`, `CollectionItineraryPlanner.svelte`.
- **Findings**:
  - WARNING: Hardcoded `"gpt-4o-mini"` fallback at `day_suggestions.py:251` — if the provider config has no `default_model` and no model is resolved, this falls back to an OpenAI model string even for non-OpenAI providers. Contradicts the "no hardcoded OpenAI" acceptance criterion at the deep fallback layer. (confidence: HIGH)
  - No CRITICAL issues.
- **Verified paths**:
  - Provider/model resolution follows correct precedence: request → UserAISettings → VOYAGE_AI_PROVIDER/MODEL → provider config default. `VOYAGE_AI_MODEL` is now consumed (resolves the prior WARNING from decisions.md:186).
  - Add-item flow: `handleAddSuggestion` → `buildLocationPayload` → POST `/api/locations/` (name/description/location/rating/collections/is_public) → `dispatch('addItem', {type, itemId, updateDate})` → parent `addItineraryItemForObject`. Event shape matches the parent handler exactly.
  - Normalization: `normalizeSuggestionItem` handles LLM response variants (title/place_name/venue, summary/details, address/neighborhood) defensively. Items without a resolvable name are dropped. `normalizeRating` safely extracts numeric values. Not overly broad.
  - Auth: `IsAuthenticated` + collection owner/shared_with check. CSRF handled by API proxy. No IDOR.
- **Next**: Replace `or "gpt-4o-mini"` on line 251 with a raise or log if no model resolved, removing the last OpenAI-specific hardcoding.
- **Reference**: See [Plan: Chat provider fixes](plans/chat-provider-fixes.md#suggestion-add-flow)
- **Date**: 2026-03-09

## Correctness Review: default-ai-settings

- **Verdict**: APPROVED (score 0)
- **Lens**: Correctness + Security
- **Scope**: DB-backed default AI provider/model settings — Settings UI save/reload, Chat component initialization from saved defaults, backend send_message fallback, localStorage override prevention.
- **Files reviewed**: `frontend/src/routes/settings/+page.server.ts` (lines 112-121, 146), `frontend/src/routes/settings/+page.svelte` (lines 50-173, 237-239, 1676-1733), `frontend/src/lib/components/AITravelChat.svelte` (lines 82-134, 199-212), `backend/server/chat/views/__init__.py` (lines 183-216), `backend/server/integrations/views/ai_settings_view.py`, `backend/server/integrations/serializers.py` (lines 104-114), `backend/server/integrations/models.py` (lines 129-146), `frontend/src/lib/types.ts`, `frontend/src/locales/en.json`.
- **Acceptance criteria**:
  1. ✅ Settings UI save/reload: server-side loads `aiSettings` (page.server.ts:112-121), frontend initializes with normalization (page.svelte:50-51), saves via POST with re-validation (page.svelte:135-173), template renders provider/model selects (page.svelte:1676-1733).
  2. ✅ Chat initializes from saved defaults: `loadUserAISettings()` fetches from DB (AITravelChat:87-107), `applyInitialDefaults()` applies with validation (AITravelChat:109-134).
  3. ✅ localStorage doesn't override DB: `saveModelPref()` writes only (AITravelChat:199-212); old `loadModelPref()` reader removed.
  4. ✅ Backend fallback safe: requested → preferred (if available) → "openai" (views/__init__.py:195-201); model gated by `provider == preferred_provider` (views/__init__.py:204).
- **Verified paths**:
  - Provider normalization consistent (`.trim().toLowerCase()`) across settings, chat, backend. Model normalization (`.trim()` only) correct — model IDs are case-sensitive.
  - Upsert semantics correct: `perform_create` checks for existing, updates in place. Returns 200 OK; frontend checks `res.ok`. Matches the `OneToOneField` constraint.
  - CSRF: transparent via API proxy. Auth: `IsAuthenticated` + user-scoped queryset. No IDOR.
  - Empty/null edge cases: `preferred_model: defaultAiModel || null` sends null for empty. Backend `or ""` normalization handles None. Robust.
  - Stale provider/model: validated against configured providers (page.svelte:119) and loaded models (page.svelte:125-127); falls back correctly.
  - Async ordering: sequential awaits correct (loadProviderCatalog → initializeDefaultAiSettings; Promise.all → applyInitialDefaults).
  - Race prevention: `initialDefaultsApplied` flag, `loadedModelsForProvider` guard.
  - Contract: serializer fields match the frontend `UserAISettings` type. POST body matches serializer.
- **No CRITICAL or WARNING findings.**
- **Prior findings confirmed**: `preferred_model` max_length=100 and `ModelViewSet` excess methods (decisions.md:212) remain pre-existing, not introduced here.
- **Reference**: See [Plan: Chat provider fixes](plans/chat-provider-fixes.md#default-ai-settings)
- **Date**: 2026-03-09

## Re-Review: suggestion-add-flow (OpenAI fallback removal)

- **Verdict**: APPROVED (score 0)
- **Lens**: Correctness (scoped re-review)
- **Scope**: Verification that the WARNING from decisions.md:224 (hardcoded `or "gpt-4o-mini"` fallback in `_get_suggestions_from_llm`) is resolved, and no new issues introduced.
- **Original finding resolved**: ✅ — `day_suggestions.py:251` now reads `resolved_model = model or provider_config.get("default_model")` with no OpenAI fallback. Lines 252-253 raise `ValueError("No model configured for provider")` if `resolved_model` is falsy. Grep confirms zero `gpt-4o-mini` occurrences in `backend/server/chat/`.
- **No new issues introduced**:
  - `ValueError` at line 253 is safely caught by `except Exception` at line 87, returning a generic 500 response.
  - `CHAT_PROVIDER_CONFIG.get(provider, {})` at line 250 handles a `None` provider safely (returns `{}`).
  - Double-resolution of `provider_config` (once in `_resolve_provider_and_model:228`, again in `_get_suggestions_from_llm:250`) is redundant but harmless — a defensive fallback consistent with the streaming chat path.
  - Provider resolution chain at lines 200-241 intact: request → user settings → instance settings → OpenAI availability check. Model gated by `provider == preferred_provider` (line 237) prevents cross-provider model mismatches.
- **Reference**: See [Plan: Chat provider fixes](plans/chat-provider-fixes.md#suggestion-add-flow), prior finding at decisions.md:224
- **Date**: 2026-03-09

## Re-Review: chat-loop-hardening multi-tool-call orphan fix

- **Verdict**: APPROVED (score 0)
- **Lens**: Correctness (targeted re-review)
- **Scope**: Fix for multi-tool-call partial failure orphaned context — `_build_llm_messages()` trimming and `send_message()` successful-prefix persistence.
- **Original findings status**:
  - WARNING (decisions.md:164): Multi-tool-call orphan in streaming loop — **RESOLVED**. `send_message()` now accumulates `successful_tool_calls`/`successful_tool_messages` and persists only those on required-arg failure (lines 365-426). First-call failure correctly omits `tool_calls` from the assistant message entirely (line 395 guard).
  - WARNING (decisions.md:165): `_build_llm_messages` assistant `tool_calls` not trimmed — **RESOLVED**. Lines 59-65 build `valid_tool_call_ids` from non-error tool messages; lines 85-91 filter assistant `tool_calls` to only matching IDs; an empty result omits the `tool_calls` key entirely.
- **New issues introduced**: None. Defensive null handling (`(tool_call or {}).get("id")`) correct. No duplicate persistence risk (failure path returns, success path continues). In-memory `current_messages` and persisted messages stay consistent.
- **Reference**: See [Plan: Chat provider fixes](plans/chat-provider-fixes.md#chat-loop-hardening)
- **Date**: 2026-03-09

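The trimming resolved above can be sketched as a standalone pass over chat-completion-style message dicts. The message shapes (`role`, `tool_call_id`, `is_error`) are illustrative assumptions; the real `_build_llm_messages()` works over persisted conversation rows.

```python
def trim_orphan_tool_calls(messages: list[dict]) -> list[dict]:
    """Drop tool-error results and trim each assistant message's tool_calls
    to the IDs that still have a matching (non-error) tool result, so the
    LLM never sees a tool_call ID without its result."""
    valid_ids = {
        m.get("tool_call_id")
        for m in messages
        if m.get("role") == "tool" and not m.get("is_error")
    }
    out = []
    for m in messages:
        if m.get("role") == "tool" and m.get("is_error"):
            continue  # never replay tool-error payloads to the LLM
        if m.get("role") == "assistant" and m.get("tool_calls"):
            kept = [tc for tc in m["tool_calls"] if (tc or {}).get("id") in valid_ids]
            m = dict(m)  # copy so the stored history is untouched
            if kept:
                m["tool_calls"] = kept
            else:
                m.pop("tool_calls")  # omit the key entirely when nothing survives
        out.append(m)
    return out
```

Filtering both sides (the tool-error result and the assistant's reference to it) is the point of the fix: dropping only one half leaves the orphan that the original WARNINGs described.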
## Re-Review: normalize_gateway_model + day-suggestions error handling
- **Verdict**: APPROVED (score 3)
- **Lens**: Correctness
- **Scope**: `normalize_gateway_model()` helper in `llm_client.py`, its integration in both `stream_chat_completion()` and `DaySuggestionsView._get_suggestions_from_llm()`, `_safe_error_payload` adoption in day suggestions, `temperature` kwarg removal, and exception logging addition.
- **Changes verified**:
  - `normalize_gateway_model` correctly prefixes bare `opencode_zen` model IDs with `openai/`, passes through all other models, and returns `None` for empty/None input.
  - `stream_chat_completion:420` calls `normalize_gateway_model` after model resolution but before the `supports_function_calling` check — correct ordering.
  - `day_suggestions.py:266-271` normalizes the resolved model and guards against `None` with `ValueError` — resolves the prior WARNING about the hardcoded `gpt-4o-mini` fallback (decisions.md:224).
  - `day_suggestions.py:93-106` uses `_safe_error_payload` with a status-code mapping dict — LiteLLM exceptions get appropriate HTTP codes (400/401/429/503); `ValueError` falls through to a generic 500.
  - `temperature` kwarg confirmed absent from `completion_kwargs` — resolves `UnsupportedParamsError` on `gpt-5-nano`.
  - `logger.exception` at line 94 ensures full tracebacks for debugging.
- **Findings**:
  - WARNING: `stream_chat_completion:420` has no `None` guard on the `normalize_gateway_model` return, unlike `day_suggestions.py:270-271`. Currently unreachable (the resolution chain always yields a non-empty model from `CHAT_PROVIDER_CONFIG`), but a defensive guard would make the contract explicit. (confidence: LOW)
- **Prior findings**: hardcoded `gpt-4o-mini` WARNING (decisions.md:224) confirmed resolved. `_safe_error_payload` sanitization guardrail (decisions.md:120) confirmed satisfied.
- **Reference**: See [Plan: Chat provider fixes](plans/chat-provider-fixes.md#suggestion-add-flow)
- **Date**: 2026-03-09

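A minimal sketch of the helper's verified behavior. The two-argument signature and the bare-ID check (`"/" not in model`) are assumptions; the real `normalize_gateway_model` in `llm_client.py` may differ in detail:

```python
def normalize_gateway_model(provider, model):
    """Hypothetical sketch: prefix bare opencode_zen model IDs with
    "openai/" so LiteLLM routes them through its OpenAI-compatible
    client; pass every other model through; None for empty input."""
    if not model:
        return None
    if provider == "opencode_zen" and "/" not in model:
        return f"openai/{model}"
    return model
```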
0
.memory/gates/.gitkeep
Normal file
13
.memory/knowledge.md
Normal file
@@ -0,0 +1,13 @@
# DEPRECATED — Migrated to nested structure (2026-03-09)

This file has been superseded. Content has been migrated to:

- **[system.md](system.md)** — Project overview (one paragraph)
- **[knowledge/overview.md](knowledge/overview.md)** — Architecture, services, auth, file locations
- **[knowledge/tech-stack.md](knowledge/tech-stack.md)** — Stack, dev commands, environment, known issues
- **[knowledge/conventions.md](knowledge/conventions.md)** — Coding conventions and workflow rules
- **[knowledge/patterns/chat-and-llm.md](knowledge/patterns/chat-and-llm.md)** — Chat/LLM patterns, agent tools, error mapping, OpenCode Zen
- **[knowledge/domain/collections-and-sharing.md](knowledge/domain/collections-and-sharing.md)** — Collection sharing, itinerary, user preferences
- **[knowledge/domain/ai-configuration.md](knowledge/domain/ai-configuration.md)** — WS1 config infrastructure, frontend gaps

See [manifest.yaml](manifest.yaml) for the full index.

20
.memory/knowledge/conventions.md
Normal file
@@ -0,0 +1,20 @@
# Coding Conventions & Patterns

## Frontend Patterns
- **i18n**: Use `$t('key')` from `svelte-i18n`; add keys to locale files
- **API calls**: Always use `credentials: 'include'` and `X-CSRFToken` header
- **Svelte reactivity**: Reassign to trigger: `items[i] = updated; items = items`
- **Styling**: DaisyUI semantic classes + Tailwind utilities; use `bg-primary`, `text-base-content` not raw colors
- **Maps**: `svelte-maplibre` with MapLibre GL; GeoJSON data

## Backend Patterns
- **Views**: DRF `ModelViewSet` subclasses; `get_queryset()` filters by `user=self.request.user`
- **Money**: `djmoney` MoneyField
- **Geo**: PostGIS via `django-geojson`
- **Chat providers**: Dynamic catalog from `GET /api/chat/providers/`; configured in `CHAT_PROVIDER_CONFIG`

## Workflow Conventions
- Do **not** attempt to fix known test/configuration issues as part of feature work
- Use `bun` for frontend commands, `uv` for local Python tooling where applicable
- Commit and merge completed feature branches promptly once validation passes (avoid leaving finished work unmerged)
- See [decisions.md](../decisions.md#workflow-preference-commit--merge-when-done) for rationale

44
.memory/knowledge/domain/ai-configuration.md
Normal file
@@ -0,0 +1,44 @@
# AI Configuration Domain

## WS1 Configuration Infrastructure

### WS1-F1: Instance-level env vars and key fallback
- `settings.py`: `VOYAGE_AI_PROVIDER`, `VOYAGE_AI_MODEL`, `VOYAGE_AI_API_KEY`
- `get_llm_api_key(user, provider)` falls back to instance key only when provider matches `VOYAGE_AI_PROVIDER`
- Fallback chain: user key -> matching-provider instance key -> error
- See [tech-stack.md](../tech-stack.md#server-side-env-vars-from-settingspy), [decisions.md](../../decisions.md#ws1-configuration-infrastructure-backend-review)

### WS1-F2: UserAISettings model
- `integrations/models.py`: `UserAISettings` (OneToOneField to user) with `preferred_provider` and `preferred_model`
- Endpoint: `/api/integrations/ai-settings/` (upsert pattern)
- Migration: `0008_useraisettings.py`

### WS1-F3: Provider catalog enhancement
- `get_provider_catalog(user=None)` adds `instance_configured` and `user_configured` booleans
- User API keys prefetched once per request (no N+1)
- `ChatProviderCatalogEntry` TypeScript type updated with both fields

### Frontend Provider Selection (Fixed)
- No longer hardcodes `selectedProvider = 'openai'`; auto-selects first usable provider
- Filtered to configured+usable entries only (`available_for_chat && (user_configured || instance_configured)`)
- Warning alert + Settings link when no providers configured
- Model selection uses dropdown from `GET /api/chat/providers/{provider}/models/`

## Known Frontend Gaps

### Root Cause of User-Facing LLM Errors
Five compounding issues (all but one resolved):
1. ~~Hardcoded `'openai'` default~~ (fixed: auto-selects first usable)
2. ~~No provider status feedback~~ (fixed: catalog fields consumed)
3. ~~`UserAISettings.preferred_provider` never loaded~~ (fixed: Settings UI saves/loads DB defaults; chat initializes from saved prefs)
4. `FIELD_ENCRYPTION_KEY` not set disables key storage (env-dependent, still open)
5. ~~TypeScript type missing fields~~ (fixed)

## Key Edit Reference Points

| Feature | File | Location |
|---|---|---|
| AI env vars | `backend/server/main/settings.py` | after `FIELD_ENCRYPTION_KEY` |
| Fallback key | `backend/server/chat/llm_client.py` | `get_llm_api_key()` |
| UserAISettings model | `backend/server/integrations/models.py` | after UserAPIKey |
| Catalog user flags | `backend/server/chat/llm_client.py` | `get_provider_catalog()` |
| Provider view | `backend/server/chat/views/__init__.py` | `ChatProviderCatalogViewSet` |

62
.memory/knowledge/domain/collections-and-sharing.md
Normal file
@@ -0,0 +1,62 @@
# Collections & Sharing Domain

## Collection Sharing Architecture

### Data Model
- `Collection.shared_with` - `ManyToManyField(User, related_name='shared_with', blank=True)` - the primary access grant
- `CollectionInvite` - staging table for pending invites: `(collection FK, invited_user FK, unique_together)`; prevents self-invite and double-invite
- Both models in `backend/server/adventures/models.py`

### Access Flow (Invite -> Accept -> Membership)
1. Owner calls `POST /api/collections/{id}/share/{uuid}/` -> creates `CollectionInvite`
2. Invited user sees pending invites via `GET /api/collections/invites/`
3. Invited user calls `POST /api/collections/{id}/accept-invite/` -> adds to `shared_with`, deletes invite
4. Owner revokes: `POST /api/collections/{id}/revoke-invite/{uuid}/`
5. User declines: `POST /api/collections/{id}/decline-invite/`
6. Owner removes: `POST /api/collections/{id}/unshare/{uuid}/` - removes user's locations from collection
7. User self-removes: `POST /api/collections/{id}/leave/` - removes their locations

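Step 3 of the flow above (accepting moves the user from the invite staging table into `shared_with` and deletes the invite) can be sketched with in-memory stand-ins. The function name, the tuple-set invite table, and the dict M2M are hypothetical, not the Django implementation:

```python
def accept_invite(invites, shared_with, collection_id, user):
    """Hypothetical in-memory sketch of step 3: membership moves from
    the CollectionInvite staging table into shared_with, and the
    invite row is deleted."""
    key = (collection_id, user)
    if key not in invites:
        raise PermissionError("no pending invite for this user")
    invites.remove(key)  # invite row deleted on acceptance
    shared_with.setdefault(collection_id, set()).add(user)
```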
### Permission Classes
- `CollectionShared` - full access for owner and `shared_with` members; invite actions for invitees; anonymous read for `is_public`
- `IsOwnerOrSharedWithFullAccess` - child-object viewsets; traverses `obj.collections`/`obj.collection` to check `shared_with`
- `ContentImagePermission` - delegates to `IsOwnerOrSharedWithFullAccess` on `content_object`

### Key Design Constraints
- No role differentiation - all shared users have same write access
- On unshare/leave, departing user's locations are removed from collection (not deleted)
- `duplicate` action creates a private copy with no `shared_with` transfer

## Itinerary Architecture

### Primary Component
`CollectionItineraryPlanner.svelte` (~3818 lines) - monolith rendering the entire itinerary view and all modals.

### The "Add" Button
DaisyUI dropdown at bottom of each day card with "Link existing item" and "Create new" submenu (Location, Lodging, Transportation, Note, Checklist).

### Day Suggestions Modal (WS3)
- Component: `ItinerarySuggestionModal.svelte`
- Props: `collection`, `user`, `targetDate`, `displayDate`
- Events: `close`, `addItem { type: 'location', itemId, updateDate }`
- 3-step UX: category selection -> filters -> results from `POST /api/chat/suggestions/day/`

## User Recommendation Preference Profile
Backend-only feature: model, API, and system-prompt integration exist, but **no frontend UI** built yet.

### Auto-learned preference updates
- `backend/server/integrations/utils/auto_profile.py` derives profile from user history
- Fields: `interests` (from activities + locations), `trip_style` (from lodging), `notes` (frequently visited regions)
- `ChatViewSet.send_message()` invokes `update_auto_preference_profile(request.user)` at method start

### Model
`UserRecommendationPreferenceProfile` (OneToOne -> `CustomUser`): `cuisines`, `interests` (JSONField), `trip_style`, `notes`, timestamps.

### System Prompt Integration
- Single-user: labeled as auto-inferred with structured markdown section
- Shared collections: `get_aggregated_preferences(collection)` appends per-participant preferences
- Missing profiles skipped gracefully

### Frontend Gap
- No settings tab for manual preference editing
- TypeScript type available as `UserRecommendationPreferenceProfile` in `src/lib/types.ts`
- See [Plan: AI travel agent redesign](../../plans/ai-travel-agent-redesign.md#ws2-user-preference-learning)

40
.memory/knowledge/overview.md
Normal file
@@ -0,0 +1,40 @@
# Architecture Overview

## API Proxy Pattern
Frontend never calls Django directly. All API calls go through `src/routes/api/[...path]/+server.ts` → Django at `http://server:8000`. Frontend uses relative URLs like `/api/locations/`.

## AI Chat (Collections → Recommendations)
- AI travel chat is embedded in **Collections → Recommendations** via `AITravelChat.svelte` (no standalone `/chat` route).
- Provider selector loads dynamically from `GET /api/chat/providers/` (backed by `litellm.provider_list` + `CHAT_PROVIDER_CONFIG` in `backend/server/chat/llm_client.py`).
- Supported configured providers: OpenAI, Anthropic, Gemini, Ollama, Groq, Mistral, GitHub Models, OpenRouter, OpenCode Zen (`opencode_zen`, `api_base=https://opencode.ai/zen/v1`, default model `openai/gpt-5-nano`).
- Chat conversations stream via SSE through `/api/chat/conversations/`.
- `ChatViewSet.send_message()` accepts optional context fields (`collection_id`, `collection_name`, `start_date`, `end_date`, `destination`) and appends a `## Trip Context` section to the system prompt when provided. When a `collection_id` is present, also injects `Itinerary stops:` from `collection.locations` (up to 8 unique stops). See [patterns/chat-and-llm.md](patterns/chat-and-llm.md#multi-stop-context-derivation).
- Chat composer supports per-provider model override (persisted in browser `localStorage` key `voyage_chat_model_prefs`). DB-saved default provider/model (`UserAISettings`) is authoritative on initialization; localStorage is a write-only sync target. Backend `send_message` accepts optional `model` param; falls back to DB defaults → instance defaults → `"openai"`.
- Invalid required-argument tool calls are detected and short-circuited: stream terminates with `tool_validation_error` SSE event + `[DONE]` and invalid tool results are not replayed into conversation history. See [patterns/chat-and-llm.md](patterns/chat-and-llm.md#tool-call-error-handling-chat-loop-hardening).
- LiteLLM errors mapped to sanitized user-safe messages via `_safe_error_payload()` (never exposes raw exception text). See [patterns/chat-and-llm.md](patterns/chat-and-llm.md#sanitized-llm-error-mapping).
- Frontend type: `ChatProviderCatalogEntry` in `src/lib/types.ts`.
- Reference: [Plan: AI travel agent](../plans/ai-travel-agent-collections-integration.md), [Plan: AI travel agent redesign — WS4](../plans/ai-travel-agent-redesign.md#ws4-collection-level-chat-improvements)

## Services (Docker Compose)
| Service | Container | Port |
|---------|-----------|------|
| Frontend | `web` | :8015 |
| Backend | `server` | :8016 |
| Database | `db` | :5432 |
| Cache | `cache` | internal |

## Authentication
Session-based via `django-allauth`. CSRF tokens from `/auth/csrf/`, passed as `X-CSRFToken` header. Mobile clients use `X-Session-Token` header.

## Key File Locations
- Frontend source: `frontend/src/`
- Backend source: `backend/server/`
- Django apps: `adventures/`, `users/`, `worldtravel/`, `integrations/`, `achievements/`, `chat/`
- Chat LLM config: `backend/server/chat/llm_client.py` (`CHAT_PROVIDER_CONFIG`)
- AI Chat component: `frontend/src/lib/components/AITravelChat.svelte`
- Types: `frontend/src/lib/types.ts`
- API proxy: `frontend/src/routes/api/[...path]/+server.ts`
- i18n: `frontend/src/locales/`
- Docker config: `docker-compose.yml`, `docker-compose.dev.yml`
- CI/CD: `.github/workflows/`
- Public docs: `documentation/` (VitePress)

133
.memory/knowledge/patterns/chat-and-llm.md
Normal file
@@ -0,0 +1,133 @@
# Chat & LLM Patterns

## Default AI Settings & Model Override

### DB-backed defaults (authoritative)
- **Model**: `UserAISettings` (OneToOneField, `integrations/models.py`) stores `preferred_provider` and `preferred_model` per user.
- **Endpoint**: `GET/POST /api/integrations/ai-settings/` — upsert pattern (OneToOneField + `perform_create` update-or-create).
- **Settings UI**: `settings/+page.svelte` loads/saves default provider and model. Provider dropdown filtered to configured providers; model dropdown from `GET /api/chat/providers/{provider}/models/`.
- **Chat initialization**: `AITravelChat.svelte` `loadUserAISettings()` fetches saved defaults on mount and applies them as authoritative initial provider/model. Direction is DB → localStorage (not reverse).
- **Backend fallback precedence** in `send_message()`:
  1. Explicit request payload (`provider`, `model`)
  2. `UserAISettings.preferred_provider` / `preferred_model` (only when provider matches)
  3. Instance defaults (`VOYAGE_AI_PROVIDER`, `VOYAGE_AI_MODEL`)
  4. `"openai"` hardcoded fallback
- **Cross-provider guard**: `preferred_model` only applied when resolved provider == `preferred_provider` (prevents e.g. `gpt-5-nano` leaking to Anthropic).

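The four-step precedence plus the cross-provider guard can be sketched as follows, using plain dicts as stand-ins for the request payload, the `UserAISettings` row, and the instance env vars. The helper name and shapes are assumptions, not the actual `send_message()` code:

```python
def resolve_provider_and_model(payload, saved, instance):
    """Hypothetical stand-in for the precedence above; `payload` is the
    request body, `saved` the UserAISettings row, `instance` the
    VOYAGE_AI_* env defaults (all as plain dicts)."""
    provider = (payload.get("provider")
                or saved.get("preferred_provider")
                or instance.get("provider")
                or "openai")  # hardcoded last-resort fallback
    model = payload.get("model")
    if not model and provider == saved.get("preferred_provider"):
        model = saved.get("preferred_model")  # cross-provider guard
    if not model:
        model = instance.get("model")
    return provider, model
```

Note how the guard works in the sketch: a saved `preferred_model` is consulted only when the resolved provider equals `preferred_provider`, so a model saved for one provider never leaks into a request routed to another.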
### Per-session model override (browser-only)
- **Frontend**: model dropdown next to provider selector, populated by `GET /api/chat/providers/{provider}/models/`.
- **Persistence**: `localStorage` key `voyage_chat_model_prefs` — written on selection, but never overrides DB defaults on initialization (DB wins).
- **Compatibility guard**: `_is_model_override_compatible()` validates model prefix for standard providers; skips check for `api_base` gateways (e.g. `opencode_zen`).
- **i18n keys**: `chat.model_label`, `chat.model_placeholder`, `default_ai_settings_title`, `default_ai_settings_desc`, `default_ai_save`, `default_ai_settings_saved`, `default_ai_settings_error`, `default_ai_provider_required`, `default_ai_no_providers`.

## Sanitized LLM Error Mapping
- `_safe_error_payload()` in `backend/server/chat/llm_client.py` maps LiteLLM exception classes to hardcoded user-safe strings with `error_category` field.
- Exception classes mapped: `NotFoundError` -> "model not found", `AuthenticationError` -> "authentication", `RateLimitError` -> "rate limit", `BadRequestError` -> "bad request", `Timeout` -> "timeout", `APIConnectionError` -> "connection".
- Raw `exc.message`, `str(exc)`, and `exc.args` are **never** forwarded to the client. Server-side `logger.exception()` logs full details.
- Uses `getattr(litellm.exceptions, "ClassName", tuple())` for resilient class lookup.
- Security guardrail from critic gate: [decisions.md](../../decisions.md#critic-gate-opencode-zen-connection-error-fix).

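A simplified sketch of the mapping described above. It looks exceptions up by class name rather than via `getattr(litellm.exceptions, ...)` isinstance checks, so it stays runnable without LiteLLM installed; the message strings and the `safe_error_payload` name are illustrative assumptions:

```python
import logging

logger = logging.getLogger("chat")

# Hypothetical reconstruction of the mapping; category strings follow the
# list in this section, message text is illustrative.
_SAFE_MESSAGES = {
    "NotFoundError": ("model not found", "The selected model was not found."),
    "AuthenticationError": ("authentication", "Authentication with the provider failed."),
    "RateLimitError": ("rate limit", "The provider rate limit was hit; try again shortly."),
    "BadRequestError": ("bad request", "The provider rejected the request."),
    "Timeout": ("timeout", "The request to the provider timed out."),
    "APIConnectionError": ("connection", "Could not connect to the provider."),
}


def safe_error_payload(exc: Exception) -> dict:
    """Map an exception to a hardcoded, user-safe payload.

    Raw exception text (str(exc), exc.args) is logged server-side only
    and never placed in the returned payload.
    """
    logger.exception("LLM call failed")  # full details stay in server logs
    category, message = _SAFE_MESSAGES.get(
        type(exc).__name__, ("unknown", "An unexpected error occurred.")
    )
    return {"error": message, "error_category": category}
```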
## Tool Call Error Handling (Chat Loop Hardening)
- **Required-arg detection**: `_is_required_param_tool_error()` matches tool results containing `"is required"` / `"are required"` patterns via regex. Detects errors like `"location is required"`, `"query is required"`, `"collection_id, name, latitude, and longitude are required"`.
- **Short-circuit on invalid tool calls**: When a tool call returns a required-param error, `send_message()` yields an SSE error event with `error_category: "tool_validation_error"` and immediately terminates the stream with `[DONE]`. No further LLM turns are attempted.
- **Persistence skip**: Invalid tool call results (and the tool_call entry itself) are NOT persisted to the database, preventing replay into future conversation turns.
- **Historical cleanup**: `_build_llm_messages()` filters persisted tool-role messages containing required-param errors AND trims the corresponding assistant `tool_calls` array to only IDs that have non-filtered tool messages. Empty `tool_calls` arrays are omitted entirely.
- **Multi-tool partial success**: When model returns N tool calls and call K fails, calls 1..K-1 (the successful prefix) are persisted normally. Only the failed call and subsequent calls are dropped.
- **Tool iteration guard**: `MAX_TOOL_ITERATIONS = 10` with correctly-incremented counter prevents unbounded loops from other error classes (e.g. `"dates must be a non-empty list"` from `get_weather` does NOT match the required-arg regex but is bounded by iteration limit).
- **Known gap**: `get_weather` error `"dates must be a non-empty list"` does not trigger the short-circuit — mitigated by `MAX_TOOL_ITERATIONS`.

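The detection heuristic can be sketched as a small regex; the exact pattern in `_is_required_param_tool_error()` is not reproduced here, only the documented matching of `"is required"` / `"are required"`:

```python
import re

_REQUIRED_PARAM_RE = re.compile(r"\b(?:is|are) required\b")


def is_required_param_tool_error(tool_result):
    """Sketch of the documented heuristic: flag tool results that
    complain about missing required arguments."""
    return bool(_REQUIRED_PARAM_RE.search(tool_result))
```

As noted above, a message like `"dates must be a non-empty list"` deliberately falls outside this pattern and is bounded by the iteration limit instead.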
## OpenCode Zen Provider
- Provider ID: `opencode_zen`
- `api_base`: `https://opencode.ai/zen/v1`
- Default model: `openai/gpt-5-nano` (changed from `openai/gpt-4o-mini` which was invalid on Zen)
- GPT models on Zen use `/chat/completions` endpoint (OpenAI-compatible)
- LiteLLM `openai/` prefix routes through OpenAI client to the custom `api_base`
- Model dropdown exposes 5 curated options (reasoning models excluded). See [decisions.md](../../decisions.md#critic-gate-travel-agent-context--models-follow-up).

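The routing described above amounts to a call shape like the following. This is a hypothetical fragment: the actual kwargs assembly lives in `llm_client.py`, the placeholder key and message are invented, and the `litellm.completion` call is shown commented out so the fragment stands alone:

```python
# Hypothetical kwargs for routing an OpenCode Zen model through LiteLLM's
# OpenAI-compatible client; values mirror the provider entry above.
completion_kwargs = {
    # "openai/" prefix makes LiteLLM use its OpenAI client...
    "model": "openai/gpt-5-nano",
    # ...while api_base redirects that client to the Zen gateway.
    "api_base": "https://opencode.ai/zen/v1",
    "api_key": "<per-user key from UserAPIKey>",
    "messages": [{"role": "user", "content": "Plan two days in Lisbon."}],
}
# response = litellm.completion(**completion_kwargs)  # requires litellm
```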
## Multi-Stop Context Derivation
Chat context derives from the **full collection itinerary**, not just the first location.

### Frontend - `deriveCollectionDestination()`
- Located in `frontend/src/routes/collections/[id]/+page.svelte`.
- Extracts unique city/country pairs from `collection.locations`.
- Capped at 4 stops, semicolon-joined, with `+N more` overflow suffix.
- Passed to `AITravelChat` as `destination` prop.

### Backend - `send_message()` itinerary enrichment
- `backend/server/chat/views/__init__.py` `send_message()` reads `collection.locations` and injects `Itinerary stops:` into the system prompt `## Trip Context` section.
- Up to 8 unique stops; deduplication and blank-entry filtering applied.

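The enrichment can be sketched as a small dedupe-and-cap helper; the function name and the dict location shape are assumptions standing in for `collection.locations`:

```python
def derive_itinerary_stops(locations, limit=8):
    """Hypothetical helper: dedupe stop names (case-insensitive), drop
    blanks, cap at `limit`, and format the prompt line described above."""
    stops, seen = [], set()
    for loc in locations:
        name = (loc.get("name") or "").strip()
        if not name or name.lower() in seen:
            continue  # blank-entry filtering + deduplication
        seen.add(name.lower())
        stops.append(name)
        if len(stops) >= limit:
            break  # up to 8 unique stops by default
    return ("Itinerary stops: " + "; ".join(stops)) if stops else ""
```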
### System prompt - trip-level reasoning
- `get_system_prompt()` includes guidance to treat collection chats as itinerary-wide and call `get_trip_details` before `search_places`.

## Itinerary-Centric Quick Prompts
- Quick-action buttons use `promptTripContext` (reactive: `collectionName || destination || ''`) instead of raw `destination`.
- Guard changed from `{#if destination}` to `{#if promptTripContext}`.
- Prompt wording uses `across my ${promptTripContext} itinerary?`.

## search_places Tool Output Key Convention
- Backend `agent_tools.py` `search_places()` returns `{"location": ..., "category": ..., "results": [...]}`.
- Frontend must use `.results` key (not `.places`).
- **Historical bug**: Prior code used `.places` causing place cards to never render. Fixed 2026-03-09.

## Agent Tools Architecture

### Registered Tools
| Tool name | Purpose | Required params |
|---|---|---|
| `search_places` | Nominatim geocode -> Overpass PoI search | `location` |
| `web_search` | DuckDuckGo web search for current travel info | `query` |
| `list_trips` | List user's collections | (none) |
| `get_trip_details` | Full collection detail with itinerary | `collection_id` |
| `add_to_itinerary` | Create Location + CollectionItineraryItem | `collection_id`, `name`, `lat`, `lon` |
| `get_weather` | Open-Meteo archive + forecast | `latitude`, `longitude`, `dates` |

### Registry pattern
- `@agent_tool(name, description, parameters)` decorator registers function references and generates OpenAI/LiteLLM-compatible tool schemas.
- `execute_tool(tool_name, user, **kwargs)` resolves from registry and filters kwargs via `inspect.signature(...)`.
- Extensibility: adding a new tool only requires defining a decorated function.

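The registry pattern above can be sketched end to end. A toy `echo` tool stands in for the real tools, the schema shape follows the OpenAI tool-call convention, and the registry dict is a hypothetical simplification:

```python
import inspect

TOOL_REGISTRY = {}  # tool name -> (function, OpenAI-style schema)


def agent_tool(name, description, parameters):
    """Hypothetical sketch of the decorator: register the function plus
    a LiteLLM/OpenAI-compatible tool schema under `name`."""
    def wrap(fn):
        TOOL_REGISTRY[name] = (fn, {
            "type": "function",
            "function": {"name": name, "description": description,
                         "parameters": parameters},
        })
        return fn
    return wrap


def execute_tool(tool_name, user, **kwargs):
    """Resolve from the registry; drop kwargs the function won't accept.

    Simplification: real tools use `def tool(user, **kwargs)` signatures,
    so the real filter must also handle VAR_KEYWORD parameters; explicit
    parameters keep this sketch to a one-line filter.
    """
    fn, _schema = TOOL_REGISTRY[tool_name]
    accepted = inspect.signature(fn).parameters
    return fn(user, **{k: v for k, v in kwargs.items() if k in accepted})


@agent_tool("echo", "Echo a message back",
            {"type": "object", "properties": {"message": {"type": "string"}}})
def echo(user, message=None):
    return {"user": user, "message": message}
```

The `inspect.signature` filter is what lets the model pass stray arguments without crashing the tool call.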
### Function signature convention
All tool functions: `def tool_name(user, **kwargs) -> dict`. Return `{"error": "..."}` on failure; never raise.

### Web Search Tool
- Uses `duckduckgo_search.DDGS().text(..., max_results=5)`.
- Error handling includes import fallback, rate-limit guard, and generic failure logging.
- Dependency: `duckduckgo-search>=4.0.0` in `backend/server/requirements.txt`.

## Backend Chat Endpoint Architecture

### URL Routing
- `backend/server/main/urls.py`: `path("api/chat/", include("chat.urls"))`
- `backend/server/chat/urls.py`: DRF `DefaultRouter` registers `conversations/` -> `ChatViewSet`, `providers/` -> `ChatProviderCatalogViewSet`
- Manual paths: `POST /api/chat/suggestions/day/` -> `DaySuggestionsView`, `GET /api/chat/capabilities/` -> `CapabilitiesView`

### ChatViewSet Pattern
- All actions: `permission_classes = [IsAuthenticated]`
- Streaming response uses `StreamingHttpResponse(content_type="text/event-stream")`
- SSE chunk format: `data: {json}\n\n`; terminal `data: [DONE]\n\n`
- Tool loop: up to `MAX_TOOL_ITERATIONS = 10` rounds

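The SSE framing can be sketched as a generator; in the real view a generator like this would be wrapped in Django's `StreamingHttpResponse(..., content_type="text/event-stream")`:

```python
import json


def sse_stream(chunks):
    """Sketch of the SSE framing above: each payload becomes a
    `data: {json}` line followed by a blank line; the stream ends with
    the literal `data: [DONE]` sentinel."""
    for chunk in chunks:
        yield f"data: {json.dumps(chunk)}\n\n"
    yield "data: [DONE]\n\n"
```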
### Day Suggestions Endpoint
- `POST /api/chat/suggestions/day/` via `chat/views/day_suggestions.py`
- Non-streaming JSON response
- Inputs: `collection_id`, `date`, `category`, `filters`, `location_context`
- Provider/model resolution via `_resolve_provider_and_model()`: request payload → `UserAISettings` defaults → instance defaults (`VOYAGE_AI_PROVIDER`/`VOYAGE_AI_MODEL`) → provider config default. No hardcoded OpenAI fallback.
- Cross-provider model guard: `preferred_model` only applied when provider matches `preferred_provider`.
- LLM call via `litellm.completion` with regex JSON extraction fallback
- Suggestion normalization: frontend `normalizeSuggestionItem()` handles LLM response variants (title/place_name/venue, summary/details, address/neighborhood). Items without resolvable name are dropped.
- Add-to-itinerary: `buildLocationPayload()` constructs `LocationSerializer`-compatible payload (name/location/description/rating/collections/is_public) from normalized suggestion.

### Capabilities Endpoint
- `GET /api/chat/capabilities/` returns `{ "tools": [{ "name", "description" }, ...] }` from registry

## WS4-F4 Chat UI Rendering
- Travel-themed header (icon: airplane, title: `Travel Assistant` with optional collection name suffix)
- `ChatMessage` type supports `tool_results?: Array<{ name, result }>` for inline tool output
- SSE handling appends to current assistant message's `tool_results` array
- Renderer: `search_places` -> place cards, `web_search` -> linked cards, fallback -> JSON `<pre>`

## WS4-F3 Add-to-itinerary from Chat
- `search_places` card results can be added directly to itinerary when collection context exists
- Flow: date selector modal -> `POST /api/locations/` -> `POST /api/itineraries/` -> `itemAdded` event
- Coordinate guard (`hasPlaceCoordinates`) required

65
.memory/knowledge/tech-stack.md
Normal file
@@ -0,0 +1,65 @@
# Tech Stack & Development

## Stack
- **Frontend**: SvelteKit 2, TypeScript, Bun (package manager), DaisyUI + Tailwind CSS, svelte-i18n, svelte-maplibre
- **Backend**: Django REST Framework, Python, django-allauth, djmoney, django-geojson, LiteLLM, duckduckgo-search
- **Database**: PostgreSQL + PostGIS
- **Cache**: Memcached
- **Infrastructure**: Docker, Docker Compose
- **Repo**: github.com/Alex-Wiesner/voyage
- **License**: GNU GPL v3.0

## Development Commands

### Frontend (prefer Bun)
- `cd frontend && bun run format` — fix formatting (6s)
- `cd frontend && bun run lint` — check formatting (6s)
- `cd frontend && bun run check` — Svelte type checking (12s; 0 errors, 6 warnings expected)
- `cd frontend && bun run build` — build (32s)
- `cd frontend && bun install` — install deps (45s)

### Backend (Docker required; uv for local Python tooling)
- `docker compose exec server python3 manage.py test` — run tests (7s; 6/30 pre-existing failures expected)
- `docker compose exec server python3 manage.py migrate` — run migrations

### Pre-Commit Checklist
1. `cd frontend && bun run format`
2. `cd frontend && bun run lint`
3. `cd frontend && bun run check`
4. `cd frontend && bun run build`

## Environment & Configuration

### .env Loading
- **Library**: `python-dotenv==1.2.2` (in `backend/server/requirements.txt`)
- **Entry point**: `backend/server/main/settings.py` calls `load_dotenv()` at module top
- **Docker**: `docker-compose.yml` sets `env_file: .env` on all services — single root `.env` file shared
- **Root `.env`**: `/home/alex/projects/voyage/.env` — canonical for Docker Compose setups

### Settings File
- **Single file**: `backend/server/main/settings.py` (no split/environment-specific settings files)

### Server-side Env Vars (from `settings.py`)
| Var | Default | Purpose |
|---|---|---|
| `SECRET_KEY` | (required) | Django secret key |
| `GOOGLE_MAPS_API_KEY` | `""` | Google Maps integration |
| `STRAVA_CLIENT_ID` / `STRAVA_CLIENT_SECRET` | `""` | Strava OAuth |
| `FIELD_ENCRYPTION_KEY` | `""` | Fernet key for `UserAPIKey` encryption |
| `OSRM_BASE_URL` | `"https://router.project-osrm.org"` | Routing service |
| `VOYAGE_AI_PROVIDER` | `"openai"` | Instance-level default AI provider |
| `VOYAGE_AI_MODEL` | `"gpt-4o-mini"` | Instance-level default AI model |
| `VOYAGE_AI_API_KEY` | `""` | Instance-level AI API key |

### Per-User LLM API Key Pattern
LLM provider keys stored per-user in DB (`UserAPIKey` model, `integrations/models.py`):
- `UserAPIKey` table: `(user, provider)` unique pair → `encrypted_api_key` (Fernet-encrypted text field)
- `FIELD_ENCRYPTION_KEY` env var required for encrypt/decrypt
- `llm_client.get_llm_api_key(user, provider)` → user key → instance key fallback (matching provider only) → `None`
- No global server-side LLM API keys — every user must configure their own per-provider key via Settings UI (or instance admin configures fallback)

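The key-resolution chain can be sketched with a dict standing in for decrypted `UserAPIKey` rows; the four-argument signature is an assumption, since the real `get_llm_api_key(user, provider)` reads the DB and settings directly:

```python
def get_llm_api_key(user_keys, provider, instance_provider, instance_key):
    """Hypothetical stand-in for the chain above; `user_keys` is a
    {provider: key} dict representing decrypted UserAPIKey rows.

    user key -> instance key (matching provider only) -> None
    """
    key = user_keys.get(provider)
    if key:
        return key
    if provider == instance_provider and instance_key:
        return instance_key  # instance fallback never crosses providers
    return None
```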
## Known Issues
- Docker dev setup has frontend-backend communication issues (500 errors beyond homepage)
- Frontend check: 0 errors, 6 warnings expected (pre-existing in `CollectionRecommendationView.svelte` + `RegionCard.svelte`)
- Backend tests: 6/30 pre-existing failures (2 user email key errors + 4 geocoding API mocks)
- Local Python pip install fails (network timeouts) — use Docker

83
.memory/manifest.yaml
Normal file
@@ -0,0 +1,83 @@
name: voyage
version: 1
categories:
  - path: system.md
    description: One-paragraph project overview — purpose, stack, status
    group: knowledge

  - path: knowledge/overview.md
    description: Architecture overview — API proxy, AI chat, services, auth, file locations
    group: knowledge

  - path: knowledge/tech-stack.md
    description: Stack details, dev commands, environment config, env vars, known issues
    group: knowledge

  - path: knowledge/conventions.md
    description: Coding conventions — frontend/backend patterns, workflow rules
    group: knowledge

  - path: knowledge/patterns/chat-and-llm.md
    description: Chat & LLM patterns — model override, error mapping, agent tools, OpenCode Zen, context derivation
    group: knowledge

  - path: knowledge/domain/collections-and-sharing.md
    description: Collections domain — sharing architecture, itinerary planner, user preferences
    group: knowledge

  - path: knowledge/domain/ai-configuration.md
    description: AI configuration domain — WS1 infrastructure, provider catalog, frontend gaps
    group: knowledge

  - path: decisions.md
    description: Architecture Decision Records — fork rationale, tooling choices, review/test verdicts, critic gates
    group: decisions

  - path: plans/ai-travel-agent-collections-integration.md
    description: Plan for AI travel chat embedding in Collections + provider catalog (original integration)
    group: plans

  - path: plans/opencode-zen-connection-error.md
    description: Plan for OpenCode Zen connection fix — model default change, error surfacing, model selection UI
    group: plans

  - path: plans/ai-travel-agent-redesign.md
    description: Plan for AI travel agent redesign — WS1-WS6 workstreams (config, preferences, suggestions, chat UI, web search, extensibility)
    group: plans

  - path: plans/travel-agent-context-and-models.md
    description: "Plan for follow-up fixes: F1 model dropdown expansion, F2 multi-stop context, F3 itinerary-centric prompts + .results key fix (COMPLETE)"
    group: plans

  - path: plans/pre-release-and-memory-migration.md
    description: Plan for pre-release policy addition + .memory structure migration
    group: plans

  - path: research/litellm-zen-provider-catalog.md
    description: Research on LiteLLM provider catalog and OpenCode Zen API compatibility
    group: research

  - path: research/opencode-zen-connection-debug.md
    description: Debug findings for OpenCode Zen connection errors and model routing
    group: research

  - path: research/auto-learn-preference-signals.md
    description: Research on auto-learning user travel preference signals from history data
    group: research

  - path: research/provider-strategy.md
    description: "Research: multi-provider strategy — LiteLLM hardening vs replacement, retry/fallback patterns"
    group: research

  - path: sessions/continuity.md
    description: Rolling session continuity notes — last session context, active work
    group: sessions

  - path: plans/chat-provider-fixes.md
    description: "Chat provider fixes plan (COMPLETE) — chat-loop-hardening, default-ai-settings, suggestion-add-flow workstreams with full review/test records"
    group: plans

  # Deprecated (content migrated)
  - path: knowledge.md
    description: "DEPRECATED — migrated to knowledge/ nested structure. See knowledge/ files."
    group: knowledge

0
.memory/plans/.gitkeep
Normal file
108
.memory/plans/ai-travel-agent-collections-integration.md
Normal file
@@ -0,0 +1,108 @@
# Plan: AI travel agent in Collections Recommendations

## Clarified requirements
- Move AI travel agent UX from standalone `/chat` tab/page into Collections → Recommendations.
- Remove the existing `/chat` route (not keep/redirect).
- Provider list should be dynamic and display all providers LiteLLM supports.
- Ensure OpenCode Zen is supported as a provider.

## Execution prerequisites
- In each worktree, run `cd frontend && npm install` before implementation to ensure node modules (including `@mdi/js`) are present and baseline build can run.

## Decomposition (approved by user)

### Workstream 1 — Collections recommendations chat integration (Frontend + route cleanup)
- **Worktree**: `.worktrees/collections-ai-agent`
- **Branch**: `feat/collections-ai-agent`
- **Risk**: Medium
- **Quality tier**: Tier 2
- **Task WS1-F1**: Embed AI chat experience inside Collections Recommendations UI.
  - **Acceptance criteria**:
    - Chat UI is available from Collections Recommendations section.
    - Existing recommendations functionality remains usable.
    - Chat interactions continue to work with existing backend chat APIs.
- **Task WS1-F2**: Remove standalone `/chat` route/page.
  - **Acceptance criteria**:
    - `/chat` page is removed from app routes/navigation.
    - No broken imports/navigation links remain.

### Workstream 2 — Provider catalog + Zen provider support (Backend + frontend settings/chat)
- **Worktree**: `.worktrees/litellm-provider-catalog`
- **Branch**: `feat/litellm-provider-catalog`
- **Risk**: Medium
- **Quality tier**: Tier 2 (promote to Tier 1 if auth/secret handling changes)
- **Task WS2-F1**: Implement dynamic provider listing based on LiteLLM-supported providers.
  - **Acceptance criteria**:
    - Backend exposes `GET /api/chat/providers/` using LiteLLM runtime provider list as source data.
    - Frontend provider selectors consume backend provider catalog rather than hardcoded arrays.
    - UI displays all LiteLLM provider IDs and metadata; non-chat-compatible providers are labeled unavailable.
    - Existing saved provider/API-key flows still function.
- **Task WS2-F2**: Add/confirm OpenCode Zen provider support end-to-end.
  - **Acceptance criteria**:
    - OpenCode Zen appears as provider id `opencode_zen`.
    - Backend model resolution and API-key lookup work for `opencode_zen`.
    - Zen calls use LiteLLM OpenAI-compatible routing with `api_base=https://opencode.ai/zen/v1`.
    - Chat requests using Zen provider are accepted without fallback/validation failures.

## Provider architecture decision
- Backend provider catalog endpoint `GET /api/chat/providers/` is the single source of truth for UI provider options.
- Endpoint response fields: `id`, `label`, `available_for_chat`, `needs_api_key`, `default_model`, `api_base`.
- All LiteLLM runtime providers are returned; entries without model mapping are `available_for_chat=false`.
- Chat send path only accepts providers where `available_for_chat=true`.

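The catalog-merging rule above can be sketched as follows. This is a minimal stdlib-only sketch: the `CHAT_PROVIDER_CONFIG` entries shown are illustrative stand-ins, not the real backend configuration, and the LiteLLM provider list is passed in as a plain list.

```python
# Illustrative stand-in for the real CHAT_PROVIDER_CONFIG in llm_client.py.
CHAT_PROVIDER_CONFIG = {
    "openai": {"label": "OpenAI", "needs_api_key": True,
               "default_model": "gpt-4o-mini", "api_base": None},
    "opencode_zen": {"label": "OpenCode Zen", "needs_api_key": True,
                     "default_model": "openai/gpt-4o-mini",
                     "api_base": "https://opencode.ai/zen/v1"},
}

def get_provider_catalog(litellm_provider_list):
    """One entry per provider; configured chat providers missing from the
    LiteLLM runtime list (e.g. opencode_zen) are appended at the end."""
    catalog = []
    for pid in litellm_provider_list:
        cfg = CHAT_PROVIDER_CONFIG.get(pid)
        catalog.append({
            "id": pid,
            "label": cfg["label"] if cfg else pid,
            # Providers without a model mapping are listed but not chat-usable.
            "available_for_chat": cfg is not None,
            "needs_api_key": cfg["needs_api_key"] if cfg else True,
            "default_model": cfg["default_model"] if cfg else None,
            "api_base": cfg["api_base"] if cfg else None,
        })
    listed = {entry["id"] for entry in catalog}
    for pid, cfg in CHAT_PROVIDER_CONFIG.items():
        if pid not in listed:
            catalog.append({"id": pid, "available_for_chat": True, **cfg})
    return catalog
```

The same shape is what the chat send path would validate against (`available_for_chat=true` only).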
## Research findings (2026-03-08)
- LiteLLM provider enumeration is available at runtime (`litellm.provider_list`), currently 128 providers in this environment.
- OpenCode Zen is not a native LiteLLM provider alias; support should be implemented via OpenAI-compatible provider config and explicit `api_base`.
- Existing hardcoded provider duplication (backend + chat page + settings page) will be replaced by backend catalog consumption.
- Reference: [LiteLLM + Zen provider research](../research/litellm-zen-provider-catalog.md)

## Dependencies
- WS1 depends on existing chat API endpoint behavior and event streaming contract.
- WS2 depends on LiteLLM provider metadata/query capabilities and provider-catalog endpoint design.
- WS1-F1 depends on WS2 completion for dynamic provider selector integration.
- WS1-F2 depends on WS1-F1 completion.

## Human checkpoints
- No checkpoint required: Zen support path uses existing LiteLLM dependency via OpenAI-compatible API (no new SDK/service).

## Findings tracker
- WS1-F1 implemented in worktree `.worktrees/collections-ai-agent`:
  - Extracted chat route UI into reusable component `frontend/src/lib/components/AITravelChat.svelte`, preserving conversation list, message stream rendering, provider selector, conversation CRUD, and SSE send-message flow via `/api/chat/conversations/*`.
  - Updated `frontend/src/routes/chat/+page.svelte` to render the reusable component so existing `/chat` behavior remains intact for WS1-F1 scope (WS1-F2 route removal deferred).
  - Embedded `AITravelChat` into Collections Recommendations view in `frontend/src/routes/collections/[id]/+page.svelte` above `CollectionRecommendationView`, keeping existing recommendation search/map/create flows unchanged.
  - Reviewer warning resolved: removed redundant outer card wrapper around `AITravelChat` in Collections Recommendations embedding, eliminating nested card-in-card styling while preserving spacing and recommendations placement.
- WS1-F2 implemented in worktree `.worktrees/collections-ai-agent`:
  - Removed standalone chat route page by deleting `frontend/src/routes/chat/+page.svelte`.
  - Removed `/chat` navigation item from `frontend/src/lib/components/Navbar.svelte`, including the now-unused `mdiRobotOutline` icon import.
  - Verified embedded chat remains in Collections Recommendations via `AITravelChat` usage in `frontend/src/routes/collections/[id]/+page.svelte`; no remaining `/chat` route links/imports in `frontend/src`.
- WS2-F1 implemented in worktree `.worktrees/litellm-provider-catalog`:
  - Added backend provider catalog endpoint `GET /api/chat/providers/` from `litellm.provider_list` with response fields `id`, `label`, `available_for_chat`, `needs_api_key`, `default_model`, `api_base`.
  - Refactored chat provider model map into `CHAT_PROVIDER_CONFIG` in `backend/server/chat/llm_client.py` and reused it for both send-message routing and provider catalog metadata.
  - Updated chat/settings frontend provider consumers to fetch provider catalog dynamically and removed hardcoded provider arrays.
  - Chat UI now restricts provider selection/sending to `available_for_chat=true`; settings API key UI now lists full provider catalog (including unavailable-for-chat entries).
- WS2-F1 reviewer carry-forward fixes applied:
  - Fixed chat provider selection fallback timing in `frontend/src/routes/chat/+page.svelte` by computing `availableProviders` from local `catalog` response data instead of relying on reactive `chatProviders` immediately after assignment.
  - Applied low-risk settings improvement in `frontend/src/routes/settings/+page.svelte` by changing `await loadProviderCatalog()` to `void loadProviderCatalog()` in the second `onMount`, preventing provider fetch from delaying success toast logic.
- WS2-F2 implemented in worktree `.worktrees/litellm-provider-catalog`:
  - Added `opencode_zen` to `CHAT_PROVIDER_CONFIG` in `backend/server/chat/llm_client.py` with label `OpenCode Zen`, `needs_api_key=true`, `default_model=openai/gpt-4o-mini`, and `api_base=https://opencode.ai/zen/v1`.
  - Updated `get_provider_catalog()` to append configured chat providers not present in `litellm.provider_list`, ensuring OpenCode Zen appears in `GET /api/chat/providers/` even though it is an OpenAI-compatible alias rather than a native LiteLLM provider id.
  - Normalized provider IDs in `get_llm_api_key()` and `stream_chat_completion()` via `_normalize_provider_id()` to keep API-key lookup and LLM request routing consistent for `opencode_zen`.
- Consolidation completed in worktree `.worktrees/collections-ai-agent`:
  - Ported WS2 provider-catalog backend to `backend/server/chat` in the collections branch, including `GET /api/chat/providers/`, `CHAT_PROVIDER_CONFIG` metadata fields (`label`, `needs_api_key`, `default_model`, `api_base`), and chat-send validation to allow only `available_for_chat` providers.
  - Confirmed `opencode_zen` support in consolidated branch with `label=OpenCode Zen`, `default_model=openai/gpt-4o-mini`, `api_base=https://opencode.ai/zen/v1`, and API-key-required behavior.
  - Replaced hardcoded providers in `frontend/src/lib/components/AITravelChat.svelte` with dynamic `/api/chat/providers/` loading, preserving send guard to chat-available providers only.
  - Updated settings API-key provider dropdown in `frontend/src/routes/settings/+page.svelte` to load full provider catalog dynamically and added `ChatProviderCatalogEntry` type in `frontend/src/lib/types.ts`.
  - Preserved existing collections chat embedding and kept standalone `/chat` route removed (no route reintroduction in consolidation changes).

## Retry tracker
- WS1-F1: 0
- WS1-F2: 0
- WS2-F1: 0
- WS2-F2: 0

## Execution checklist
- [x] WS2-F1 Dynamic provider listing from LiteLLM (Tier 2)
- [x] WS2-F2 OpenCode Zen provider support (Tier 2)
- [x] WS1-F1 Embed AI chat into Collections Recommendations (Tier 2)
- [x] WS1-F2 Remove standalone `/chat` route (Tier 2)
- [x] Documentation coverage + knowledge sync (Librarian)

338
.memory/plans/ai-travel-agent-redesign.md
Normal file
@@ -0,0 +1,338 @@
# AI Travel Agent Redesign Plan

## Vision Summary

Redesign the AI travel agent with two context-aware entry points, user preference learning, flexible provider configuration, extensibility for future integrations, web search capability, and multi-user collection support.

---

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────────┐
│                          ENTRY POINTS                           │
├─────────────────────────────┬───────────────────────────────────┤
│ Day-Level Suggestions       │ Collection-Level Chat             │
│ (new modal)                 │ (improved Recommendations tab)    │
│ - Category filters          │ - Context-aware                   │
│ - Sub-filters               │ - Add to itinerary actions        │
│ - Add to day action         │                                   │
└─────────────────────────────┴───────────────────────────────────┘
                                │
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                           AGENT CORE                            │
├─────────────────────────────────────────────────────────────────┤
│ - LiteLLM backend (streaming SSE)                               │
│ - Tool calling (place search, web search, itinerary actions)    │
│ - Multi-user preference aggregation                             │
│ - Context injection (collection, dates, location)               │
└─────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                      CONFIGURATION LAYERS                       │
├─────────────────────────────────────────────────────────────────┤
│ Instance (.env)  →  VOYAGE_AI_PROVIDER                          │
│                  →  VOYAGE_AI_MODEL                             │
│                  →  VOYAGE_AI_API_KEY                           │
│ User (DB)        →  UserAPIKey.per-provider keys                │
│                  →  UserAISettings.model preference             │
│ Fallback: User key → Instance key → Error                       │
└─────────────────────────────────────────────────────────────────┘
```

---

## Workstreams

### WS1: Configuration Infrastructure

**Goal**: Support both instance-level and user-level provider/model configuration with proper fallback.

#### WS1-F1: Instance-level configuration
- Add env vars to `settings.py`:
  - `VOYAGE_AI_PROVIDER` (default: `openai`)
  - `VOYAGE_AI_MODEL` (default: `gpt-4o-mini`)
  - `VOYAGE_AI_API_KEY` (optional global key)
- Update `llm_client.py` to read instance defaults
- Add fallback chain: user key → instance key → error

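The WS1-F1 fallback chain can be sketched as a pure function. The names below (`resolve_api_key`, `MissingAPIKeyError`, the `user_keys` dict) are illustrative: the real lookup reads `UserAPIKey` rows and Django settings.

```python
class MissingAPIKeyError(Exception):
    """Raised when neither a user key nor an applicable instance key exists."""

def resolve_api_key(provider, user_keys, instance_provider, instance_key):
    """Fallback chain: user key -> instance key -> error.

    user_keys: per-provider keys saved by the user, e.g. {"openai": "sk-..."}.
    The instance-level key only applies when the requested provider matches
    the instance-configured provider.
    """
    key = user_keys.get(provider)
    if key:
        return key
    if instance_key and provider == instance_provider:
        return instance_key
    raise MissingAPIKeyError(f"No API key configured for provider '{provider}'")
```

A user-saved key always wins; the instance key never leaks to providers other than the instance default.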
#### WS1-F2: User-level model preferences
- Add `UserAISettings` model (OneToOne → CustomUser):
  - `preferred_provider` (CharField)
  - `preferred_model` (CharField)
- Create API endpoint: `POST /api/ai/settings/`
- Add UI in Settings → AI section for model selection

#### WS1-F3: Provider catalog enhancement
- Extend provider catalog response to include:
  - `instance_configured`: bool (has instance key)
  - `user_configured`: bool (has user key)
- Update frontend to show configuration status per provider

**Files**: `settings.py`, `llm_client.py`, `integrations/models.py`, `integrations/views/`, `frontend/src/routes/settings/`

---

### WS2: User Preference Learning

**Goal**: Capture and use user preferences in AI recommendations.

#### WS2-F1: Preference UI
- Add "AI Preferences" tab to Settings page
- Form fields: cuisines, interests, trip_style, notes
- Use tag input for cuisines/interests (better UX than free text)
- Connect to existing `/api/integrations/recommendation-preferences/`

Implementation notes (2026-03-08):
- Implemented in `frontend/src/routes/settings/+page.svelte` as `travel_preferences` section in the existing settings sidebar, with `savePreferences(event)` posting to `/api/integrations/recommendation-preferences/`.
- `interests` conversion is string↔array at the UI boundary: load via `(profile.interests || []).join(', ')`; save via `.split(',').map((s) => s.trim()).filter(Boolean)`.
- SSR preload added in `frontend/src/routes/settings/+page.server.ts` using parallel fetch with API keys; returns `props.recommendationProfile` as first list element or `null`.
- Frontend typing added in `frontend/src/lib/types.ts` (`UserRecommendationPreferenceProfile`) and i18n strings added under `settings` in `frontend/src/locales/en.json`.
- See backend capability reference in [Project Knowledge — User Recommendation Preference Profile](../knowledge.md#user-recommendation-preference-profile).

#### WS2-F2: Preference injection
- Enhance `get_system_prompt()` to format preferences better
- Add preference summary in system prompt (structured, not just appended)

#### WS2-F3: Multi-user aggregation
- New function: `get_aggregated_preferences(collection)`
- Returns combined preferences from all `collection.shared_with` users + owner
- Format: "Party preferences: User A likes X, User B prefers Y..."
- Inject into system prompt for shared collections

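The aggregation rule above can be sketched like this. The `(username, prefs)` tuple shape is an illustrative stand-in for iterating the owner plus `collection.shared_with` users and their saved preference profiles.

```python
def get_aggregated_preferences(profiles):
    """Combine per-user preferences into one party-preferences summary line.

    profiles: list of (username, preference dict) for owner + shared users.
    Users with no stated preferences are skipped.
    """
    parts = []
    for username, prefs in profiles:
        likes = ", ".join(prefs.get("cuisines", []) + prefs.get("interests", []))
        if likes:
            parts.append(f"{username} likes {likes}")
    if not parts:
        return ""
    return "Party preferences: " + "; ".join(parts) + "."
```

The resulting line is what gets appended to the system prompt for shared collections; an empty string means nothing is injected.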
**Files**: `frontend/src/routes/settings/`, `chat/llm_client.py`, `integrations/models.py`

---

### WS3: Day-Level Suggestions Modal

**Goal**: Add "Suggest" option to itinerary day "Add" dropdown with category filters.

#### WS3-F1: Suggestion modal component
- Create `ItinerarySuggestionModal.svelte`
- Two-step flow:
  1. **Category selection**: Restaurant, Activity, Event, Lodging
  2. **Filter refinement**:
     - Restaurant: cuisine type, price range, dietary restrictions
     - Activity: type (outdoor, cultural, etc.), duration
     - Event: type, date/time preference
     - Lodging: type, amenities
- "Any/Surprise me" option for each filter

#### WS3-F2: Add button integration
- Add "Get AI suggestions" option to `CollectionItineraryPlanner.svelte` Add dropdown
- Opens suggestion modal with target date pre-set
- Modal receives: `collectionId`, `targetDate`, `collectionLocation` (for context)

#### WS3-F3: Suggestion results display
- Show 3-5 suggestions as cards with:
  - Name, description, why it fits preferences
  - "Add to this day" button
  - "Add to different day" option
- On add: **direct REST API call** to `/api/itineraries/` (not agent tool)
- User must approve each item individually - no bulk/auto-add
- Close modal and refresh itinerary on success

#### WS3-F4: Backend suggestion endpoint
- New endpoint: `POST /api/ai/suggestions/day/`
- Params: `collection_id`, `date`, `category`, `filters`, `location_context`
- Returns structured suggestions (not chat, direct JSON)
- Uses agent internally but returns parsed results

**Files**: `CollectionItineraryPlanner.svelte`, `ItinerarySuggestionModal.svelte` (new), `chat/views.py`, `chat/agent_tools.py`

---

### WS3.5: Insertion Flow Clarification

**Two insertion paths exist:**

| Path | Entry Point | Mechanism | Use Case |
|------|-------------|-----------|----------|
| **User-approved** | Suggestions modal | Direct REST API call to `/api/itineraries/` | Day-level suggestions, user reviews and clicks Add |
| **Agent-initiated** | Chat (Recommendations tab) | `add_to_itinerary` tool via SSE streaming | Conversational adds when user says "add that place" |

**Why two paths:**
- Modal: Faster, simpler UX - no round-trip through agent, user stays in control
- Chat: Natural conversation flow - agent can add as part of dialogue

**No changes needed to agent tools** - `add_to_itinerary` already exists in `agent_tools.py` and works for chat-initiated adds.

---

### WS4: Collection-Level Chat Improvements

**Goal**: Make Recommendations tab chat context-aware and action-capable.

#### WS4-F1: Context injection
- Pass collection context to `AITravelChat.svelte`:
  - `collectionId`, `collectionName`, `startDate`, `endDate`
  - `destination` (from collection locations or user input)
- Inject into system prompt: "You are helping plan a trip to X from Y to Z"

Implementation notes (2026-03-08):
- `frontend/src/lib/components/AITravelChat.svelte` now exposes optional context props (`collectionId`, `collectionName`, `startDate`, `endDate`, `destination`) and includes them in the `POST /api/chat/conversations/{id}/send_message/` payload.
- `frontend/src/routes/collections/[id]/+page.svelte` now passes collection context into `AITravelChat`; destination is derived via `deriveCollectionDestination(...)` from `city/country/location/name` on the first usable location.
- `backend/server/chat/views/__init__.py::ChatViewSet.send_message()` now accepts the same optional fields, resolves `collection_id` (owner/shared access only), and appends a `## Trip Context` block to the system prompt before streaming.
- Related architecture note: [Project Knowledge — AI Chat](../knowledge.md#ai-chat-collections--recommendations).

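The `## Trip Context` injection described in the notes above can be sketched as a pair of pure helpers. Field names mirror the send-message payload; the exact block formatting here is an assumption, not the backend's verbatim output.

```python
def build_trip_context(collection_name=None, start_date=None, end_date=None,
                       destination=None):
    """Render whichever context fields are present as a markdown block."""
    lines = []
    if collection_name:
        lines.append(f"- Collection: {collection_name}")
    if destination:
        lines.append(f"- Destination: {destination}")
    if start_date and end_date:
        lines.append(f"- Dates: {start_date} to {end_date}")
    if not lines:
        return ""
    return "## Trip Context\n" + "\n".join(lines)

def inject_context(system_prompt, **ctx):
    """Append the trip-context block only when there is context to add."""
    block = build_trip_context(**ctx)
    return f"{system_prompt}\n\n{block}" if block else system_prompt
```

With no context fields the system prompt passes through unchanged, so plain (non-collection) chats are unaffected.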
#### WS4-F2: Quick action buttons
- Add preset prompts above chat input:
  - "Suggest restaurants for this trip"
  - "Find activities near [destination]"
  - "What should I pack for [dates]?"
- Pre-fill input on click

#### WS4-F3: Add-to-itinerary from chat
- When agent suggests a place, show "Add to itinerary" button
- User selects date → calls `add_to_itinerary` tool
- Visual feedback on success

Implementation notes (2026-03-08):
- Implemented in `frontend/src/lib/components/AITravelChat.svelte` as an MVP direct frontend flow (no agent round-trip):
  - Adds `Add to Itinerary` button to `search_places` result cards when `collectionId` exists.
  - Opens a date picker modal (`showDateSelector`, `selectedPlaceToAdd`, `selectedDate`) constrained by trip date range (`min={startDate}`, `max={endDate}`).
  - On confirm, creates a location via `POST /api/locations/` then creates itinerary entry via `POST /api/itineraries/`.
  - Dispatches `itemAdded { locationId, date }` and shows success toast (`added_successfully`).
  - Guards against missing/invalid coordinates by disabling add action unless lat/lon parse successfully.
- i18n keys added in `frontend/src/locales/en.json`: `add_to_itinerary`, `add_to_which_day`, `added_successfully`.

#### WS4-F4: Improved UI
- Remove generic "robot" branding, use travel-themed design
- Show collection name in header
- Better tool result display (cards instead of raw JSON)

Implementation notes (2026-03-08):
- `frontend/src/lib/components/AITravelChat.svelte` header now uses travel branding with `✈️` and renders `Travel Assistant · {collectionName}` when collection context is present; destination is shown as a subtitle when provided.
- Robot icon usage in chat UI was replaced with travel-themed emoji (`✈️`, `🌍`, `🗺️`) while keeping existing layout structure.
- SSE `tool_result` chunks are now attached to the in-flight assistant message via `tool_results` and rendered inline as structured cards for `search_places` and `web_search`, with JSON `<pre>` fallback for unknown tools.
- Legacy persisted `role: 'tool'` messages are still supported via JSON parsing fallback and use the same card rendering logic.
- i18n root keys added in `frontend/src/locales/en.json`: `travel_assistant`, `quick_actions`.

See [Project Knowledge — WS4-F4 Chat UI Rendering](../knowledge.md#ws4-f4-chat-ui-rendering).

**Files**: `AITravelChat.svelte`, `chat/views.py`, `chat/llm_client.py`

---

### WS5: Web Search Capability

**Goal**: Enable agent to search the web for current information.

#### WS5-F1: Web search tool
- Add `web_search` tool to `agent_tools.py`:
  - Uses DuckDuckGo (free, no API key) or Brave Search API (env var)
  - Returns top 5 results with titles, snippets, URLs
- Tool schema:

```python
{
    "name": "web_search",
    "description": "Search the web for current information about destinations, events, prices, etc.",
    "parameters": {
        "query": "string - search query",
        "location_context": "string - optional location to bias results"
    }
}
```

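The "top 5 results with titles, snippets, URLs" contract above can be sketched with a small shaping helper. The raw-hit field names (`title`, `body`, `href`) are an assumption in the style of the duckduckgo-search library, not a confirmed API; the network call itself is omitted.

```python
def format_web_search_results(raw_hits, limit=5):
    """Trim raw search hits to the tool's result contract.

    raw_hits: list of dicts from whichever search backend is configured
    (DuckDuckGo by default, Brave via env var per the plan).
    """
    results = []
    for hit in raw_hits[:limit]:
        results.append({
            "title": hit.get("title", ""),
            "snippet": hit.get("body", ""),
            "url": hit.get("href", ""),
        })
    return {"results": results}
```

Keeping the shaping separate from the fetch makes it easy to swap DuckDuckGo for Brave without changing what the agent sees.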
#### WS5-F2: Tool integration
- Register in `AGENT_TOOLS` list
- Add to `execute_tool()` dispatcher
- Handle rate limiting gracefully

**Files**: `chat/agent_tools.py`, `requirements.txt` (add `duckduckgo-search`)

---

### WS6: Extensibility Architecture

**Goal**: Design for easy addition of future integrations.

#### WS6-F1: Plugin tool registry
- Refactor `agent_tools.py` to use decorator-based registration:

```python
@agent_tool(name="web_search", description="...")
def web_search(query: str, location_context: str = None):
    ...
```

- Tools auto-register on import
- Easy to add new tools in separate files

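The decorator pattern above can be made runnable with a module-level registry. This is a minimal sketch; the registry shape (name → metadata/function dict) and the `echo` example tool are illustrative, not the project's actual implementation.

```python
# Module-level registry; tools register themselves on import.
AGENT_TOOL_REGISTRY = {}

def agent_tool(name, description):
    """Decorator that records a function in the registry under `name`."""
    def decorator(func):
        AGENT_TOOL_REGISTRY[name] = {"description": description, "func": func}
        return func
    return decorator

@agent_tool(name="echo", description="Example tool used for illustration only.")
def echo(text: str):
    return {"echo": text}

def execute_tool(name, **kwargs):
    """Dispatcher: look up the tool by name and invoke it."""
    entry = AGENT_TOOL_REGISTRY.get(name)
    if entry is None:
        return {"error": f"unknown tool: {name}"}
    return entry["func"](**kwargs)
```

Because registration happens at import time, adding a new tool is just defining a decorated function in any imported module, which is exactly the WS6-F2 integration-hooks story.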
#### WS6-F2: Integration hooks
- Create `chat/integrations/` directory for future:
  - `tripadvisor.py` - TripAdvisor API integration
  - `flights.py` - Flight search (Skyscanner, etc.)
  - `weather.py` - Enhanced weather data
- Each integration exports tools via decorator

#### WS6-F3: Capability discovery
- Endpoint: `GET /api/ai/capabilities/`
- Returns list of available tools/integrations
- Frontend can show "Powered by X, Y, Z" dynamically

**Files**: `chat/tools/` (new directory), `chat/agent_tools.py` (refactor)

---

## File Changes Summary

### New Files
- `frontend/src/lib/components/collections/ItinerarySuggestionModal.svelte`
- `backend/server/chat/tools/__init__.py`
- `backend/server/chat/tools/web_search.py`
- `backend/server/integrations/models.py` (add UserAISettings)
- `backend/server/integrations/views/ai_settings_view.py`

### Modified Files
- `backend/server/main/settings.py` - Add AI env vars
- `backend/server/chat/llm_client.py` - Config fallback, preference aggregation
- `backend/server/chat/views.py` - New suggestion endpoint, context injection
- `backend/server/chat/agent_tools.py` - Web search tool, refactor
- `frontend/src/lib/components/AITravelChat.svelte` - Context awareness, actions
- `frontend/src/lib/components/collections/CollectionItineraryPlanner.svelte` - Add button
- `frontend/src/routes/settings/+page.svelte` - AI preferences UI, model selection
- `frontend/src/routes/collections/[id]/+page.svelte` - Pass collection context

---

## Migration Path

1. **Phase 1 - Foundation** (WS1, WS2)
   - Configuration infrastructure
   - Preference UI
   - No user-facing changes to chat yet

2. **Phase 2 - Day Suggestions** (WS3)
   - New modal, new entry point
   - Backend suggestion endpoint
   - Can ship independently

3. **Phase 3 - Chat Improvements** (WS4, WS5)
   - Context-aware chat
   - Web search capability
   - Better UX

4. **Phase 4 - Extensibility** (WS6)
   - Plugin architecture
   - Future integration prep

---

## Decisions (Confirmed)

| Decision | Choice | Rationale |
|----------|--------|-----------|
| Web search provider | **DuckDuckGo** | Free, no API key, good enough for travel info |
| Suggestion API | **Dedicated REST endpoint** | Simpler, faster, returns JSON directly |
| Multi-user conflicts | **List all preferences** | Transparency - AI navigates differing preferences |

---

## Out of Scope

- WSGI→ASGI migration (keep current async-in-sync pattern)
- Role-based permissions (all shared users have same access)
- Real-time collaboration (WebSocket sync)
- Mobile-specific optimizations

248
.memory/plans/chat-provider-fixes.md
Normal file
@@ -0,0 +1,248 @@
|
||||
# Chat Provider Fixes
|
||||
|
||||
## Problem Statement
|
||||
The AI chat feature is broken with multiple issues:
|
||||
1. Rate limit errors from providers
|
||||
2. "location is required" errors (tool calling issue)
|
||||
3. "An unexpected error occurred while fetching trip details" errors
|
||||
4. Models not being fetched properly for all providers
|
||||
5. Potential authentication issues
|
||||
|
||||
## Root Cause Analysis
|
||||
|
||||
### Issue 1: Tool Calling Errors
|
||||
The errors "location is required" and "An unexpected error occurred while fetching trip details" come from the agent tools (`search_places`, `get_trip_details`) being called with missing/invalid parameters. This suggests:
|
||||
- The LLM is not properly understanding the tool schemas
|
||||
- Or the model doesn't support function calling well
|
||||
- Or there's a mismatch between how LiteLLM formats tools and what the model expects
|
||||
|
||||
### Issue 2: Models Not Fetched
|
||||
The `models` endpoint in `ChatProviderCatalogViewSet` only handles:
|
||||
- `openai` - uses OpenAI SDK to fetch live
|
||||
- `anthropic/claude` - hardcoded list
|
||||
- `gemini/google` - hardcoded list
|
||||
- `groq` - hardcoded list
|
||||
- `ollama` - calls local API
|
||||
- `opencode_zen` - hardcoded list
|
||||
|
||||
All other providers return `{"models": []}`.
|
||||
|
||||
### Issue 3: Authentication Flow
|
||||
1. Frontend sends request with `credentials: 'include'`
|
||||
2. Backend gets user from session
|
||||
3. `get_llm_api_key()` checks `UserAPIKey` model for user's key
|
||||
4. Falls back to `settings.VOYAGE_AI_API_KEY` if user has no key and provider matches instance default
|
||||
5. Key is passed to LiteLLM's `acompletion()`
|
||||
|
||||
Potential issues:
|
||||
- Encryption key not configured correctly
|
||||
- Key not being passed correctly to LiteLLM
|
||||
- Provider-specific auth headers not being set
|
||||
|
||||
### Issue 4: LiteLLM vs Alternatives
|
||||
Current approach (LiteLLM):
|
||||
- Single library handles all providers
|
||||
- Normalizes API calls across providers
|
||||
- Built-in error handling and retries (if configured)
|
||||
|
||||
Alternative (Vercel AI SDK):
|
||||
- Provider registry pattern with individual packages
|
||||
- More explicit provider configuration
|
||||
- Better TypeScript support
|
||||
- But would require significant refactoring (backend is Python)
|
||||
|
||||
## Investigation Tasks
|
||||
|
||||
- [ ] Test the actual API calls to verify authentication
|
||||
- [x] Check if models endpoint returns correct data
|
||||
- [x] Verify tool schemas are being passed correctly
|
||||
- [ ] Test with a known-working model (e.g., GPT-4o)
|
||||
|
||||
## Options
|
||||
|
||||

### Option A: Fix LiteLLM Integration (Recommended)

1. Add proper retry logic with `num_retries=2`
2. Add `supports_function_calling()` check before using tools
3. Expand models endpoint to handle more providers
4. Add better logging for debugging

### Option B: Replace LiteLLM with Custom Implementation

1. Use direct API calls per provider
2. More control but more maintenance
3. Significant development effort

### Option C: Hybrid Approach

1. Keep LiteLLM for providers it handles well
2. Add custom handlers for problematic providers
3. Medium effort, best of both worlds

## Status

### Completed (2026-03-09)

- [x] Implemented backend fixes for Option A:
  1. `ChatProviderCatalogViewSet.models()` now fetches OpenCode Zen models dynamically from `{api_base}/models` using the configured provider API base and user API key; returns deduplicated model ids and logs fetch failures.
  2. `stream_chat_completion()` now checks `litellm.supports_function_calling(model=resolved_model)` before sending tools and disables tools with a warning if unsupported.
  3. Added LiteLLM transient retry configuration via `num_retries=2` on streaming completions.
  4. Added request/error logging for provider/model/tool usage and API base/message count diagnostics.
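The dynamic-fetch-plus-dedupe step in item 1 can be sketched as a small pure helper. This is an illustrative sketch, not the actual implementation; `dedupe_model_ids` is a hypothetical name, and it assumes the endpoint returns an OpenAI-style `{"data": [{"id": ...}, ...]}` payload.

```python
def dedupe_model_ids(payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-style /models payload,
    de-duplicated while preserving the order the API returned them in."""
    seen: set[str] = set()
    ids: list[str] = []
    for entry in payload.get("data", []):
        # Skip non-dict entries and blank ids defensively.
        model_id = (entry.get("id") or "").strip() if isinstance(entry, dict) else ""
        if model_id and model_id not in seen:
            seen.add(model_id)
            ids.append(model_id)
    return ids
```

Keeping the helper pure (payload in, ids out) makes it trivially unit-testable without mocking HTTP.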

### Verification Results

- Models endpoint: returns 36 models from the OpenCode Zen API (was 5 hardcoded)
- Function calling check: gpt-5-nano=True, claude-sonnet-4-6=True, big-pickle=False, minimax-m2.5=False
- Syntax check: passed for both modified files
- Frontend check: 0 errors, 6 warnings (pre-existing)

### Remaining Issues (User Action Required)

- Rate limits: the free tier has limits; the user may need to upgrade or wait
- Tool calling: some models (big-pickle, minimax-m2.5) don't support function calling; tools will be disabled for these models

## Follow-up Fixes (2026-03-09)

### Clarified Behavior

- Approved preference precedence: the database-saved default provider/model beats any per-device `localStorage` override.
- Requirement: user AI preferences must be persisted through the existing `UserAISettings` backend API and applied by both the settings UI and chat send-message fallback logic.

### Planned Workstreams

- [x] `chat-loop-hardening`
  - Acceptance: invalid required-argument tool calls do not loop repeatedly, tool-error messages are not replayed back into the model history, and SSE streams terminate cleanly with a user-visible error or `[DONE]`.
  - Files: `backend/server/chat/views/__init__.py`, `backend/server/chat/agent_tools.py`, optional `backend/server/chat/llm_client.py`
  - Notes: preserve successful tool flows; stop feeding `{"error": "location is required"}` / `{"error": "query is required"}` back into the next model turn.
  - Completion (2026-03-09): Added required-argument tool-error detection in the `send_message()` streaming loop, short-circuited those tool failures with a user-visible SSE error + terminal `[DONE]`, skipped persistence/replay of those invalid tool payloads (including historical cleanup at `_build_llm_messages()`), and tightened the `search_places`/`web_search` tool descriptions to explicitly call out required non-empty args.
  - Follow-up (2026-03-09): Fixed multi-tool-call consistency by persisting/replaying only the successful prefix of `tool_calls` when a later call fails required-arg validation; `_build_llm_messages()` now trims assistant `tool_calls` to only the IDs that have kept (non-filtered) persisted tool messages.
  - Review verdict (2026-03-09): **APPROVED** (score 6). Two WARNINGs: (1) multi-tool-call orphan — when the model returns N tool calls and call K fails required-param validation, calls 1..K-1 are already persisted but call K's result is not, leaving an orphaned `tool_calls` reference in the assistant message that may cause LLM API errors on the next conversation turn; (2) `_build_llm_messages` filters tool-role error messages but does not filter/trim the corresponding assistant-message `tool_calls` array, creating the same orphan on historical replay. Both are low-likelihood (multi-tool required-param failures are rare) and gracefully degraded (next-turn errors are caught by `_safe_error_payload`). One SUGGESTION: the `get_weather` error `"dates must be a non-empty list"` does not match the `is/are required` regex and would not trigger the short-circuit (mitigated by the `MAX_TOOL_ITERATIONS` guard). Also confirms a prior pre-existing bug (`tool_iterations` never incremented) is now fixed in this changeset.

- [x] `default-ai-settings`
  - Acceptance: settings page shows default AI provider/model controls, saving persists via `UserAISettings`, chat UI initializes from saved preferences, and backend chat fallback uses saved defaults when the request payload omits provider/model.
  - Files: `frontend/src/routes/settings/+page.server.ts`, `frontend/src/routes/settings/+page.svelte`, `frontend/src/lib/types.ts`, `frontend/src/lib/components/AITravelChat.svelte`, `backend/server/chat/views/__init__.py`
  - Notes: DB-saved defaults override browser-local model prefs.

### Completion Note (2026-03-09)

- Implemented DB-backed default AI settings end-to-end: the settings page now loads/saves `UserAISettings` via `/api/integrations/ai-settings/`, with provider/model selectors powered by the provider catalog + per-provider models endpoint.
- Chat initialization now treats saved DB defaults as the authoritative initial provider/model; stale `voyage_chat_model_prefs` localStorage values no longer override defaults and are synchronized to the saved defaults.
- Backend `send_message` now uses saved `UserAISettings` only when the request payload omits provider/model, preserving explicit request values and existing provider validation behavior.
- Follow-up fix: the backend model fallback now only applies `preferred_model` when the resolved provider matches `preferred_provider`, preventing cross-provider default model mismatches when users explicitly choose another provider.

- [x] `suggestion-add-flow`
  - Acceptance: day suggestions use the user-configured/default provider/model instead of hardcoded OpenAI values, and adding a suggested place creates a location plus itinerary entry successfully.
  - Files: `backend/server/chat/views/day_suggestions.py`, `frontend/src/lib/components/collections/ItinerarySuggestionModal.svelte`
  - Notes: normalize suggestion payloads needed by `/api/locations/` and preserve existing add-item event wiring.
  - Completion (2026-03-09): Day suggestions now resolve provider/model in precedence order (request payload → `UserAISettings` defaults → instance/provider defaults) without OpenAI hardcoding; the modal now normalizes suggestion objects and builds stable `/api/locations/` payloads (name/location/description/rating) before dispatching the existing `addItem` flow.
  - Follow-up (2026-03-09): Removed the remaining OpenAI-specific `gpt-4o-mini` fallback from the day suggestions LLM call; the endpoint now uses the provider-resolved/default model only and fails safely when no model is configured.
  - Follow-up (2026-03-09): Removed unsupported `temperature` from day suggestions requests, normalized bare `opencode_zen` model ids through the gateway (`openai/<model>`), and switched day suggestions error responses to the same sanitized categories used by chat. Browser result: the suggestion modal now completes normally (empty-state or rate-limit message) instead of crashing with a generic 500.

## Tester Validation — `default-ai-settings` (2026-03-09)

### STATUS: PASS

**Evidence from lead:** Authenticated POST `/api/integrations/ai-settings/` returned 200 and persisted; a subsequent GET returned the same values; POST `/api/chat/conversations/{id}/send_message/` with no provider/model in the body used `preferred_provider='opencode_zen'` and `preferred_model='gpt-5-nano'` from the DB, producing a valid SSE stream.

**Standard pass findings:**
- `UserAISettings` model, serializer, and `UserAISettingsViewSet` are correct. Upsert logic in `perform_create` handles first-write and update-in-place correctly (single row per user via OneToOneField).
- `list()` returns `[serializer.data]` (wrapped array), which the frontend expects as `settings[0]` — contract matches.
- Backend `send_message` precedence: `requested_provider` → `preferred_provider` (if available) → `"openai"` fallback. `model` only inherits `preferred_model` when `provider == preferred_provider` — cross-provider default mismatch is correctly prevented (follow-up fix confirmed).
- Settings page initializes `defaultAiProvider`/`defaultAiModel` from SSR-loaded `aiSettings` and validates against the provider catalog on `onMount`. If the saved provider is no longer configured, it falls back to the first configured provider.
- `AITravelChat.svelte` fetches AI settings on mount, applies them as the authoritative default, and writes to `localStorage` (sync direction is DB → localStorage, not the reverse).
- The frontend `send_message` handler always sends the current UI `selectedProvider`/`selectedModel`, not localStorage values directly — those are only used for UI state initialization, not for bypassing DB defaults.
- All i18n keys present in `en.json`: `default_ai_settings_title`, `default_ai_settings_desc`, `default_ai_no_providers`, `default_ai_save`, `default_ai_settings_saved`, `default_ai_settings_error`, `default_ai_provider_required`.
- Django integration tests (5/5) pass; no tests exist for `UserAISettings` specifically — residual regression risk noted.

**Adversarial pass findings (no hypothesis found a bug):**

1. **Hypothesis: model saved for provider A silently applied when the user explicitly sends provider B (cross-provider model leak).** Checked `send_message` lines 218–220: `model = requested_model; if model is None and preferred_model and provider == preferred_provider: model = preferred_model`. When `requested_provider=B` and `preferred_provider=A`, `provider == preferred_provider` is false → `model` stays `None`. **Not vulnerable.**

2. **Hypothesis: null/empty preferred_model or preferred_provider in the DB triggers an error.** The serializer allows `null` on both fields (CharField with `blank=True, null=True`). The backend normalizes with `.strip().lower()` inside the `(ai_settings.preferred_provider or "").strip().lower()` guard. The frontend uses `?? ''` coercion. **Handled safely.**

3. **Hypothesis: a second POST to `/api/integrations/ai-settings/` creates a second row instead of updating.** `UserAISettings` uses `OneToOneField(user, ...)` + `perform_create` explicitly fetches and updates the existing row. A second POST cannot produce a duplicate. **Not vulnerable.**

4. **Hypothesis: initializeDefaultAiSettings silently overwrites the saved DB provider with the first catalog provider if the saved provider is temporarily unavailable (e.g., API key deleted).** Confirmed: lines 119–121 do silently auto-select the first available provider and blank the model if the saved provider is gone. This affects display only (not the DB); the save action is still explicit. **Acceptable behavior; low risk.**

5. **Hypothesis: the frontend sends `model: undefined` (vs `model: null`) when no model is selected, causing the backend to ignore it.** `requested_model = (request.data.get("model") or "").strip() or None` — if `undefined`/absent from the JSON body, `get("model")` returns `None`, which stays `None` after the guard. The `model` variable falls through to the default logic. **Works correctly.**

**MUTATION_ESCAPES: 1/8** — the regex `(is|are) required` in `_is_required_param_tool_error` (chat-loop-hardening code) would escape if a future required-arg error used a different pattern, but this is unrelated to the `default-ai-settings` scope.

**Zero automated test coverage for `UserAISettings` CRUD + precedence logic.** Backend logic is covered only by the lead's live-run evidence. Recommended follow-up: add a Django TestCase covering (a) upsert idempotency, (b) provider/model precedence in `send_message`, (c) the cross-provider model guard.

## Tester Validation — `chat-loop-hardening` (2026-03-09)

### STATUS: PASS

**Evidence from lead (runtime):** Authenticated POST to `send_message` with a patched upstream stream emitting `search_places {}` (missing required `location`) returned status 200, SSE body `data: {"tool_calls": [...]}` → `data: {"error": "...", "error_category": "tool_validation_error"}` → `data: [DONE]`. Persisted DB state after that turn: only `('user', None, 'restaurants please')` + `('assistant', None, '')` — no invalid `role=tool` error row.
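The terminal SSE frame sequence observed in the evidence can be sketched as a generator. This is a hedged sketch: `short_circuit_tool_error_events` is a hypothetical name, and the exact frame payload shape is assumed from the evidence above, not copied from the implementation.

```python
import json


def short_circuit_tool_error_events(tool_calls: list, error_text: str):
    """Yield the terminal SSE frames for a required-arg tool failure:
    the tool_calls echo, a sanitized error event, then [DONE]."""
    yield f"data: {json.dumps({'tool_calls': tool_calls})}\n\n"
    payload = {"error": error_text, "error_category": "tool_validation_error"}
    yield f"data: {json.dumps(payload)}\n\n"
    # Terminal sentinel: the frontend stops reading the stream here.
    yield "data: [DONE]\n\n"
```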

**Standard pass findings:**

- `_is_required_param_tool_error`: correctly matches `location is required`, `query is required`, `collection_id is required`, `collection_id, name, latitude, and longitude are required`, `latitude and longitude are required`. Does NOT match non-required-arg errors (`dates must be a non-empty list`, `Trip not found`, `Unknown tool: foo`, etc.). All 18 test cases pass.
- `_is_required_param_tool_error_message_content`: correctly parses JSON-wrapped content from persisted DB rows and delegates to the above. Handles non-JSON, non-dict JSON, and `error: null` safely. All 7 test cases pass.
- Orphan trimming in `_build_llm_messages`: when an assistant message has `tool_calls=[A, B]` and B's persisted tool row contains a required-param error, the rebuilt `assistant.tool_calls` retains only `[A]` and tool B's row is filtered. Verified for both the multi-tool case and the single-tool (lead's runtime) scenario.
- SSE stream terminates with `data: [DONE]` immediately after the `tool_validation_error` event — confirmed by the code path at lines 425–426, which `return`s the generator.
- `MAX_TOOL_ITERATIONS = 10` correctly set; the `tool_iterations` counter is incremented on each tool iteration (pre-existing bug confirmed fixed).
- `_merge_tool_call_delta` handles `None`, `[]`, missing `index`, and malformed argument JSON without crashing.
- Full Django test suite: 24/30 pass; 6/30 fail (all pre-existing: 2 user email key errors + 4 geocoding API mock errors). Zero regressions introduced by this changeset.
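One plausible shape of the matcher described above, consistent with the listed positive/negative cases, is a `fullmatch` over a restricted character class. This is an assumption-labeled sketch, not the actual pattern from `_is_required_param_tool_error`; the character class `[\w, ]` is what makes the semicolon/newline injections fail.

```python
import re

# Allows "location is required" and comma-joined lists such as
# "collection_id, name, latitude, and longitude are required".
_REQUIRED_PARAM_RE = re.compile(r"[\w, ]+ (?:is|are) required\.?")


def is_required_param_tool_error(error_text) -> bool:
    """True only when the entire string is a required-argument error.
    fullmatch (not search) rejects injected suffixes like '; rm -rf /'."""
    if not isinstance(error_text, str):
        return False
    return _REQUIRED_PARAM_RE.fullmatch(error_text.strip()) is not None
```

As noted in the review, a message like `"dates must be a non-empty list"` intentionally does not match and relies on the iteration cap instead.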

**Adversarial pass findings:**

1. **Hypothesis: `get_weather` with empty `dates=[]` bypasses the short-circuit and loops.** `get_weather` returns `{"error": "dates must be a non-empty list"}`, which does NOT match the `is/are required` regex → not short-circuited. Falls through to the `MAX_TOOL_ITERATIONS` guard (10 iterations max). **Known gap, mitigated by guard — matches the reviewer WARNING.**

2. **Hypothesis: regex injection via crafted error text creates a false-positive short-circuit.** Tested `'x is required; rm -rf /'` (semicolon breaks `fullmatch`), newline injection, and a Cyrillic lookalike. All return `False` correctly. **Not vulnerable.**

3. **Hypothesis: `assistant.tool_calls=[]` (empty list) pollutes rebuilt messages.** `filtered_tool_calls` is `[]` → the `if filtered_tool_calls:` guard prevents an empty `tool_calls` key from being added to the payload. **Not vulnerable.**

4. **Hypothesis: tool message `content = None` is incorrectly classified as a required-param error.** `_is_required_param_tool_error_message_content(None)` returns `False` (not a string → returns early). **Not vulnerable.**

5. **Hypothesis: `_build_required_param_error_event` crashes on a None/missing `result`.** `result.get("error")` is guarded by `if isinstance(result, dict)` in the caller; the static method itself handles a `None` result via the `isinstance` check and produces `error=""`. **No crash.**

6. **Hypothesis: multi-tool scenario — only a partial `tool_calls` prefix is trimmed correctly.** Tested an assistant message with `[A, B]` where A succeeds and B fails: rebuilt messages contain `tool_calls=[A]` only. Tested an assistant message with only `[X]` failing: rebuilt messages contain `tool_calls=None` (key absent). **Both correct.**
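The orphan-trimming behavior probed in hypotheses 3 and 6 can be sketched as a small helper. This is an illustrative sketch under the behavior described above; `trim_orphan_tool_calls` and its signature are hypothetical, not the real `_build_llm_messages` internals.

```python
def trim_orphan_tool_calls(assistant_msg: dict, kept_tool_ids: set) -> dict:
    """Drop tool_calls whose tool-result rows were filtered out, and remove
    the key entirely when nothing survives, so replayed history never carries
    an orphaned tool_calls reference."""
    calls = assistant_msg.get("tool_calls") or []
    kept = [call for call in calls if call.get("id") in kept_tool_ids]
    trimmed = {key: value for key, value in assistant_msg.items() if key != "tool_calls"}
    if kept:  # an empty list must not produce a tool_calls key
        trimmed["tool_calls"] = kept
    return trimmed
```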

**MUTATION_ESCAPES: 1/7** — `get_weather` returning `"dates must be a non-empty list"` not triggering the short-circuit. This is a known, reviewed, accepted gap (mitigated by `MAX_TOOL_ITERATIONS`). No other mutation checks escaped detection.

**FLAKY: 0**

**COVERAGE: N/A** — no automated test suite exists for the `chat` app; all validation is via unit-level method tests + the lead's live-run evidence. Recommended follow-up: add a Django `TestCase` for the `send_message` streaming loop covering (a) single required-arg tool failure → short-circuit, (b) multi-tool partial success, (c) `MAX_TOOL_ITERATIONS` exhaustion, (d) `_build_llm_messages` orphan-trimming round-trip.

## Tester Validation — `suggestion-add-flow` (2026-03-09)

### STATUS: PASS

**Test run:** 30 Django tests (24 pass, 6 fail — all 6 pre-existing: 2 user email key errors + 4 geocoding mock failures). Zero new regressions. 44 targeted unit-level checks (42 pass, 2 fail — both failures confirmed as test-script defects, not code bugs).

**Standard pass findings:**

- `_resolve_provider_and_model` precedence verified end-to-end: explicit request payload → `UserAISettings.preferred_provider/model` → `settings.VOYAGE_AI_PROVIDER/MODEL` → provider-config default. All 4 precedence levels tested and confirmed correct.
- Cross-provider model guard confirmed: when the request provider ≠ `preferred_provider`, the `preferred_model` is NOT applied (prevents `gpt-5-nano` from leaking to anthropic, etc.).
- Null/empty `preferred_provider`/`preferred_model` in `UserAISettings` handled safely (`or ""` coercion guards throughout).
- JSON parsing in `_get_suggestions_from_llm` is robust: handles a clean JSON array, embedded JSON in prose, markdown-wrapped JSON, plain text (no JSON), empty string, and `None` content — all return correct results (empty list or parsed list). Response capped at 5 items. A single-dict LLM response is wrapped in a list correctly.
- `normalizeSuggestionItem` normalization verified: non-dict returns `null`, missing name+location returns `null`, field aliases (`title`→`name`, `address`→`location`, `summary`→`description`, `score`→`rating`, `whyFits`→`why_fits`) all work. A whitespace-only name falls back to location.
- `rating=0` correctly preserved in TypeScript via `??` (nullish coalescing at line 171), not dropped. The Python port used `or`, which drops `0`, but that is a test-script defect only.
- `buildLocationPayload` constructs a valid `LocationSerializer`-compatible payload: `name`, `location`, `description`, `rating`, `collections`, `is_public`. Falls back to the collection location when the suggestion has none.
- `handleAddSuggestion` → POST `/api/locations/` → `dispatch('addItem', {type:'location', itemId, updateDate:false})` wiring confirmed by code inspection (lines 274–294). The parent `CollectionItineraryPlanner` handler at line 2626 calls `addItineraryItemForObject`.
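The `rating=0` pitfall flagged above (nullish coalescing vs truthiness) is easy to reproduce in Python, where `or` drops `0` exactly the way the defective test-script port did. This sketch mirrors the TypeScript `item.rating ?? item.score` behavior; `pick_rating` is a hypothetical helper, not project code.

```python
import math


def pick_rating(rating, score):
    """Mirror TS nullish coalescing: prefer `rating` unless it is None.
    Using `rating or score` here would wrongly discard a legitimate 0."""
    value = rating if rating is not None else score
    # Finite-number check: reject non-numeric values, NaN, and infinities.
    if isinstance(value, (int, float)) and math.isfinite(value):
        return float(value)
    return None
```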

**Adversarial pass findings:**

1. **Hypothesis: cross-provider model leak (gpt-5-nano applied to anthropic).** Tested `request.provider=anthropic` + `UserAISettings.preferred_provider=opencode_zen`, `preferred_model=gpt-5-nano`. Result: `model_from_user_defaults=None` (because `provider != preferred_provider`). **Not vulnerable.**

2. **Hypothesis: null/empty DB prefs cause exceptions.** `preferred_provider=None`, `preferred_model=None` — all guards use the `(value or "").strip()` pattern. Falls through to `settings.VOYAGE_AI_PROVIDER` safely. **Not vulnerable.**

3. **Hypothesis: all-None provider/model/settings causes an exception in `_resolve_provider_and_model`.** Tested with `is_chat_provider_available=False` everywhere and all settings None. Returns `(None, None)` without exception; the caller checks `is_chat_provider_available(provider)` and returns 503. **Not vulnerable.**

4. **Hypothesis: a missing API key causes a silent empty result instead of an error.** `get_llm_api_key` returns `None` → raises `ValueError("No API key available")` → caught by the `post()` try/except → returns 500. **Explicit error path confirmed.**

5. **Hypothesis: no model configured causes a silent failure.** `model=None` + empty `provider_config` → raises `ValueError("No model configured for provider")` → 500. **Explicit error path confirmed.**

6. **Hypothesis: `normalizeSuggestionItem` with a mixed array (nulls, strings, invalid dicts).** `[None, {name:'A'}, 'string', {description:'only'}, {name:'B'}]` → after normalize+filter: 2 valid items. **Correct.**

7. **Hypothesis: rating=0 dropped by a falsy check.** The actual TS uses `item.rating ?? item.score` (nullish coalescing, not `||`). `normalizeRating(0)` returns `0` (finite-number check). **Not vulnerable in the actual code.**

8. **Hypothesis: XSS in the name field.** `<script>alert(1)</script>` passes through as a string; the Django serializer stores it as text, and template rendering escapes it. **Not vulnerable.**

9. **Hypothesis: double-clicking `handleAddSuggestion` creates a duplicate location.** The `isAdding` guard at line 266 exits early if `isAdding` is truthy — prevents re-entrancy. **Protected by UI-state guard.**

**Known low-severity defect (pre-existing, not introduced by this workstream):** LLM-generated `name`/`location` fields are not truncated before passing to `LocationSerializer` (max_length=200). If the LLM returns a name > 200 chars, the POST to `/api/locations/` returns 400 and the frontend shows a generic error. Risk is very low in practice (LLM names are short). Recommended fix: add `.slice(0, 200)` in `buildLocationPayload` for the `name` and `location` fields.

**MUTATION_ESCAPES: 1/9** — `rating=0` would escape mutation detection in naive Python tests (but is correctly handled in the actual TS `??` code). No logic mutations escape in the backend Python code.

**FLAKY: 0**

**COVERAGE: N/A** — no automated suite for the `chat` or `suggestions` app. All validation is via unit-level method tests + provider/model resolution checks. Recommended follow-up: add a Django `TestCase` for `DaySuggestionsView.post()` covering (a) missing required fields → 400, (b) invalid category → 400, (c) unauthorized collection → 403, (d) provider unavailable → 503, (e) LLM exception → 500, (f) happy path → 200 with a `suggestions` array.

**Cleanup required:** Two test artifact files left on the host (not git-tracked, safe to delete):
- `/home/alex/projects/voyage/test_suggestion_flow.py`
- `/home/alex/projects/voyage/suggestion-modal-error-state.png`

401
.memory/plans/opencode-zen-connection-error.md
Normal file
@@ -0,0 +1,401 @@

# Plan: Fix OpenCode Zen connection errors in AI travel chat

## Clarified requirements
- User configured provider `opencode_zen` in Settings with an API key.
- Chat attempts return a generic connection error.
- Goal: identify the root cause and implement a reliable fix for OpenCode Zen chat connectivity.
- Follow-up: add model selection in the chat composer (instead of a forced default model) and persist the chosen model per user.

## Acceptance criteria
- Sending a chat message with provider `opencode_zen` no longer fails with a connection error due to Voyage integration/configuration.
- Backend provider routing for `opencode_zen` uses a validated OpenAI-compatible request shape and model format.
- Frontend surfaces backend/provider errors with actionable detail (not only a generic connection failure) when available.
- Validation commands run successfully (or with known project-expected failures only) and results are recorded.

## Tasks
- [ ] Discovery: inspect current OpenCode Zen provider configuration and chat request pipeline (Agent: explorer)
- [ ] Discovery: verify OpenCode Zen API compatibility requirements vs current implementation (Agent: researcher)
- [ ] Discovery: map model-selection edit points and persistence path (Agent: explorer)
- [x] Implement fix for root cause + model selection/persistence (Agent: coder)
- [x] Correctness review of targeted changes (Agent: reviewer) — APPROVED (score 0)
- [x] Standard validation run and targeted chat-path checks (Agent: tester)
- [x] Documentation and knowledge sync for provider troubleshooting notes (Agent: librarian)

## Researcher findings

**Root cause**: Two mismatches in `backend/server/chat/llm_client.py` lines 59-64:

1. **Invalid model ID** — `default_model: "openai/gpt-4o-mini"` does not exist on OpenCode Zen. Zen has its own model catalog (gpt-5-nano, glm-5, kimi-k2.5, etc.). Sending `gpt-4o-mini` to the Zen API results in a model-not-found error.
2. **Endpoint routing** — GPT models on Zen use the `/responses` endpoint, but LiteLLM's `openai/` prefix routes through the OpenAI Python client, which appends `/chat/completions`. The `/chat/completions` endpoint only works for OpenAI-compatible models (GLM, Kimi, MiniMax, Qwen, Big Pickle).

**Error flow**: LiteLLM exception → caught by the generic handler at line 274 → yields `"An error occurred while processing your request"` SSE → frontend shows either this message or falls back to `$t('chat.connection_error')`.

**Recommended fix** (primary — `llm_client.py:62`):
- Change `"default_model": "openai/gpt-4o-mini"` → `"openai/gpt-5-nano"` (free model, confirmed to work via `/chat/completions` by real-world usage in multiple repos)

**Secondary fix** (error surfacing — `llm_client.py:274-276`):
- Extract meaningful error info from LiteLLM exceptions (status_code, message) instead of swallowing all details into a generic message

Full analysis: [research/opencode-zen-connection-debug.md](../research/opencode-zen-connection-debug.md)
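The secondary fix can be sketched as a mapping from a caught provider exception to a user-safe payload. This is a hedged sketch: `sanitize_llm_error` is a hypothetical helper, and reading the upstream HTTP status via a `status_code` attribute is an assumption about how LiteLLM exceptions typically expose it, checked defensively with `getattr`.

```python
def sanitize_llm_error(exc: Exception) -> dict:
    """Map a provider exception to a user-safe SSE error payload without
    leaking API keys or raw tracebacks to the client."""
    status = getattr(exc, "status_code", None)  # assumption: LiteLLM-style attr
    if status == 401:
        category, message = "auth_error", "Authentication with the provider failed. Check your API key."
    elif status == 404:
        category, message = "model_not_found", "The requested model is not available on this provider."
    elif status == 429:
        category, message = "rate_limited", "The provider is rate-limiting requests. Try again later."
    else:
        category, message = "provider_error", "The AI provider returned an error."
    return {"error": message, "error_category": category, "status_code": status}
```

The full exception details would still be logged server-side; only the sanitized dict reaches the SSE stream.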

## Retry tracker
- OpenCode Zen connection fix task: 0

## Implementation checkpoint (coder)

- Added composer-level model selection + per-provider browser persistence in `frontend/src/lib/components/AITravelChat.svelte` using localStorage key `voyage_chat_model_prefs`.
- Added `chat.model_label` and `chat.model_placeholder` i18n keys in `frontend/src/locales/en.json`.
- Extended `send_message` backend intake in `backend/server/chat/views.py` to read optional `model` (`empty -> None`) and pass it to streaming.
- Updated `backend/server/chat/llm_client.py` to:
  - switch the `opencode_zen` default model to `openai/gpt-5-nano`,
  - accept an optional `model` override in `stream_chat_completion(...)`,
  - apply a safe provider/model compatibility guard (skip the strict prefix check for custom `api_base` gateways),
  - map known LiteLLM exception classes to sanitized user-safe error categories/messages,
  - include `tools` / `tool_choice` kwargs only when tools are present.

See related analysis in [research notes](../research/opencode-zen-connection-debug.md#model-selection-implementation-map).

---

## Explorer findings (model selection)

**Date**: 2026-03-08
**Full detail**: [research/opencode-zen-connection-debug.md — Model selection section](../research/opencode-zen-connection-debug.md#model-selection-implementation-map)

### Persistence decision: `localStorage` (no migration)

**Recommended**: store `{ [provider_id]: model_string }` in `localStorage` key `voyage_chat_model_prefs`.

Rationale:
- No existing per-user model preference field anywhere in the DB/API
- Adding a DB column to `CustomUser` requires a migration + serializer + API change → 4+ files
- `UserAPIKey` stores only encrypted API keys (not preferences)
- Model preference is UI-volatile (the model catalog changes; stale DB entries require cleanup)
- `localStorage` is already used elsewhere in the frontend for similar ephemeral UI state
- Model preference is not sensitive; persisting client-side is consistent with how the provider selector already works (no backend persistence either)
- **No migration required** for the localStorage approach

### File-by-file edit plan (exact symbols)

#### Backend: `backend/server/chat/llm_client.py`
- `stream_chat_completion(user, messages, provider, tools=None)` → add a `model: str | None = None` parameter
- Line 226: `"model": provider_config["default_model"]` → `"model": model or provider_config["default_model"]`
- Add validation: if `model` is not `None`, check that it starts with a valid LiteLLM provider prefix (or matches a known-safe pattern); reject bare model strings that don't include a provider prefix

#### Backend: `backend/server/chat/views.py`
- `send_message()` (line 104): extract `model = (request.data.get("model") or "").strip() or None`
- Pass `model=model` to the `stream_chat_completion()` call (line 144)
- Add validation: if `model` is provided, confirm it belongs to the same provider family (prefix check); return 400 on mismatch
|
||||
|
||||
#### Frontend: `frontend/src/lib/types.ts`
|
||||
- No change needed — `ChatProviderCatalogEntry.default_model` already exists
|
||||
|
||||
#### Frontend: `frontend/src/lib/components/AITravelChat.svelte`
|
||||
- Add `let selectedModel: string = ''` (reset when provider changes)
|
||||
- Add reactive: `$: selectedProviderEntry = chatProviders.find(p => p.id === selectedProvider) ?? null`
|
||||
- Add reactive: `$: { if (selectedProviderEntry) { selectedModel = loadModelPref(selectedProvider) || selectedProviderEntry.default_model || ''; } }`
|
||||
- `sendMessage()` line 121: body `{ message: msgText, provider: selectedProvider }` → `{ message: msgText, provider: selectedProvider, model: selectedModel }`
|
||||
- Add model input field in the composer toolbar (near provider `<select>`, line 290-299): `<input type="text" class="input input-bordered input-sm" bind:value={selectedModel} placeholder={selectedProviderEntry?.default_model ?? ''} />`
|
||||
- Add `loadModelPref(provider)` / `saveModelPref(provider, model)` functions using `localStorage` key `voyage_chat_model_prefs`
|
||||
- Add `$: saveModelPref(selectedProvider, selectedModel)` reactive to persist on change
|
||||
|
||||
#### Frontend: `frontend/src/locales/en.json`
|
||||
- Add `"chat.model_label"`: `"Model"` (label for model input)
|
||||
- Add `"chat.model_placeholder"`: `"Default model"` (placeholder when empty)
|
||||
|
||||
### Validation constraints / risks
|
||||
|
||||
1. **Model-provider prefix mismatch**: `stream_chat_completion` uses `provider_config["default_model"]` prefix to route via LiteLLM. If user passes `openai/gpt-5-nano` for the `anthropic` provider, LiteLLM will try to call OpenAI with Anthropic credentials. Backend must validate that the supplied model string starts with the expected provider prefix or reject it.
|
||||
2. **Free-text model field**: No enumeration from backend; user types any string. Validation (prefix check) is the only guard.
|
||||
3. **localStorage staleness**: If a provider removes a model, the stored preference produces a LiteLLM error — the error surfacing fix (Fix #2 in existing plan) makes this diagnosable.
|
||||
4. **Empty string vs null**: Frontend should send `model: selectedModel || undefined` (omit key if empty) to preserve backend default behavior.
|
||||
### No migration required

All backend changes are parameter additions to existing function signatures plus optional request-field parsing. No DB schema changes.

---

## Explorer findings

**Date**: 2026-03-08
**Detail**: Full trace in [research/opencode-zen-connection-debug.md](../research/opencode-zen-connection-debug.md)

### End-to-end path (summary)

```
AITravelChat.svelte:sendMessage()
POST /api/chat/conversations/<id>/send_message/ { message, provider:"opencode_zen" }
→ +server.ts:handleRequest() [CSRF refresh + proxy, SSE passthrough lines 94-98]
→ views.py:ChatViewSet.send_message() [validates provider, saves user msg]
→ llm_client.py:stream_chat_completion() [builds kwargs, calls litellm.acompletion]
→ litellm.acompletion(model="openai/gpt-4o-mini", api_base="https://opencode.ai/zen/v1")
→ POST https://opencode.ai/zen/v1/chat/completions ← FAILS: model not on Zen
→ except Exception at line 274 → data:{"error":"An error occurred..."}
← frontend shows error string inline (or "Connection error." on network failure)
```

### Ranked root causes confirmed by code trace

1. **[CRITICAL] Wrong default model** (`openai/gpt-4o-mini` is not a Zen model)
   - `backend/server/chat/llm_client.py:62`
   - Fix: change to `"openai/gpt-5-nano"` (free, confirmed OpenAI-compatible via `/chat/completions`)

2. **[SIGNIFICANT] Generic exception handler masks provider errors**
   - `backend/server/chat/llm_client.py:274-276`
   - Bare `except Exception:` swallows LiteLLM's structured exceptions (`NotFoundError`, `AuthenticationError`, etc.)
   - Fix: extract `exc.status_code` / `exc.message` and forward them to the SSE error payload

3. **[SIGNIFICANT] WSGI + per-request event loop for async LiteLLM**
   - Backend runs **Gunicorn WSGI** (`supervisord.conf:11`); no ASGI entry point exists
   - `views.py:66-76` `_async_to_sync_generator` creates `asyncio.new_event_loop()` per request
   - LiteLLM's httpx sessions may not be compatible with a new loop per call, a potential source of connection errors on the second and later tool iterations
   - Fix: wrap via `asyncio.run()` or migrate to ASGI (uvicorn)

4. **[MINOR] `tool_choice: None` / `tools: None` passed as kwargs when unused**
   - `backend/server/chat/llm_client.py:227-229`
   - Fix: conditionally include the keys only when tools are present

5. **[MINOR] Synchronous ORM call inside async generator**
   - `backend/server/chat/llm_client.py:217`: `get_llm_api_key()` calls `UserAPIKey.objects.get()` synchronously
   - Fine under WSGI with a fresh event loop, but technically incorrect in an async context
   - Fix: wrap with `sync_to_async` or move the key lookup before the async boundary
### Minimal edit points for a fix

| Priority | File | Location | Change |
|---|---|---|---|
| 1 (required) | `backend/server/chat/llm_client.py` | line 62 | `"default_model": "openai/gpt-5-nano"` |
| 2 (recommended) | `backend/server/chat/llm_client.py` | lines 274-276 | Extract `exc.status_code`/`exc.message` for a user-facing error |
| 3 (recommended) | `backend/server/chat/llm_client.py` | lines 225-234 | Only include `tools`/`tool_choice` keys when tools are provided |

---

## Critic gate

**VERDICT**: APPROVED
**Date**: 2026-03-08
**Reviewer**: critic agent

### Rationale

The plan is well scoped, targets a verified root cause with clear code references, and all three changes sit in a single file (`llm_client.py`) on the same request path. This is one coherent bug fix, not a multi-feature plan; no decomposition is required.

### Assumption challenges

1. **`gpt-5-nano` validity on Zen**: the researcher claims this model is confirmed via GitHub usage patterns, but there is no live API verification. The risk is mitigated by Fix #2 (error surfacing), which would make any remaining model mismatch immediately diagnosable. **Accepted with guardrail**: the coder must add a code comment noting the model was chosen based on research, and the tester must verify the error path produces a meaningful message if the model is still wrong.

2. **`@mdi/js` build failure is NOT a baseline issue**: `@mdi/js` is a declared dependency in `package.json:44`, but `node_modules/` is absent in this worktree. Running `bun install` will resolve this. **Guardrail**: the coder must run `bun install` before the validation pipeline; do not treat this as a known/accepted failure.

3. **Error surfacing may leak sensitive info**: forwarding raw `exc.message` from LiteLLM exceptions could expose `api_base` URLs, internal config, or partial request data. A prior security review (decisions.md:103) already flagged `api_base` leakage as unnecessary. **Guardrail**: the error-surfacing fix must sanitize exception messages; use only `exc.status_code` and a generic category (e.g. "authentication error", "model not found", "rate limit exceeded"), NOT raw `exc.message`. Map known LiteLLM exception types to safe user-facing descriptions.

### Scope guardrails for implementation

1. **In scope**: Fixes #1, #2, #3 from the plan table (model name, error surfacing, tool_choice cleanup), all in `backend/server/chat/llm_client.py`.
2. **Out of scope**: Fix #3 from the Explorer findings (WSGI→ASGI migration) and Fix #5 (sync_to_async ORM). These are structural improvements, not root-cause fixes.
3. **No frontend changes** unless the error message format requires corresponding updates to `AITravelChat.svelte` parsing; verify and include only if needed.
4. **Error surfacing must sanitize**: map LiteLLM exception classes (`NotFoundError`, `AuthenticationError`, `RateLimitError`, `BadRequestError`) to safe user-facing categories. Do NOT forward raw `exc.message` or `str(exc)`.
5. **Validation**: run `bun install` first, then the full pre-commit checklist (`format`, `lint`, `check`, `build`). Backend `manage.py check` must pass. If possible, manually test the chat SSE error path with a deliberately bad model name to confirm error surfacing works.
6. **No new dependencies, no migrations, no schema changes**: none are expected and none are permitted for this fix.

---

## Reviewer security verdict

**VERDICT**: APPROVED
**LENS**: Security
**REVIEW_SCORE**: 3
**Date**: 2026-03-08

### Security goals evaluated

| Goal | Status | Evidence |
|---|---|---|
| 1. Error handling doesn't leak secrets/api_base/raw internals | ✅ PASS | `_safe_error_payload()` maps exception classes to hardcoded user-safe strings; no `str(exc)`, `exc.message`, or `exc.args` forwarded. `logger.exception` at line 366 is server-side only. Critic guardrail (decisions.md:189) fully satisfied. |
| 2. Model override input can't bypass provider constraints dangerously | ✅ PASS | Model string is used only as a JSON field in `litellm.acompletion()` kwargs. No SQL, no shell, no eval, no path traversal. `_is_model_override_compatible()` validates the prefix for standard providers. Gateway providers (`api_base` set) skip the prefix check, which is correct by design; worst case, the provider returns an error caught by the sanitized handler. |
| 3. No auth/permission regressions in send_message | ✅ PASS | `IsAuthenticated` + `get_queryset(user=self.request.user)` unchanged. New `model` param is additive only and doesn't bypass existing validation. Tool execution scopes all DB queries to `user=user`. |
| 4. localStorage stores no sensitive values | ✅ PASS | Key `voyage_chat_model_prefs` stores `{provider_id: model_string}` only. SSR-safe guards present. Try/catch on JSON parse/write. |

### Findings

**CRITICAL**: (none)

**WARNINGS**:
- `[llm_client.py:194,225]` `api_base` field exposed in the provider catalog response to the frontend; pre-existing from the prior consolidated review (decisions.md:103), not newly introduced. Server-defined constants only (not user-controllable), no SSRF. The frontend type includes the field but never renders or uses it. (confidence: MEDIUM)

**SUGGESTIONS**:
1. Consider adding a `max_length` check on the `model` parameter in `views.py:114` (e.g. reject if >200 chars) as defense in depth against pathological inputs, though Django's request size limits provide a baseline guard.
2. Consider omitting `api_base` from the provider catalog response, since the frontend never uses this value (pre-existing; tracked since the prior security review).
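The first suggestion above amounts to a small normalization-plus-cap guard. A minimal sketch, assuming a 200-character cap and the helper name `extract_model_param` (both illustrative choices, not the project's actual code):

```python
MAX_MODEL_LENGTH = 200  # illustrative cap; the real limit is a design choice


def extract_model_param(raw):
    """Normalize the optional 'model' request field and reject pathological input."""
    value = raw or ""
    if not isinstance(value, str):
        return None  # non-string payloads fall back to the provider default
    model = value.strip() or None
    if model is not None and len(model) > MAX_MODEL_LENGTH:
        raise ValueError("model parameter too long")
    return model
```

This keeps the existing falsy-to-None behavior (empty string, whitespace, `None`, `False`, `0` all yield `None`) while bounding adversarially long strings.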

### Prior findings cross-check

- **Critic guardrail** (decisions.md:119-123, "Error surfacing must NOT forward raw exc.message"): **CONFIRMED** — implementation uses class-based dispatch to hardcoded strings.
- **Prior security review** (decisions.md:98-115, api_base exposure, provider validation, IDOR checks): **CONFIRMED** — all findings still valid, no regressions.
- **Explorer model-provider prefix mismatch warning** (plan lines 108-109): **CONFIRMED** — `_is_model_override_compatible()` implements the recommended validation.

### Tracker states

- [x] Security goal 1: sanitized error handling (PASS)
- [x] Security goal 2: model override safety (PASS)
- [x] Security goal 3: auth/permission integrity (PASS)
- [x] Security goal 4: localStorage safety (PASS)

---

## Reviewer correctness verdict

**VERDICT**: APPROVED
**LENS**: Correctness
**REVIEW_SCORE**: 0
**Date**: 2026-03-08

### Requirements verification

| Requirement | Status | Evidence |
|---|---|---|
| Chat composer model selection | ✅ PASS | `AITravelChat.svelte:346-353` — text input bound to `selectedModel`, placed in the composer header next to the provider selector. Disabled when no providers are available. |
| Per-provider browser persistence | ✅ PASS | `loadModelPref`/`saveModelPref` (lines 60-92) use `localStorage` key `voyage_chat_model_prefs`. Provider change loads the saved preference via the `initializedModelProvider` sentinel (lines 94-98). User edits auto-save via the reactive block (lines 100-102). JSON parse errors are caught. SSR guards present. |
| Optional model passed to backend | ✅ PASS | Frontend sends `model: selectedModel.trim() || undefined` (line 173). Backend extracts `model = (request.data.get("model") or "").strip() or None` (views.py:114) and passes `model=model` to `stream_chat_completion` (views.py:150). |
| Model used as override in backend | ✅ PASS | `completion_kwargs["model"] = model or provider_config["default_model"]` (llm_client.py:316). Null/empty correctly falls back to the provider default. |
| No regressions in provider selection/send flow | ✅ PASS | Provider selection, validation, and SSE streaming are unchanged except for the additive `model` param. Error field format is compatible with existing frontend parsing (`parsed.error` at line 210). |
| Error category mapping coherent with frontend | ✅ PASS | Backend `_safe_error_payload` returns `{"error": "...", "error_category": "..."}`. Frontend checks `parsed.error` (human-readable string) and displays it; `error_category` is available for future programmatic use. HTTP 400 errors also use the `err.error` pattern (lines 177-183). |

### Correctness checklist

- **Off-by-one**: N/A; no index arithmetic in the changes.
- **Null/undefined dereference**: `selectedProviderEntry?.default_model ?? ''` and `|| $t(...)` are null-safe. Backend `model or provider_config["default_model"]` is None-safe.
- **Ignored errors**: `try/catch` in `loadModelPref`/`saveModelPref` returns safe defaults. The backend exception handler maps to user-facing messages.
- **Boolean logic**: the reactive guard `initializedModelProvider !== selectedProvider` correctly gates the initialization vs save paths.
- **Async/await**: no new async code in the frontend. The backend `model` param is a synchronous extraction before the async boundary.
- **Race conditions**: none introduced; `selectedModel` is single-threaded Svelte state.
- **Resource leaks**: none; localStorage access is synchronous and stateless.
- **Unsafe defaults**: model defaults to the provider's `default_model` when empty, which is safe.
- **Dead/unreachable branches**: pre-existing `tool_iterations` (views.py:139-141, never incremented); not introduced by this change.
- **Contract violations**: the signature `stream_chat_completion(user, messages, provider, tools=None, model=None)` matches all call sites. `_is_model_override_compatible` returns bool and is used correctly in a conditional.
- **Reactive loop risk**: verified; the `initializedModelProvider` sentinel prevents re-entry between Block 1 (load) and Block 2 (save). `saveModelPref` has no state mutations, so no cascading reactivity.

### Findings

**CRITICAL**: (none)
**WARNINGS**: (none)

**SUGGESTIONS**:
1. `[AITravelChat.svelte:100-102]` The save-on-every-keystroke reactive block calls `saveModelPref` on each character typed. Consider debouncing, or saving on blur/submit, to reduce localStorage churn.
2. `[llm_client.py:107]` `getattr(exceptions, "NotFoundError", tuple())`: `isinstance(exc, ())` is always False by design (graceful fallback). A brief inline comment would clarify the intent for future readers.
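Suggestion 2 refers to a graceful-degradation idiom worth spelling out: looking up an exception class with a `()` fallback makes `isinstance` simply return False when the class is absent, instead of raising `AttributeError`. A minimal sketch, where the `exceptions` namespace is a stand-in for `litellm.exceptions` (here `AuthenticationError` is simulated and `NotFoundError` is deliberately missing):

```python
import types

# Stand-in module simulating litellm.exceptions with one class absent.
exceptions = types.SimpleNamespace(AuthenticationError=PermissionError)


def classify(exc):
    """Class-based dispatch; getattr(..., tuple()) makes missing classes a no-match.

    isinstance(exc, ()) is always False, so if a LiteLLM version lacks a
    class, that branch silently falls through instead of raising.
    """
    if isinstance(exc, getattr(exceptions, "NotFoundError", tuple())):
        return "model_not_found"
    if isinstance(exc, getattr(exceptions, "AuthenticationError", tuple())):
        return "authentication_error"
    return "unknown"
```

The inline comment the reviewer asks for is exactly the docstring note above: the empty-tuple fallback is intentional, not a bug.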

### Prior findings cross-check

- **Critic gate guardrails** (decisions.md:117-124): all 3 guardrails confirmed followed (sanitized errors, `bun install` prerequisite, WSGI migration out of scope).
- **`opencode_zen` default model**: changed from `openai/gpt-4o-mini` → `openai/gpt-5-nano` as prescribed by the researcher findings.
- **`api_base` catalog exposure** (decisions.md:103): pre-existing, unchanged by this change.
- **`tool_iterations` dead guard** (decisions.md:91): pre-existing, not affected by this change.

### Tracker states

- [x] Correctness goal 1: model selection end-to-end (PASS)
- [x] Correctness goal 2: per-provider persistence (PASS)
- [x] Correctness goal 3: model override to backend (PASS)
- [x] Correctness goal 4: no provider/send regressions (PASS)
- [x] Correctness goal 5: error mapping coherence (PASS)

---

## Tester verdict (standard + adversarial)

**STATUS**: PASS
**PASS**: Both (Standard + Adversarial)
**Date**: 2026-03-08

### Commands run

| Command | Result |
|---|---|
| `docker compose exec server python3 manage.py check` | PASS — 0 issues (1 silenced, expected) |
| `bun run check` (frontend) | PASS — 0 errors, 6 warnings (all pre-existing in `CollectionRecommendationView.svelte` + `RegionCard.svelte`, not in changed files) |
| `docker compose exec server python3 manage.py test --keepdb` | 30 tests found; pre-existing failures: 2 user tests (email field key error) + 4 geocoding tests (Google API mock) = 6 failures, matching the documented baseline. No regressions. |
| Chat module static path validation (Django context) | PASS — all 5 targeted checks |
| `bun run build` | Vite compilation PASS (534 modules SSR, 728 client). The EACCES error on the `build/` dir is a pre-existing Docker worktree permission issue, not a compilation failure. |

### Targeted checks verified

- [x] `opencode_zen` default model is `openai/gpt-5-nano` — **CONFIRMED**
- [x] `stream_chat_completion` accepts a `model: str | None = None` parameter — **CONFIRMED**
- [x] Empty/whitespace/falsy `model` values in `views.py` produce `None` (falls back to the provider default) — **CONFIRMED**
- [x] `_safe_error_payload` does NOT leak raw exception text, `api_base`, or sensitive data — **CONFIRMED** (all 6 LiteLLM exception classes mapped to sanitized hardcoded strings)
- [x] `_is_model_override_compatible` skips the prefix check for `api_base` gateways — **CONFIRMED**
- [x] Standard providers reject cross-provider model prefixes — **CONFIRMED**
- [x] `is_chat_provider_available` rejects null, empty, and adversarial provider IDs — **CONFIRMED**
- [x] i18n keys `chat.model_label` and `chat.model_placeholder` present in `en.json` — **CONFIRMED**
- [x] `tools`/`tool_choice` kwargs excluded from `completion_kwargs` when `tools` is falsy — **CONFIRMED**

### Adversarial attempts

| Hypothesis | Test | Expected failure signal | Observed result |
|---|---|---|---|
| 1. Pathological model strings (long/unicode/injection/null-byte) crash `_is_model_override_compatible` | 500-char model, unicode model, SQL-injection model, null-byte model | Exception or incorrect behavior | PASS — no crashes; all return True/False correctly |
| 2. LiteLLM exception classes with sensitive data in the `message` field leak via `_safe_error_payload` | All 6 LiteLLM exception classes instantiated with a sensitive marker string | Sensitive data in the SSE payload | PASS — all 6 classes return sanitized hardcoded payloads |
| 3. Empty/whitespace/falsy model string bypasses `None` conversion in `views.py` | `""`, `" "`, `None`, `False`, `0` passed to the views.py extraction | Model sent as an empty string to LiteLLM | PASS — all produce `None`, triggering the default fallback |
| 4. Some CHAT_PROVIDER_CONFIG provider has `default_model=None` (would send `model=None` to LiteLLM) | Check each provider's `default_model` value | At least one None | PASS — all 9 providers have a non-null `default_model` |
| 5. An unknown provider with no slash in `default_model` causes unintended prefix extraction | Provider not in `PROVIDER_MODEL_PREFIX` + bare `default_model` | Cross-prefix model rejected | PASS — no expected prefix is extracted from a bare default, so it passes through |
| 6. Adversarial provider IDs (`__proto__`, null-byte, SQL injection, path traversal) bypass the availability check | Injected strings to `is_chat_provider_available` | Available=True for an injected ID | PASS — all rejected. Note: `openai\n` returns True because `strip()` normalizes it to `openai` (correct, consistent with views.py normalization). |
| 7. `_merge_tool_call_delta` with `None`, an empty list, or a missing `index` key | Edge-case inputs | Crash or wrong accumulator state | PASS — None/empty are no-ops; a missing index defaults to 0 |
| 7b. A large index (9999) to `_merge_tool_call_delta` causes DoS via huge list allocation | `index=9999` | Memory spike | NOTE (pre-existing, not in scope) — creates a 10000-entry accumulator |
| 8. Model fallback uses `and` instead of `or` | Verify `model or default`, not `model and default` | Wrong model when set | PASS — `model or default` correctly preserves an explicit model |
| 9. `tools=None` causes None kwargs to LiteLLM | Verify the conditional exclusion | `tool_choice=None` in kwargs | PASS — the `if tools:` guard correctly excludes both kwargs when None |
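The accumulator behavior probed in hypotheses 7 and 7b can be sketched as below. This is an illustrative reconstruction of a streaming tool-call delta merger, not the project's verbatim `_merge_tool_call_delta`; the dict field names are assumptions modeled on the OpenAI streaming delta shape.

```python
def merge_tool_call_delta(acc, deltas):
    """Merge streamed tool-call deltas into an accumulator list.

    None or empty deltas are no-ops; a missing 'index' defaults to 0.
    The list grows to index+1 entries, which is the (pre-existing,
    out-of-scope) large-index surface noted in hypothesis 7b.
    """
    for delta in deltas or []:
        idx = delta.get("index", 0)
        while len(acc) <= idx:
            acc.append({"id": None, "name": "", "arguments": ""})
        entry = acc[idx]
        if delta.get("id"):
            entry["id"] = delta["id"]
        fn = delta.get("function") or {}
        if fn.get("name"):
            entry["name"] += fn["name"]
        if fn.get("arguments"):
            entry["arguments"] += fn["arguments"]
    return acc
```

A mutation such as `len(acc) < idx` would under-allocate by one entry and crash on the first delta against an empty accumulator, which is why the off-by-one mutation below is caught.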

### Mutation checks

| Mutation | Critical logic | Detected by tests? |
|---|---|---|
| `_is_model_override_compatible`: `not model OR api_base` → `not model AND api_base` | Gateway bypass | DETECTED — test covers the api_base-set + model-set case |
| `_merge_tool_call_delta`: `len(acc) <= idx` → `len(acc) < idx` | Off-by-one in accumulator growth | DETECTED — index=0 on an empty list tested |
| `completion_kwargs["model"]`: `model or default` → `model and default` | Model fallback | DETECTED — both the None and set-model cases tested |
| `is_chat_provider_available` negation | Provider validation gate | DETECTED — True and False cases both verified |
| `_safe_error_payload` exception dispatch order | Error sanitization | DETECTED — LiteLLM exception MRO verified, no problematic inheritance |

**MUTATION_ESCAPES: 0/5**

### Findings

**CRITICAL**: (none)

**WARNINGS** (pre-existing, not introduced by this change):
- `_merge_tool_call_delta` large index: no upper bound on accumulator size (pre-existing DoS surface; not in scope per the critic gate)
- `tool_iterations` never incremented (pre-existing dead guard; not in scope)

**SUGGESTIONS** (carried forward from the reviewer):
1. Debounce `saveModelPref` on the model input (every-keystroke localStorage writes)
2. Add a clarifying comment on the `getattr(exceptions, "NotFoundError", tuple())` fallback pattern

### Task tracker update

- [x] Standard validation run and targeted chat-path checks (Agent: tester) — PASS
- [x] Documentation and knowledge sync for provider troubleshooting notes (Agent: librarian) — COMPLETE

---

## Librarian coverage verdict

**STATUS**: COMPLETE
**Date**: 2026-03-08

### Files updated

| File | Changes | Reason |
|---|---|---|
| `README.md` | Added model selection, error handling, and the `gpt-5-nano` default to the AI Chat section | User-facing docs now reflect the model override and error surfacing features |
| `docs/docs/usage/usage.md` | Added model override and error messaging to the AI Travel Chat section | Usage guide now covers the model input and error behavior |
| `.memory/knowledge.md` | Added 3 new sections: Chat Model Override Pattern, Sanitized LLM Error Mapping, OpenCode Zen Provider. Updated the AI Chat section with model override + error mapping refs. Updated the known-issues baseline (0 errors/6 warnings, 6/30 test failures). | Canonical project knowledge now covers all new patterns for future sessions |
| `AGENTS.md` | Added model override + error surfacing to the AI chat description and Key Patterns. Updated the known-issues baseline. | OpenCode instruction file synced |
| `CLAUDE.md` | Same changes as AGENTS.md (AI chat description, key patterns, known issues) | Claude Code instruction file synced |
| `.github/copilot-instructions.md` | Added model override + error surfacing to the AI Chat description. Updated known-issues + command-output baselines. | Copilot instruction file synced |
| `.cursorrules` | Updated the known-issues baseline. Added chat model override + error surfacing conventions. | Cursor instruction file synced |

### Knowledge propagation

- **Inward merge**: no new knowledge found in instruction files that wasn't already in `.memory/`. All instruction files were behind the `.memory/` state.
- **Outward sync**: all 4 instruction files updated with: (1) the model override pattern, (2) sanitized error mapping, (3) the `opencode_zen` default model `openai/gpt-5-nano`, (4) the corrected known-issues baseline.
- **Cross-references**: knowledge.md links to the plan file for model selection details and to decisions.md for the critic gate guardrail. The new sections cross-reference each other (error mapping → decisions.md, model override → plan).

### Not updated (out of scope)

- `docs/architecture.md` — stub file; the model override is an implementation detail, not architectural. The chat app entry already exists.
- `docs/docs/guides/travel_agent.md` — MCP endpoint docs; unrelated to in-app chat model selection.
- `docs/docs/configuration/advanced_configuration.md` — chat uses per-user API keys (no server-side env vars); no config changes to document.

### Task tracker

- [x] Documentation and knowledge sync for provider troubleshooting notes (Agent: librarian)
36
.memory/plans/pre-release-and-memory-migration.md
Normal file
@@ -0,0 +1,36 @@
# Plan: Pre-release policy + .memory migration

## Scope

- Update project instruction files to treat Voyage as pre-release (no production compatibility constraints yet).
- Migrate `.memory/` to the standardized structure defined in the AGENTS guidance.

## Tasks

- [x] Add pre-release policy guidance in instruction files (`AGENTS.md` + synced counterparts).
  - **Acceptance**: explicit statement that architecture-level changes (including replacing LiteLLM) are allowed in pre-release, with a preference for correctness over backward compatibility.
  - **Agent**: librarian
  - **Note**: added an identical "Pre-Release Policy" section to all 4 instruction files (AGENTS.md, CLAUDE.md, .cursorrules, .github/copilot-instructions.md). Also updated the `.memory Files` section in AGENTS.md, CLAUDE.md, and .cursorrules to reference the new nested structure.

- [x] Migrate `.memory/` to the standard structure.
  - **Acceptance**: standardized directories/files exist (`manifest.yaml`, `system.md`, `knowledge/*`, `plans/`, `research/`, `gates/`, `sessions/`), prior knowledge is preserved/mapped, and manifest entries are updated.
  - **Agent**: librarian
  - **Note**: decomposed `knowledge.md` (578 lines) into 7 nested files. The old `knowledge.md` is marked DEPRECATED with pointers. Manifest updated with all new entries. Created `gates/` and `sessions/continuity.md`.

- [x] Validate migration quality.
  - **Acceptance**: no broken references in the migrated memory docs; a concise migration note is included in the plan.
  - **Agent**: librarian
  - **Note**: cross-references updated in decisions.md (knowledge.md -> knowledge/overview.md). All new files cross-link to decisions.md, plans/, and each other.

## Migration Map (old -> new)

| Old location | New location | Content |
|---|---|---|
| `knowledge.md` §Project Overview | `system.md` | One-paragraph project overview |
| `knowledge.md` §Architecture, §Services, §Auth, §Key File Locations | `knowledge/overview.md` | Architecture, API proxy, AI chat, services, auth, file locations |
| `knowledge.md` §Dev Commands, §Pre-Commit, §Environment, §Known Issues | `knowledge/tech-stack.md` | Stack, commands, env vars, known issues |
| `knowledge.md` §Key Patterns | `knowledge/conventions.md` | Frontend/backend coding patterns, workflow conventions |
| `knowledge.md` §Chat Model Override, §Error Mapping, §OpenCode Zen, §Agent Tools, §Backend Chat Endpoints, §WS4, §Context Derivation | `knowledge/patterns/chat-and-llm.md` | All chat/LLM implementation patterns |
| `knowledge.md` §Collection Sharing, §Itinerary, §User Preferences | `knowledge/domain/collections-and-sharing.md` | Collections domain knowledge |
| `knowledge.md` §WS1 Config, §Frontend Gaps | `knowledge/domain/ai-configuration.md` | AI configuration domain |
| (new) | `sessions/continuity.md` | Session continuity notes |
| (new) | `gates/.gitkeep` | Quality gates directory placeholder |
| `knowledge.md` | `knowledge.md` (DEPRECATED) | Deprecation notice with pointers to new locations |
675
.memory/plans/travel-agent-context-and-models.md
Normal file
@@ -0,0 +1,675 @@
# Plan: Travel Agent Context + Models Follow-up

## Scope

Address three follow-up issues in the collection-level AI Travel Assistant:
1. The provider model dropdown only shows one option.
2. Chat context appears location-centric instead of full-trip/collection-centric.
3. Suggested prompts still assume a single location instead of itinerary-wide planning.

## Tasks

- [x] **F1 — Expand model options for OpenCode Zen provider**
  - **Acceptance criteria**:
    - The model dropdown offers multiple valid options for `opencode_zen` (not just one hardcoded value).
    - Options are sourced in a maintainable way (backend-side).
    - Selecting an option is sent through the existing `model` override path.
  - **Agent**: explorer → coder → reviewer → tester
  - **Dependencies**: discovery of current `/api/chat/providers/{id}/models/` behavior.
  - **Workstream**: `main` (follow-up bugfix set)
  - **Implementation note (2026-03-09)**: updated `ChatProviderCatalogViewSet.models()` in `backend/server/chat/views/__init__.py` to return a curated multi-model list for `opencode_zen` (OpenAI + Anthropic options), excluding `openai/o1-preview` and `openai/o1-mini` per the critic guardrail.

- [x] **F2 — Correct chat context to reflect the full trip/collection**
  - **Acceptance criteria**:
    - Assistant guidance/prompt context emphasizes the full collection itinerary and date window.
    - Tool calls for planning are grounded in trip-level context (not only one location label).
    - No regression in existing collection-context fields.
  - **Agent**: explorer → coder → reviewer → tester
  - **Dependencies**: discovery of system prompt + tool context assembly.
  - **Workstream**: `main`
  - **Implementation note (2026-03-09)**: updated the frontend `deriveCollectionDestination()` to summarize unique itinerary stops (city/country-first with fallback names, compact cap), enriched backend `send_message()` trip context with collection-derived multi-stop itinerary data from `collection.locations`, and added explicit system-prompt guidance to treat collection chats as trip-level and to call `get_trip_details` before location search when additional context is needed.

- [x] **F3 — Make suggested prompts itinerary-centric**
  - **Acceptance criteria**:
    - Quick-action prompts no longer require/assume a single destination.
    - Prompts read naturally for multi-city/multi-country collections.
  - **Agent**: explorer → coder → reviewer → tester
  - **Dependencies**: discovery of prompt rendering logic in `AITravelChat.svelte`.
  - **Workstream**: `main`
  - **Implementation note (2026-03-09)**: updated the `AITravelChat.svelte` quick-action guard to use `collectionName || destination` context and itinerary-focused wording for the Restaurants/Activities prompts; fixed `search_places` tool-result parsing by changing `.places` reads to the backend-aligned `.results` in both `hasPlaceResults()` and `getPlaceResults()`, restoring place-card rendering and Add-to-Itinerary actions.

## Notes

- The user-provided trace in `agent-interaction.txt` shows location-heavy responses and a `{"error":"location is required"}` tool failure during the itinerary add flow.

---

## Discovery Findings

### F1 — Model dropdown shows only one option

**Root cause**: `backend/server/chat/views/__init__.py` lines 417–418, `ChatProviderCatalogViewSet.models()`:
```python
if provider in ["opencode_zen"]:
    return Response({"models": ["openai/gpt-5-nano"]})
```
The `opencode_zen` branch returns a single-element list. All other non-matched providers fall through to `return Response({"models": []})` (line 420).

**Frontend loading path** (`AITravelChat.svelte` lines 115–142, `loadModelsForProvider()`):
- `GET /api/chat/providers/{provider}/models/` → sets `availableModels = data.models`.
- When the list has exactly one item, the dropdown shows only that item (correct DaisyUI `<select>`, lines 599–613).
- `availableModels.length === 0` → shows a single "Default" option (line 607), so both the zero-model and one-model paths surface as a one-option dropdown.

**Also**: the `models` endpoint (lines 339–426) requires an API key and returns HTTP 403 if it is absent; the frontend silently sets `availableModels = []` on any non-OK response (lines 136–138), so users without a key see "Default" only, regardless of provider.

**Edit points**:
- `backend/server/chat/views/__init__.py` lines 417–418: expand the `opencode_zen` model list to include Zen-compatible models (e.g. `openai/gpt-5-nano`, `openai/gpt-4o-mini`, `openai/gpt-4o`, `anthropic/claude-3-5-haiku-20241022`).
- Optionally: `AITravelChat.svelte` `loadModelsForProvider()` — handle a non-OK response more gracefully (log a distinct error instead of silently falling back to empty).
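The backend edit point above can be sketched as a curated, backend-side mapping plus a payload helper. The model entries are the illustrative candidates named above (their availability on Zen is an assumption), and `models_response` is a stand-in for the DRF view body, not its verbatim code.

```python
# Curated per-provider model lists, kept backend-side so the frontend
# dropdown stays maintainable without hardcoding choices in Svelte.
CURATED_MODELS = {
    "opencode_zen": [
        "openai/gpt-5-nano",
        "openai/gpt-4o-mini",
        "openai/gpt-4o",
        "anthropic/claude-3-5-haiku-20241022",
    ],
}


def models_response(provider):
    """Shape of the /api/chat/providers/{id}/models/ payload: unknown providers get []."""
    return {"models": CURATED_MODELS.get(provider, [])}
```

Keeping the list in one dict means adding or retiring a Zen model is a one-line change, and the existing `model` override path consumes the selection unchanged.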

---
|
||||
|
||||
### F2 — Context appears location-centric, not trip-centric
|
||||
|
||||
**Root cause — `destination` prop is a single derived location string**:
|
||||
|
||||
`frontend/src/routes/collections/[id]/+page.svelte` lines 259–278, `deriveCollectionDestination()`:
|
||||
```ts
|
||||
const firstLocation = current.locations.find(...)
|
||||
return `${cityName}, ${countryName}` // first location only
|
||||
```
|
||||
Only the **first** location in `collection.locations` is used. Multi-city trips surface a single city/country string.
|
||||
|
||||
**How it propagates** (`+page.svelte` lines 1287–1294):

```svelte
<AITravelChat
  destination={collectionDestination} // ← single-location string
  ...
/>
```

**Backend trip context** (`backend/server/chat/views/__init__.py` lines 144–168, `send_message`):

```python
context_parts = []
if collection_name: context_parts.append(f"Trip: {collection_name}")
if destination: context_parts.append(f"Destination: {destination}")  # ← single string
if start_date and end_date: context_parts.append(f"Dates: ...")
system_prompt += "\n\n## Trip Context\n" + "\n".join(context_parts)
```

The `Destination:` line is a single string from the frontend — no multi-stop awareness. The `collection` object IS fetched from the DB (lines 152–164) and passed to `get_system_prompt(user, collection)`, but `get_system_prompt` (`llm_client.py` lines 310–358) only uses `collection` to decide single-user vs. party preferences — it never reads collection locations, itinerary, or dates from the collection model itself.

**Edit points**:

1. `frontend/src/routes/collections/[id]/+page.svelte` `deriveCollectionDestination()` (lines 259–278): derive a multi-location string (e.g., a joined list of unique city/country pairs, capped at 4–5) rather than the first location only; alternatively, rename it to make clear it is itinerary-wide and return `undefined` when the collection has many diverse destinations.
2. `backend/server/chat/views/__init__.py` `send_message()` (lines 144–168): since `collection` is already fetched, enrich `context_parts` directly from `collection.locations` (unique cities/countries) rather than relying solely on the single-string `destination` param.
3. Optionally, `backend/server/chat/llm_client.py` `get_system_prompt()` (lines 310–358): when `collection` is not None, add a collection-derived section to the base prompt listing all itinerary destinations and dates from the collection object.

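Edit point 1 can be sketched as a pure function. This is a sketch over a simplified, hypothetical location shape (`city`/`country` as plain strings), not the component's real types:

```typescript
// Illustrative multi-stop destination derivation: unique "City, Country"
// labels, falling back to the raw location name, capped with a "+N more" tail.
interface LocationLike {
  city?: string;
  country?: string;
  name?: string;
}

function deriveMultiStopDestination(
  locations: LocationLike[],
  maxStops = 4,
): string | undefined {
  const seen = new Set<string>();
  const stops: string[] = [];
  for (const loc of locations) {
    const city = (loc.city ?? '').trim();
    const country = (loc.country ?? '').trim();
    // Prefer "City, Country"; fall back to country, city, or the raw name.
    const label =
      city && country ? `${city}, ${country}` : country || city || (loc.name ?? '').trim();
    if (!label || seen.has(label)) continue; // skip blanks and duplicates
    seen.add(label);
    stops.push(label);
  }
  if (stops.length === 0) return undefined;
  const shown = stops.slice(0, maxStops).join('; ');
  return stops.length > maxStops ? `${shown}; +${stops.length - maxStops} more` : shown;
}
```

Returning `undefined` (rather than `''`) for empty input lets the caller keep the existing "no destination" guards unchanged.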
---

### F3 — Quick-action prompts assume a single destination

**Root cause — all destination-dependent prompts are gated on the `destination` prop** (`AITravelChat.svelte` lines 766–804):

```svelte
{#if destination}
  <button>🍽️ Restaurants in {destination}</button>
  <button>🎯 Activities in {destination}</button>
{/if}
{#if startDate && endDate}
  <button>🎒 Packing tips for {startDate} to {endDate}</button>
{/if}
<button>📅 Itinerary help</button> <!-- always shown, generic -->
```

The "Restaurants" and "Activities" buttons are hidden when no `destination` is derived (a multi-city trip with no single dominant location), and their prompt strings hard-code `${destination}` — a single-city reference. They also don't reference the collection name or the multi-stop nature of the trip.

**Edit points** (`AITravelChat.svelte` lines 766–804):

1. Replace the `{#if destination}` guard for the restaurant/activity buttons with a `{#if collectionName || destination}` guard.
2. Change the prompt strings to use `collectionName` as primary context, falling back to `destination`:
   - `What are the best restaurants for my trip to ${collectionName || destination}?`
   - `What activities are there across my ${collectionName} itinerary?`
3. Add a "Budget" or "Transport" quick action that references the collection dates + itinerary scope (doesn't need `destination`).
4. The "📅 Itinerary help" button (lines 797–804) sends `'Can you help me plan a day-by-day itinerary for this trip?'` — already collection-neutral; no change needed.
5. The packing-tip prompt (lines 788–795) already uses `startDate`/`endDate` without `destination` — this one is already correct.

---

### Cross-cutting risk: `destination` prop semantics are overloaded

The `destination` prop in `AITravelChat.svelte` is used for:

- Header subtitle display (line 582: removed in current code — subtitle block gone)
- Quick-action prompt strings (lines 771, 779)
- `send_message` payload (line 268: `destination`)

Changing `deriveCollectionDestination()` to return a multi-location string affects all three uses. The header display is currently suppressed (no `{destination}` in the HTML header block after the WS4-F4 changes), so that path is safe. The `send_message` backend receives it as the `Destination:` context line, which is acceptable for a multi-city string.

### No regression surface from `loadModelsForProvider` reactive trigger

The `$: if (selectedProvider) { void loadModelsForProvider(); }` reactive statement (lines 190–192) fires whenever `selectedProvider` changes. Expanding the `opencode_zen` model list won't affect other providers. The `loadModelPref`/`saveModelPref` localStorage path is independent of model list size.

### `add_to_itinerary` tool `location` required error (from Notes)

The `search_places` tool (`agent_tools.py`) requires a `location` string param. When the LLM calls it with no location (because context only mentions a trip name, not a geocodable string), the tool returns `{"error": "location is required"}`. This is downstream of F2 — fixing the context so the LLM receives actual geocodable location strings will reduce these errors, but the tool itself should also be documented as requiring a geocodable string.

---

## Deep-Dive Findings (explorer pass 2 — 2026-03-09)

### F1: Exact line for single-model fix

`backend/server/chat/views/__init__.py` **lines 417–418**:

```python
if provider in ["opencode_zen"]:
    return Response({"models": ["openai/gpt-5-nano"]})
```

A single-entry hard-coded list; no Zen API call is made. Expand to all Zen-compatible models.

**Recommended minimal list** (OpenAI-compatible pass-through documented for Zen):

```python
return Response({"models": [
    "openai/gpt-5-nano",
    "openai/gpt-4o-mini",
    "openai/gpt-4o",
    "openai/o1-preview",
    "openai/o1-mini",
    "anthropic/claude-sonnet-4-20250514",
    "anthropic/claude-3-5-haiku-20241022",
]})
```

---

### F2: System prompt never injects collection locations into context

`backend/server/chat/views/__init__.py` lines **144–168** (`send_message`): `collection` is fetched from the DB but only passed to `get_system_prompt()` for preference aggregation — its `.locations` queryset is never read to enrich context.

`backend/server/chat/llm_client.py` lines **310–358** (`get_system_prompt`): the `collection` param is only used for the `shared_with` preference branch. Zero use of `collection.locations`, `.start_date`, `.end_date`, or `.itinerary_items`.

**Minimal fix — inject into context_parts in `send_message`**:

After line 164 (`collection = requested_collection`), add:

```python
if collection:
    loc_names = list(collection.locations.values_list("name", flat=True)[:8])
    if loc_names:
        context_parts.append(f"Locations in this trip: {', '.join(loc_names)}")
```

Also strengthen the base system prompt in `llm_client.py` to instruct the model to call `get_trip_details` when operating in collection context before calling `search_places`.

---

### F3a: Frontend `hasPlaceResults` / `getPlaceResults` use wrong key `.places` — cards never render

**Critical bug** — `AITravelChat.svelte`:

- **Line 377**: checks `(result.result as { places?: unknown[] }).places` — should be `results`
- **Line 386**: returns `(result.result as { places: any[] }).places` — should be `results`

Backend `search_places` (`agent_tools.py` lines 188–192) returns:

```python
return {"location": location_name, "category": category, "results": results}
```

The key is `results`, not `places`. Because `hasPlaceResults` always returns `false`, the "Add to Itinerary" button on place cards is **never rendered** for any real tool output. The `<pre>` JSON fallback block shows instead.

**Minimal fix**: change both `.places` references → `.results` in `AITravelChat.svelte` lines 377 and 386.

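The corrected guards can be sketched as standalone type-safe helpers. A sketch: `ToolResult` and the free-function signatures are illustrative — the component's versions live inline in `AITravelChat.svelte` and also gate on the tool name:

```typescript
// Illustrative shape for a tool-call result; not the component's actual type.
interface ToolResult {
  name: string;
  result: unknown;
}

function hasPlaceResults(result: ToolResult): boolean {
  if (result.name !== 'search_places') return false;
  if (typeof result.result !== 'object' || result.result === null) return false;
  // Backend returns {"location": ..., "category": ..., "results": [...]}
  return Array.isArray((result.result as { results?: unknown[] }).results);
}

function getPlaceResults(result: ToolResult): unknown[] {
  if (!hasPlaceResults(result)) return [];
  return (result.result as { results: unknown[] }).results;
}
```

Keeping `getPlaceResults` delegating to `hasPlaceResults` means the two can never disagree on which key they read.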
---

### F3b: `{"error": "location is required"}` origin

`backend/server/chat/agent_tools.py` **line 128**:

```python
if not location_name:
    return {"error": "location is required"}
```

Triggered when the LLM calls `search_places({})` with no `location` argument — which happens when the system prompt only contains a non-geocodable trip name (e.g., `Destination: Rome Trip 2025`) without actual city/place strings.

This error surfaces in the SSE stream → rendered as a tool result card with `{"error": "..."}` text.

**Fix**: resolved by F2 (richer context); also improve the guard message to be user-safe: `"Please provide a location or city name to search near."`.

---

### Summary of edit points

| Issue | File | Lines | Change |
|---|---|---|---|
| F1: expand opencode_zen models | `backend/server/chat/views/__init__.py` | 417–418 | Replace 1-item list with 7-item list |
| F2: inject collection locations | `backend/server/chat/views/__init__.py` | 144–168 | Add `loc_names` context_parts after line 164 |
| F2: reinforce system prompt | `backend/server/chat/llm_client.py` | 314–332 | Add guidance to use `get_trip_details` in collection context |
| F3a: fix `.places` → `.results` | `frontend/src/lib/components/AITravelChat.svelte` | 377, 386 | Two-char key rename |
| F3b: improve error guard | `backend/server/chat/agent_tools.py` | 128 | Better user-safe message (optional) |

---

## Critic Gate

- **Verdict**: APPROVED
- **Date**: 2026-03-09
- **Reviewer**: critic agent

### Assumption Challenges

1. **F2 `values_list("name")` may not produce geocodable strings** — `Location.name` can be opaque (e.g., "Eiffel Tower"). Mitigated: the plan already proposes system prompt guidance to call `get_trip_details` first. Enhancement: use `city__name`/`country__name` in addition to `name` for the injected context.
2. **F3a `.places` vs `.results` key mismatch** — confirmed real bug. `agent_tools.py` returns the `results` key; the frontend checks `places`. Place cards never render. Two-char fix validated.

### Execution Guardrails

1. **Sequencing**: F1 (independent) → F2 (context enrichment) → F3 (prompts + `.places` fix). F3 depends on F2's `deriveCollectionDestination` changes.
2. **F1 model list**: Exclude `openai/o1-preview` and `openai/o1-mini` — reasoning models may not support tool-use in streaming chat. Verify compatibility before including.
3. **F2 context injection**: Use `select_related('city', 'country')` or `values_list('name', 'city__name', 'country__name')` — bare `name` alone is insufficient for geocoding context.
4. **F3a is atomic**: The `.places`→`.results` fix is a standalone bug, separate from prompt wording changes. Can bundle in F3's review cycle.
5. **Quality pipeline**: Each fix gets a reviewer + tester pass. No batch validation.
6. **Functional verification required**: (a) model dropdown shows multiple options, (b) chat context includes multi-city info, (c) quick-action prompts render for multi-location collections, (d) search result place cards actually render (F3a).
7. **Decomposition**: A single workstream is appropriate — tightly coupled bugfixes in the same component/view pair, not independent services.

---

## F1 Review

- **Verdict**: APPROVED (score 0)
- **Lens**: Correctness
- **Date**: 2026-03-09
- **Reviewer**: reviewer agent

**Scope**: `backend/server/chat/views/__init__.py` lines 417–428 — `opencode_zen` model list expanded from 1 to 5 entries.

**Findings**: No CRITICAL or WARNING issues. The change is minimal and correctly scoped.

**Verified**:

- Critic guardrail followed: `o1-preview` and `o1-mini` excluded (reasoning models, no streaming tool-use).
- All 5 model IDs use valid LiteLLM `provider/model` format; `anthropic/*` IDs match exact entries in the Anthropic branch.
- `_is_model_override_compatible()` bypasses the prefix check for `api_base` gateways — all IDs pass validation.
- No regression in other provider branches (openai, anthropic, gemini, groq, ollama) — all untouched.
- Frontend `loadModelsForProvider()` handles multi-item arrays correctly; the dropdown will show all 5 options.
- localStorage model persistence is unaffected by the list size change.

**Suggestion**: Add an inline comment on why o1-preview/o1-mini are excluded, to prevent future re-addition.

**Reference**: See [Critic Gate](#critic-gate), [decisions.md](../decisions.md#critic-gate-travel-agent-context--models-follow-up)

---

## F1 Test

- **Verdict**: PASS (Standard + Adversarial)
- **Date**: 2026-03-09
- **Tester**: tester agent

### Commands run

| # | Command | Exit code | Output |
|---|---|---|---|
| 1 | `docker compose exec server python3 -m py_compile /code/chat/views/__init__.py` | 0 | (no output — syntax OK) |
| 2 | Inline `python3 -c` assertion of `opencode_zen` branch | 0 | count: 5, all 5 model IDs confirmed present, PASS |
| 3 | Adversarial: branch isolation for 8 non-`opencode_zen` providers | 0 | All return `[]`, ADVERSARIAL PASS |
| 4 | Adversarial: critic guardrail + LiteLLM format check | 0 | `o1-preview` / `o1-mini` absent; all IDs in `provider/model` format, PASS |
| 5 | `docker compose exec server python3 -c "import chat.views; ..."` | 0 | Module import OK, `ChatProviderCatalogViewSet.models` action present |
| 6 | `docker compose exec server python3 manage.py test --verbosity=1 --keepdb` | 1 (pre-existing) | 30 tests: 24 pass, 1 fail, 5 errors — identical to known baseline (2 user email key + 4 geocoding mock). **Zero new failures.** |

### Key findings

- The `opencode_zen` branch now returns exactly 5 models: `openai/gpt-5-nano`, `openai/gpt-4o-mini`, `openai/gpt-4o`, `anthropic/claude-sonnet-4-20250514`, `anthropic/claude-3-5-haiku-20241022`.
- Critic guardrail respected: `openai/o1-preview` and `openai/o1-mini` absent from the list.
- All model IDs use valid `provider/model` format compatible with LiteLLM routing.
- No other provider branches affected.
- No regression in the full Django test suite beyond the pre-existing baseline.

### Adversarial attempts

- **Case-insensitive match (`OPENCODE_ZEN`)**: does not match branch → returns `[]` (correct; exact case match required).
- **Partial match (`opencode_zen_extra`)**: does not match → returns `[]` (correct; no prefix leakage).
- **Empty string provider `""`**: returns `[]` (correct).
- **`openai/o1-preview` inclusion check**: absent from list (critic guardrail upheld).
- **`openai/o1-mini` inclusion check**: absent from list (critic guardrail upheld).

### MUTATION_ESCAPES: 0/4

All critical branch mutations checked: wrong provider name, case variation, extra-suffix variation, empty string — all correctly return `[]`. The 5-model list is hard-coded, so count drift would be immediately caught by assertion.

### LESSON_CHECKS

- Pre-existing test failures (2 user + 4 geocoding) — **confirmed**, baseline unchanged.

---

## F2 Review

- **Verdict**: APPROVED (score 0)
- **Lens**: Correctness
- **Date**: 2026-03-09
- **Reviewer**: reviewer agent

**Scope**: F2 — correct chat context to reflect the full trip/collection. Three files changed:

- `frontend/src/routes/collections/[id]/+page.svelte` (lines 259–300): `deriveCollectionDestination()` rewritten from first-location-only to a multi-stop itinerary summary.
- `backend/server/chat/views/__init__.py` (lines 166–199): `send_message()` enriched with collection-derived `Itinerary stops:` context from `collection.locations`.
- `backend/server/chat/llm_client.py` (lines 333–336): system prompt updated with trip-level reasoning guidance and a `get_trip_details`-first instruction.

**Acceptance criteria verified**:

1. ✅ Frontend derives a multi-stop destination string (unique city/country pairs, capped at 4, semicolon-joined, `+N more` overflow).
2. ✅ Backend enriches the system prompt with `Itinerary stops:` from collection locations (up to 8, `select_related('city', 'country')` for efficiency).
3. ✅ System prompt instructs trip-level reasoning and `get_trip_details`-first behavior (tool confirmed to exist in `agent_tools.py`).
4. ✅ No regression: non-collection chats, single-location collections, and empty-location collections are all handled correctly via guard conditions.

**Findings**: No CRITICAL or WARNING issues. Two minor suggestions (dead guard on line 274 of `+page.svelte`; undocumented cap constant in `views/__init__.py` line 195).

**Prior guidance**: Critic gate recommendation to use `select_related('city', 'country')` and city/country names — confirmed followed.

**Reference**: See [Critic Gate](#critic-gate), [F1 Review](#f1-review)

---

## F2 Test

- **Verdict**: PASS (Standard + Adversarial)
- **Date**: 2026-03-09
- **Tester**: tester agent

### Commands run

| # | Command | Exit code | Output summary |
|---|---|---|---|
| 1 | `bun run check` (frontend) | 0 | 0 errors, 6 warnings — all 6 are pre-existing in `CollectionRecommendationView.svelte` + `RegionCard.svelte`; no new issues from F2 changes |
| 2 | `docker compose exec server python3 -m py_compile /code/chat/views/__init__.py` | 0 | Syntax OK |
| 3 | `docker compose exec server python3 -m py_compile /code/chat/llm_client.py` | 0 | Syntax OK |
| 4 | Backend functional enrichment test (mock collection, 6 inputs → 5 unique stops) | 0 | `Itinerary stops: Rome, Italy; Florence, Italy; Venice, Italy; Switzerland; Eiffel Tower` — multi-stop line confirmed |
| 5 | Adversarial backend: 7 cases (cap-8, empty, all-blank, whitespace, unicode, dedup-12, None city) | 0 | All 7 PASS |
| 6 | Frontend JS adversarial: 7 cases (multi-stop, single, null, empty, overflow +N, fallback, all-blank) | 0 | All 7 PASS |
| 7 | System prompt phrase check | 0 | `itinerary-wide` + `get_trip_details` + `Treat context as itinerary-wide` all confirmed present |
| 8 | `docker compose exec server python3 manage.py test --verbosity=1 --keepdb` | 1 (pre-existing) | 30 tests: 24 pass, 1 fail, 5 errors — **identical to known baseline**; zero new failures |

### Acceptance criteria verdict

| Criterion | Result | Evidence |
|---|---|---|
| Multi-stop destination string derived in frontend | ✅ PASS | JS test: 3-city collection → `Rome, Italy; Florence, Italy; Venice, Italy`; 6-city → `A, X; B, X; C, X; D, X; +2 more` |
| Backend injects `Itinerary stops:` from `collection.locations` | ✅ PASS | Python test: 6 inputs → 5 unique stops joined with `; `, correctly prefixed `Itinerary stops:` |
| System prompt has trip-level + `get_trip_details`-first guidance | ✅ PASS | `get_system_prompt()` output contains `itinerary-wide`, `get_trip_details first`, `Treat context as itinerary-wide` |
| No regression in existing fields | ✅ PASS | Django test suite unchanged at baseline (24 pass, 6 pre-existing fail/error) |

### Adversarial attempts

| Hypothesis | Test | Expected failure signal | Observed |
|---|---|---|---|
| 12-city collection exceeds cap | Supply 12 unique cities | >8 stops returned | Capped at exactly 8 ✅ |
| Empty `locations` list | Pass `locations=[]` | Crash or non-empty result | Returns `undefined`/`[]` cleanly ✅ |
| All-blank location entries | All city/country/name empty or whitespace | Non-empty or crash | All skipped, returns `undefined`/`[]` ✅ |
| Whitespace-only city/country | `city.name=' '` with valid fallback | Whitespace treated as valid | Strip applied, fallback used ✅ |
| Unicode city names | `東京`, `Zürich`, `São Paulo` | Encoding corruption or skip | All 3 preserved correctly ✅ |
| 12 duplicate identical entries | Same city×12 | Multiple copies in output | Deduped to exactly 1 ✅ |
| `city.name = None` (DB null) | `None` city name, valid country | `AttributeError` or crash | Handled via `or ''` guard, country used ✅ |
| `null` collection passed to frontend func | `deriveCollectionDestination(null)` | Crash | Returns `undefined` cleanly ✅ |
| Overflow suffix formatting | 6 unique stops, maxStops=4 | Wrong suffix or missing | `+2 more` suffix correct ✅ |
| Fallback name path | No city/country, `location='Eiffel Tower'` | Missing or wrong label | `Eiffel Tower` used ✅ |

### MUTATION_ESCAPES: 0/6

Mutation checks applied:

1. `>= 8` cap mutated to `> 8` → A1 test (12-city produces 8, not 9) would catch.
2. `seen_stops` dedup check mutated to always-false → A6 test (12-dupes) would catch.
3. `or ''` null-guard on `city.name` removed → A7 test would catch `AttributeError`.
4. `if not fallback_name: continue` removed → A3 test (all-blank) would catch spurious entries.
5. `stops.slice(0, maxStops).join('; ')` separator mutated to `', '` → multi-stop tests check for `'; '` as separator.
6. `return undefined` on empty guard mutated to `return ''` → A4 empty-locations test checks `=== undefined`.

All 6 mutations would be caught by existing test cases.

### LESSON_CHECKS

- Pre-existing test failures (2 user email key + 4 geocoding mock) — **confirmed**, baseline unchanged.
- F2 context enrichment using `select_related('city', 'country')` per critic guardrail — **confirmed** (lines 169–171 of views/__init__.py).
- Fallback to `location`/`name` fields when geo data absent — **confirmed** working via A4/A5 tests.

**Reference**: See [F2 Review](#f2-review), [Critic Gate](#critic-gate)

---

## F3 Review

- **Verdict**: APPROVED (score 0)
- **Lens**: Correctness
- **Date**: 2026-03-09
- **Reviewer**: reviewer agent

**Scope**: Targeted re-review of two F3 findings in `frontend/src/lib/components/AITravelChat.svelte`:

1. `.places` → `.results` key mismatch in `hasPlaceResults()` / `getPlaceResults()`
2. Quick-action prompt guard and wording — location-centric → itinerary-centric

**Finding 1 — `.places` → `.results` (RESOLVED)**:

- `hasPlaceResults()` (line 378): checks `(result.result as { results?: unknown[] }).results` ✅
- `getPlaceResults()` (line 387): returns `(result.result as { results: any[] }).results` ✅
- Cross-verified against backend `agent_tools.py:188-191`: `return {"location": ..., "category": ..., "results": results}` — keys match.

**Finding 2 — Itinerary-centric prompts (RESOLVED)**:

- New reactive `promptTripContext` (line 72): `collectionName || destination || ''` — prefers the collection name over a single destination.
- Guard changed from `{#if destination}` → `{#if promptTripContext}` (line 768) — buttons are now visible for named collections even without a single derived destination.
- Prompt strings use `across my ${promptTripContext} itinerary?` wording (lines 773, 783) — no longer implies a single location.
- No impact on packing tips (still `startDate && endDate` gated) or itinerary help (always shown).

**No introduced issues**: `promptTripContext` always resolves to a string; template interpolation is safe; existing tool result rendering and `sendMessage()` logic are unchanged beyond the key rename.

**SUGGESTIONS**: Minor indentation inconsistency between the `{#if promptTripContext}` block (lines 768–789) and the adjacent `{#if startDate}` block (lines 790–801) — cosmetic; `bun run format` should normalize.

**Reference**: See [Critic Gate](#critic-gate), [F2 Review](#f2-review), [decisions.md](../decisions.md#critic-gate-travel-agent-context--models-follow-up)

---

## F3 Test

- **Verdict**: PASS (Standard + Adversarial)
- **Date**: 2026-03-09
- **Tester**: tester agent

### Commands run

| # | Command | Exit code | Output summary |
|---|---|---|---|
| 1 | `bun run check` (frontend) | 0 | 0 errors, 6 warnings — all 6 pre-existing in `CollectionRecommendationView.svelte` + `RegionCard.svelte`; zero new issues from F3 changes |
| 2 | `bun run f3_test.mjs` (functional simulation) | 0 | 20 assertions: S1–S6 standard + A1–A6 adversarial + PTC1–PTC4 promptTripContext + prompt wording — ALL PASSED |

### Acceptance criteria verdict

| Criterion | Result | Evidence |
|---|---|---|
| `.places` → `.results` key fix in `hasPlaceResults()` | ✅ PASS | S1: `{results:[...]}` → true; S2: `{places:[...]}` → false (old key correctly rejected) |
| `.places` → `.results` key fix in `getPlaceResults()` | ✅ PASS | S1: returns 2-item array from `.results`; S2: returns `[]` on `.places` key |
| Old `.places` key no longer triggers card rendering | ✅ PASS | S2 regression guard: `hasPlaceResults({places:[...]})` → false |
| `promptTripContext` = `collectionName \|\| destination \|\| ''` | ✅ PASS | PTC1–PTC4: collectionName wins; falls back to destination; empty string when both absent |
| Quick-action guard is `{#if promptTripContext}` | ✅ PASS | Source inspection confirmed line 768 uses `promptTripContext` |
| Prompt wording is itinerary-centric | ✅ PASS | Both prompts contain `itinerary`; neither uses single-location "in X" wording |

### Adversarial attempts

| Hypothesis | Test design | Expected failure signal | Observed |
|---|---|---|---|
| `results` is a string, not array | `result: { results: 'not-array' }` | `Array.isArray` fails → false | false ✅ |
| `results` is null | `result: { results: null }` | `Array.isArray(null)` false | false ✅ |
| `result.result` is a number | `result: 42` | typeof guard rejects | false ✅ |
| `result.result` is a string | `result: 'str'` | typeof guard rejects | false ✅ |
| Both `.places` and `.results` present | both keys in result | Must use `.results` | `getPlaceResults` returns `.results` item ✅ |
| `results` is an object `{foo:'bar'}` | not an array | `Array.isArray` false | false ✅ |
| `promptTripContext` with empty collectionName string | `'' \|\| 'London' \|\| ''` | Should fall through to destination | 'London' ✅ |

### MUTATION_ESCAPES: 0/5

Mutation checks applied:

1. `result.result !== null` guard removed → S5 (null result) would crash `Array.isArray(null.results)` and be caught.
2. `Array.isArray(...)` replaced with truthy check → A1 (string results) test would catch.
3. `result.name === 'search_places'` removed → S4 (wrong tool name) would catch.
4. `.results` key swapped back to `.places` → S1 (standard payload) would return empty array, caught.
5. `collectionName || destination` order swapped → PTC1 test would return wrong value, caught.

All 5 mutations would be caught by existing assertions.

### LESSON_CHECKS

- `.places` vs `.results` key mismatch (F3a critical bug from discovery) — **confirmed fixed**: S1 passes with `.results`; S2 regression guard confirms `.places` no longer triggers card rendering.
- Pre-existing 6 svelte-check warnings — **confirmed**, no new warnings introduced.

---

## Completion Summary

- **Status**: ALL COMPLETE (F1 + F2 + F3)
- **Date**: 2026-03-09
- **All tasks**: Implemented, reviewed (APPROVED score 0), and tested (PASS standard + adversarial)
- **Zero regressions**: Frontend 0 errors / 6 pre-existing warnings; backend 24/30 pass (6 pre-existing failures)
- **Files changed**:
  - `backend/server/chat/views/__init__.py` — F1 (model list expansion) + F2 (itinerary stops context injection)
  - `backend/server/chat/llm_client.py` — F2 (system prompt trip-level guidance)
  - `frontend/src/routes/collections/[id]/+page.svelte` — F2 (multi-stop `deriveCollectionDestination`)
  - `frontend/src/lib/components/AITravelChat.svelte` — F3 (itinerary-centric prompts + `.results` key fix)
- **Knowledge recorded**: [knowledge.md](../knowledge.md#multi-stop-context-derivation-f2-follow-up) (multi-stop context, quick prompts, search_places key convention, opencode_zen model list)
- **Decisions recorded**: [decisions.md](../decisions.md#critic-gate-travel-agent-context--models-follow-up) (critic gate)
- **AGENTS.md updated**: Chat model override pattern (dropdown) + chat context pattern added

---

## Discovery: runtime failures (2026-03-09)

Explorer investigation of three user-trace errors against the complete scoped file set.

### Error 1 — "The model provider rate limit was reached"

**Exact origin**: `backend/server/chat/llm_client.py` **lines 128–132** (`_safe_error_payload`):

```python
if isinstance(exc, rate_limit_cls):
    return {
        "error": "The model provider rate limit was reached. Please wait and try again.",
        "error_category": "rate_limited",
    }
```

The user-trace text `"model provider rate limit was reached"` is a substring of this exact message. This is **not a bug** — it is the intended sanitized error surface for `litellm.exceptions.RateLimitError`. The error is raised by LiteLLM when the upstream provider (OpenAI, Anthropic, etc.) returns HTTP 429, and `_safe_error_payload()` converts it to this user-safe string. The SSE error payload is then propagated through `stream_chat_completion` (line 457) → `event_stream()` in `send_message` (line 256: `if data.get("error"): encountered_error = True; break`) → yielded to the frontend, where the SSE loop sets `assistantMsg.content = parsed.error` (line 307 of `AITravelChat.svelte`).

**Root cause of the rate limiting itself**: Most likely `openai/gpt-5-nano` as the `opencode_zen` default model, or the user's provider hitting quota. No code fix required — this is provider-side throttling surfaced correctly. However, if the `opencode_zen` provider is being mistakenly routed to OpenAI's public endpoint instead of `https://opencode.ai/zen/v1`, it would exhaust a real OpenAI key rather than Zen. See Risk 1 below.

**No auth/session issue involved** — the error path reaches LiteLLM, meaning auth already succeeded up to the LLM call.

---

### Error 2 — `{"error":"location is required"}`

**Exact origin**: `backend/server/chat/agent_tools.py` **line 128**:

```python
if not location_name:
    return {"error": "location is required"}
```

Triggered when LLM calls `search_places({})` or `search_places({"category": "food"})` with no `location` argument. This happens when the system prompt's trip context does not give the model a geocodable string — the model knows a "trip name" but not a city/country, so it calls `search_places` without a location.
|
||||
|
||||
**Current state (post-F2)**: The F2 fix injects `"Itinerary stops: Rome, Italy; ..."` into the system prompt from `collection.locations` **only when `collection_id` is supplied and resolves to an authorized collection**. If `collection_id` is missing from the frontend payload OR if the collection has locations with no `city`/`country` FK and no `location`/`name` fallback, the context_parts will still have only the `destination` string.
|
||||
|
||||
**Residual trigger path** (still reachable after F2):
|
||||
- `collection_id` not sent in `send_message` payload → collection never fetched → `context_parts` has only `Destination: <multi-stop string>` → LLM picks a trip-name string like "Italy 2025" as its location arg → `search_places(location="Italy 2025")` succeeds (geocoding finds "Italy") OR model sends `search_places({})` → error returned.
|
||||
- OR: `collection_id` IS sent, all locations have no `city`/`country` AND `location` field is blank AND `name` is not geocodable (e.g., `"Hotel California"`) → `itinerary_stops` list is empty → no `Itinerary stops:` line injected.
|
||||
|
||||
**Second remaining trigger**: `get_trip_details` fails (Collection.DoesNotExist or exception) → returns `{"error": "An unexpected error occurred while fetching trip details"}` → model falls back to calling `search_places` without a location derived from context.
|
||||
|
||||
---
|
||||
|
||||
### Error 3 — `{"error":"An unexpected error occurred while fetching trip details"}`

**Exact origin**: `backend/server/chat/agent_tools.py` **lines 394–396** (`get_trip_details`):

```python
except Exception:
    logger.exception("get_trip_details failed")
    return {"error": "An unexpected error occurred while fetching trip details"}
```

**Root cause — `get_trip_details` uses owner-only filter**: `agent_tools.py` **line 317**:

```python
collection = (
    Collection.objects.filter(user=user)
    ...
    .get(id=collection_id)
)
```

This uses `filter(user=user)` — **shared collections are excluded**. If the logged-in user is a shared member (not the owner) of the collection, `Collection.DoesNotExist` is raised, falls to the outer `except Exception`, and returns the generic error. However, `Collection.DoesNotExist` is caught specifically on **line 392** and returns `{"error": "Trip not found"}`, not the generic message. So the generic error can only come from a genuine Python exception inside the try block — most likely:

1. **`item.item` AttributeError** — `CollectionItineraryItem` uses a `GenericForeignKey`; if the referenced object has been deleted, `item.item` returns `None` and `getattr(None, "name", "")` would return `""` (safe, not an error) — so this is not the cause.
2. **`collection.itinerary_items` reverse relation** — if the `related_name="itinerary_items"` is not defined on `CollectionItineraryItem.collection` FK, the queryset call raises `AttributeError`. Checking `adventures/models.py` line 716: `related_name="itinerary_items"` is present — so this is not the cause.
3. **`collection.transportation_set` / `collection.lodging_set`** — if `Transportation` or `Lodging` doesn't have `related_name` defaulting to `transportation_set`/`lodging_set`, these would fail. This is the **most likely cause** — Django only auto-creates `_set` accessors with the model name in lowercase; `transportation_set` requires that the FK `related_name` is either set or left as default `transportation_set`. Need to verify model definition.
4. **`collection.start_date.isoformat()` on None** — guarded by `if collection.start_date` (line 347) — safe.

**Verified**: `Transportation.collection` (`models.py:332`) and `Lodging.collection` (`models.py:570`) are both ForeignKeys with **no `related_name`**, so Django auto-assigns `transportation_set` and `lodging_set` — the accessors used in `get_trip_details` lines 375/382 are correct. These do NOT cause the error.

**Actual culprit**: The `except Exception` at line 394 catches everything. Any unhandled exception inside the try block (e.g., a `prefetch_related("itinerary_items__content_type")` failure if a content_type row is missing, or a `date` field deserialization error on a malformed DB record) results in the generic error. Most commonly, the issue is the **shared-user access gap**: `Collection.objects.filter(user=user).get(id=...)` raises `Collection.DoesNotExist` for shared users, but that is caught by the specific handler at line 392 as `{"error": "Trip not found"}`, NOT the generic message. The generic message therefore indicates a true runtime Python exception somewhere inside the try body.
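The handler ordering is what makes this deduction sound; a small illustration (with a stand-in for `Collection.DoesNotExist`) of why the generic branch can only be reached by some other runtime exception:

```python
class DoesNotExist(Exception):
    """Stand-in for Collection.DoesNotExist."""

def classify(exc):
    # Mirrors get_trip_details: the specific handler runs first, so
    # DoesNotExist can never fall through to the generic message.
    try:
        raise exc
    except DoesNotExist:
        return {"error": "Trip not found"}
    except Exception:
        return {"error": "An unexpected error occurred while fetching trip details"}
```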

**Additionally**: the shared-collection access gap means `get_trip_details` returns `{"error": "Trip not found"}` (not the generic error) for shared users — this is a separate functional bug where shared users cannot use the AI tool on their shared trips.

---

### Authentication / CSRF in Chat Calls

**Verdict: Auth is working correctly for the SSE path. No auth failure in the reported errors.**

Evidence:
1. **Proxy path** (`frontend/src/routes/api/[...path]/+server.ts`):
   - `POST` to `send_message` goes through `handleRequest()` (line 16) with `requreTrailingSlash=true`.
   - On every proxied request: proxy deletes old `csrftoken` cookie, calls `fetchCSRFToken()` to get a fresh token from `GET /csrf/`, then sets `X-CSRFToken` header and reconstructs the `Cookie` header with `csrftoken=<new>; sessionid=<from-browser>` (lines 57–75).
   - SSE streaming: `content-type: text/event-stream` is detected (line 94) and the response body is streamed directly without buffering.
2. **Session**: `sessionid` cookie is extracted from browser cookies (line 66) and forwarded. `SESSION_COOKIE_SAMESITE=Lax` allows this.
3. **Rate-limit error is downstream of auth** — LiteLLM only fires if the Django view already authenticated the user and reached `stream_chat_completion`. A CSRF or session failure would return HTTP 403/401 before the SSE stream starts, and the frontend would hit the `if (!res.ok)` branch (line 273), not the SSE error path.

**One auth-adjacent gap**: `loadConversations()` (line 196) and `createConversation()` (line 203) do NOT include `credentials: 'include'` — but these go through the SvelteKit proxy which handles session injection server-side, so this is not a real failure point. The `send_message` fetch (line 258) also lacks explicit `credentials`, but again routes through the proxy.

**Potential auth issue — missing trailing slash for models endpoint**:
`loadModelsForProvider()` fetches `/api/chat/providers/${selectedProvider}/models/` (line 124) — this ends with `/` which is correct for the proxy's `requreTrailingSlash` logic. However, the proxy only adds a trailing slash for non-GET requests (it's applied to POST/PATCH/PUT/DELETE but not GET). Since `models/` is already in the URL, this is fine.

---

### Ranked Fixes by Impact

| Rank | Error | File | Line(s) | Fix |
|---|---|---|---|---|
| 1 (HIGH) | `get_trip_details` generic error | `backend/server/chat/agent_tools.py` | 316–325 | Add `\| Q(shared_with=user)` to collection filter so shared users can call the tool; also add specific catches for known exception types before the bare `except Exception` |
| 2 (HIGH) | `{"error":"location is required"}` residual | `backend/server/chat/views/__init__.py` | 152–164 | Ensure `collection_id` auth check also grants access for shared users (currently `shared_with.filter(id=request.user.id).exists()` IS present — ✅ already correct); verify `collection_id` is actually being sent from frontend on every `sendMessage` call |
| 2b (MEDIUM) | `search_places` called without location | `backend/server/chat/agent_tools.py` | 127–128 | Improve error message to be user-instructional: `"Please provide a city or location name to search near."` — already noted in prior plan; also add `location` as a `required` field in the JSON schema so LLM is more likely to provide it |
| 3 (MEDIUM) | `transportation_set`/`lodging_set` crash | `backend/server/chat/agent_tools.py` | 370–387 | Verify FK `related_name` values on Transportation/Lodging models; if wrong, correct the accessor names in `get_trip_details` |
| 4 (LOW) | Rate limiting | Provider config | N/A | No code fix — operational issue. Document that `opencode_zen` uses `https://opencode.ai/zen/v1` as `api_base` (already set in `CHAT_PROVIDER_CONFIG`) — ensure users aren't accidentally using a real OpenAI key with `opencode_zen` provider |

---

### Risks

1. **`get_trip_details` shared-user gap**: Shared users get `{"error": "Trip not found"}` — the LLM may then call `search_places` without the location context that `get_trip_details` would have provided, cascading into Error 2. Fix: add `| Q(shared_with=user)` to the collection filter at `agent_tools.py:317`.
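   The proposed filter change amounts to "owner or any shared member may fetch the trip". As plain logic, independent of the ORM (dict keys here are illustrative stand-ins for the `user`/`shared_with` model fields):

   ```python
   def can_access_collection(collection, user_id):
       # Equivalent of Collection.objects.filter(Q(user=user) | Q(shared_with=user)):
       # grant access to the owner or any shared member.
       return collection["user_id"] == user_id or user_id in collection["shared_with"]
   ```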

2. **`transportation_set`/`lodging_set` reverse accessor names confirmed safe**: Django auto-generates `transportation_set` and `lodging_set` for the FKs (no `related_name` on `Transportation.collection` at `models.py:332` or `Lodging.collection` at `models.py:570`). These accessors work correctly. The generic error in `get_trip_details` must be from another exception path (e.g., malformed DB records, missing ContentType rows for deleted itinerary items, or the `prefetch_related` interaction on orphaned GFK references).

3. **`collection_id` not forwarded on all sends**: If `AITravelChat.svelte` is embedded without `collectionId` prop (e.g., standalone chat page), `collection_id` is `undefined` in the payload, the backend never fetches the collection, and no `Itinerary stops:` context is injected. The LLM then has no geocodable location data → calls `search_places` without `location`.

4. **`search_places` JSON schema marks `location` as required but `execute_tool` uses `filtered_kwargs`**: The tool schema (`agent_tools.py:103`) sets `"required": True` on `location`. However, `execute_tool` (line 619) passes only `filtered_kwargs` from the JSON-parsed `arguments` dict. If LLM sends `{}` (empty), `location=None` is the function default, not a schema-enforcement error. There is no server-side validation of required tool arguments — the required flag is only advisory to the LLM.
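   A server-side guard in `execute_tool` could close this gap. A hedged sketch, assuming the schema is moved to the standard OpenAI tool shape (a `required` list under `parameters`, rather than a per-property flag):

   ```python
   def missing_required_args(tool_schema, arguments):
       # Compare LLM-supplied arguments against the tool's declared required
       # parameters before dispatch, instead of trusting the LLM to comply.
       params = tool_schema.get("parameters", {})
       required = params.get("required", [])
       return [name for name in required if arguments.get(name) in (None, "")]
   ```

   Dispatch would return a tool error (e.g., the user-instructional message from fix 2b) whenever this list is non-empty.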

**See [decisions.md](../decisions.md) for critic gate context.**

---

## Research: Provider Strategy (2026-03-09)

**Full findings**: [research/provider-strategy.md](../research/provider-strategy.md)

### Verdict: Keep LiteLLM, Harden It

Replacing LiteLLM is not warranted. Every Voyage issue is in the integration layer (no retries, no capability checks, hardcoded models), not in LiteLLM itself. OpenCode solves the same problem the same way: it bundles the Vercel AI SDK with ~20 `@ai-sdk/*` provider packages, the TypeScript analogue of LiteLLM.

### Architecture Options

| Option | Effort | Risk | Recommended? |
|---|---|---|---|
| **A. Keep LiteLLM, harden** (retry, tool-guard, metadata) | Low (1-2 sessions) | Low | ✅ YES |
| B. Hybrid: direct SDK for some providers | High (1-2 weeks) | High | No |
| C. Replace LiteLLM entirely | Very High (3-4 weeks) | Very High | No |
| D. LiteLLM Proxy sidecar | Medium (2-3 days) | Medium | Not yet — future multi-user |

### Immediate Code Fixes (4 items)

| # | Fix | File | Line(s) | Impact |
|---|---|---|---|---|
| 1 | Add `num_retries=2, request_timeout=60` to `litellm.acompletion()` | `llm_client.py` | 418 | Retry on rate-limit/timeout — biggest gap |
| 2 | Add `litellm.supports_function_calling(model=)` guard before passing tools | `llm_client.py` | ~397 | Prevents tool-call errors on incapable models |
| 3 | Return model objects with `supports_tools` metadata instead of bare strings | `views/__init__.py` | `models()` action | Frontend can warn/adapt per model capability |
| 4 | Replace hardcoded `model="gpt-4o-mini"` with provider config default | `day_suggestions.py` | 194 | Respects user's configured provider |
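Fixes 1 and 2 combine naturally into one kwargs builder. A sketch under assumptions: `build_completion_kwargs` is a hypothetical helper, and in real code the `supports_tools` flag would come from `litellm.supports_function_calling(model)`:

```python
def build_completion_kwargs(model, messages, api_key, api_base=None,
                            tools=None, supports_tools=True):
    kwargs = {
        "model": model,
        "messages": messages,
        "stream": True,
        "api_key": api_key,
        "num_retries": 2,  # retry transient rate-limit/timeout failures
        "timeout": 60,     # per-request timeout in seconds
    }
    if api_base:
        kwargs["api_base"] = api_base
    if tools and supports_tools:
        # Only send tools (and tool_choice) to models that can use them.
        kwargs["tools"] = tools
        kwargs["tool_choice"] = "auto"
    return kwargs
```

The resulting dict would be splatted into `litellm.acompletion(**kwargs)`.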

### Long-Term Recommendations

1. **Curated model registry** (YAML/JSON file like OpenCode's `models.dev`) with capabilities, costs, context limits — loaded at startup
2. **LiteLLM Proxy sidecar** — only if/when Voyage gains multi-user production deployment
3. **WSGI→ASGI migration** — long-term fix for event loop fragility (out of scope)

### Key Patterns Observed in Other Projects

- **No production project does universal runtime model discovery** — all use curated/admin-managed lists
- **Every production LiteLLM user has retry logic** — Voyage is the outlier with zero retries
- **Tool-call capability guards** are standard (`litellm.supports_function_calling()` used by PraisonAI, open-interpreter, mem0, ragbits, dspy)
- **Rate-limit resilience** ranges from simple `num_retries` to full `litellm.Router` with `RetryPolicy` and cross-model fallbacks
0
.memory/research/.gitkeep
Normal file
130
.memory/research/auto-learn-preference-signals.md
Normal file
@@ -0,0 +1,130 @@
# Research: Auto-Learn User Preference Signals

## Purpose
Map all existing user data that could be aggregated into an automatic preference profile, without requiring manual input.

## Signal Inventory

### 1. Location.category (FK → Category)
- **Model**: `adventures/models.py:Category` — per-user custom categories (name, display_name, icon)
- **Signal**: Top categories by count → dominant interest type (e.g. "hiking", "dining", "cultural")
- **Query**: `Location.objects.filter(user=user).values('category__name').annotate(cnt=Count('id')).order_by('-cnt')`
- **Strength**: HIGH — user-created categories are deliberate choices

### 2. Location.tags (ArrayField)
- **Model**: `adventures/models.py:Location.tags` — `ArrayField(CharField(max_length=100))`
- **Signal**: Most frequent tags across all user locations → interest keywords
- **Query**: `Location.objects.filter(user=user).values_list('tags', flat=True).distinct()` (used in `tags_view.py`)
- **Strength**: MEDIUM-HIGH — tags are free-text user input

### 3. Location.rating (FloatField)
- **Model**: `adventures/models.py:Location.rating`
- **Signal**: Average rating + high-rated locations → positive sentiment for place types; filtering for visited + high-rated → strong preferences
- **Query**: `Location.objects.filter(user=user).aggregate(avg_rating=Avg('rating'))` or breakdown by category
- **Strength**: HIGH for positive signals (≥4.0); weak if rarely filled in

### 4. Location.description / Visit.notes (TextField)
- **Model**: `adventures/models.py:Location.description`, `Visit.notes`
- **Signal**: Free-text content for NLP keyword extraction (budget, adventure, luxury, cuisine words)
- **Query**: `Location.objects.filter(user=user).values_list('description', flat=True)`
- **Strength**: LOW (requires NLP to extract structured signals; many fields blank)

### 5. Lodging.type (LODGING_TYPES enum)
- **Model**: `adventures/models.py:Lodging.type` — choices: hotel, hostel, resort, bnb, campground, cabin, apartment, house, villa, motel
- **Signal**: Most frequently used lodging type → travel style indicator (e.g. "hostel" → budget; "resort/villa" → luxury; "campground/cabin" → outdoor)
- **Query**: `Lodging.objects.filter(user=user).values('type').annotate(cnt=Count('id')).order_by('-cnt')`
- **Strength**: HIGH — directly maps to trip_style field

### 6. Lodging.rating (FloatField)
- **Signal**: Combined with lodging type, identifies preferred accommodation standards
- **Strength**: MEDIUM

### 7. Transportation.type (TRANSPORTATION_TYPES enum)
- **Model**: `adventures/models.py:Transportation.type` — choices: car, plane, train, bus, boat, bike, walking
- **Signal**: Primary transport mode → mobility preference (e.g. mostly walking/bike → slow travel; lots of planes → frequent flyer)
- **Query**: `Transportation.objects.filter(user=user).values('type').annotate(cnt=Count('id')).order_by('-cnt')`
- **Strength**: MEDIUM

### 8. Activity.sport_type (SPORT_TYPE_CHOICES)
- **Model**: `adventures/models.py:Activity.sport_type` — 60+ choices mapped to 10 SPORT_CATEGORIES in `utils/sports_types.py`
- **Signal**: Activity categories user is active in → physical/adventure interests
- **Categories**: running, walking_hiking, cycling, water_sports, winter_sports, fitness_gym, racket_sports, climbing_adventure, team_sports
- **Query**: Already aggregated in `stats_view.py:_get_activity_stats_by_category()` — uses `Activity.objects.filter(user=user).values('sport_type').annotate(count=Count('id'))`
- **Strength**: HIGH — objective behavioral data from Strava/Wanderer imports

### 9. VisitedRegion / VisitedCity (worldtravel)
- **Model**: `worldtravel/models.py` — `VisitedRegion(user, region)` and `VisitedCity(user, city)` with country/subregion
- **Signal**: Countries/regions visited → geographic preferences (beach vs. mountain vs. city; EU vs. Asia etc.)
- **Query**: `VisitedRegion.objects.filter(user=user).select_related('region__country')` → country distribution
- **Strength**: MEDIUM-HIGH — "where has this user historically traveled?" informs destination type

### 10. Collection metadata
- **Model**: `adventures/models.py:Collection` — name, description, start/end dates
- **Signal**: Collection names/descriptions may contain destination/theme hints; trip duration (end_date − start_date) → travel pace; trip frequency (count, spacing) → travel cadence
- **Query**: `Collection.objects.filter(user=user).values('name', 'description', 'start_date', 'end_date')`
- **Strength**: LOW-MEDIUM (descriptions often blank; names are free-text)

### 11. Location.price / Lodging.price (MoneyField)
- **Signal**: Average spend across locations/lodging → budget tier
- **Query**: `Location.objects.filter(user=user).aggregate(avg_price=Avg('price'))` (requires djmoney amount field)
- **Strength**: MEDIUM — but many records may have no price set

### 12. Location geographic clustering (lat/lon)
- **Signal**: Country/region distribution of visited locations → geographic affinity
- **Already tracked**: `Location.country`, `Location.region`, `Location.city` (FK, auto-geocoded)
- **Query**: `Location.objects.filter(user=user).values('country__name').annotate(cnt=Count('id')).order_by('-cnt')`
- **Strength**: HIGH

### 13. UserAchievement types
- **Model**: `achievements/models.py:UserAchievement` — types: `adventure_count`, `country_count`
- **Signal**: Milestone count → engagement level (casual vs. power user); high `country_count` → variety-seeker
- **Strength**: LOW-MEDIUM (only 2 types currently)

### 14. ChatMessage content (user role)
- **Model**: `chat/models.py:ChatMessage` — `role`, `content`
- **Signal**: User messages in travel conversations → intent signals ("I love hiking", "looking for cheap food", "family-friendly")
- **Query**: `ChatMessage.objects.filter(conversation__user=user, role='user').values_list('content', flat=True)`
- **Strength**: MEDIUM — requires NLP; could be rich but noisy

## Aggregation Patterns Already in Codebase

| Pattern | Location | Reusability |
|---|---|---|
| Activity stats by category | `stats_view.py:_get_activity_stats_by_category()` | Direct reuse |
| All-tags union | `tags_view.py:ActivityTypesView.types()` | Direct reuse |
| VisitedRegion/City counts | `stats_view.py:counts()` | Direct reuse |
| Multi-user preference merge | `llm_client.py:get_aggregated_preferences()` | Partial reuse |
| Category-filtered location count | `serializers.py:location_count` | Pattern reference |
| Location queryset scoping | `location_view.py:get_queryset()` | Standard pattern |

## Proposed Auto-Profile Fields from Signals

| Target Field | Primary Signals | Secondary Signals |
|---|---|---|
| `cuisines` | Location.tags (cuisine words), Location.category (dining) | Location.description NLP |
| `interests` | Activity.sport_type categories, Location.category top-N | Location.tags frequency, VisitedRegion types |
| `trip_style` | Lodging.type top (luxury/budget/outdoor), Transportation.type, Activity sport categories | Location.rating Avg, price signals |
| `notes` | (not auto-derived — keep manual only) | — |

## Where to Implement

**New function target**: `integrations/views/recommendation_profile_view.py` or a new `integrations/utils/auto_profile.py`

**Suggested function signature**:
```python
def build_auto_preference_profile(user) -> dict:
    """
    Returns {cuisines, interests, trip_style} inferred from user's travel history.
    Fields are non-destructive suggestions, not overrides of manual input.
    """
```
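Once the rows are fetched, the inference step is mostly counting. A pure-Python sketch of that step, taking pre-fetched flat lists instead of a `user` object so it is self-contained; the lodging-to-style mapping and the top-N thresholds are illustrative assumptions, not decided design:

```python
from collections import Counter

# Illustrative mapping from dominant lodging type to a trip-style label.
LODGING_STYLE = {"hostel": "budget", "motel": "budget", "campground": "outdoor",
                 "cabin": "outdoor", "resort": "luxury", "villa": "luxury"}

def build_auto_preference_profile(tags, category_names, lodging_types):
    """Infer {cuisines, interests, trip_style} from pre-fetched flat lists."""
    def top(items, n):
        # Most frequent values first, ties broken by first appearance.
        return [value for value, _ in Counter(items).most_common(n)]

    dominant = top(lodging_types, 1)
    trip_style = LODGING_STYLE.get(dominant[0], "balanced") if dominant else "balanced"
    return {
        "cuisines": top(tags, 3),  # real version would filter for cuisine words
        "interests": top(category_names, 5),
        "trip_style": trip_style,
    }
```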

**New API endpoint target**: `POST /api/integrations/recommendation-preferences/auto-learn/`
**ViewSet action**: `@action(detail=False, methods=['post'], url_path='auto-learn')` on `UserRecommendationPreferenceProfileViewSet`

## Integration Point
`get_system_prompt()` in `chat/llm_client.py` already consumes `UserRecommendationPreferenceProfile` — auto-learned values flow directly into AI context with zero additional changes needed there.

See: [knowledge.md — User Recommendation Preference Profile](../knowledge.md#user-recommendation-preference-profile)
See: [plans/ai-travel-agent-redesign.md — WS2](../plans/ai-travel-agent-redesign.md#ws2-user-preference-learning)
35
.memory/research/litellm-zen-provider-catalog.md
Normal file
@@ -0,0 +1,35 @@
# Research: LiteLLM provider catalog and OpenCode Zen support

Date: 2026-03-08
Related plan: [AI travel agent in Collections Recommendations](../plans/ai-travel-agent-collections-integration.md)

## LiteLLM provider enumeration
- Runtime provider list is available via `litellm.provider_list` and currently returns 128 provider IDs in this environment.
- The enum source `LlmProviders` can be used for canonical provider identifiers.

## OpenCode Zen compatibility
- OpenCode Zen is **not** a native LiteLLM provider alias.
- Zen can be supported via LiteLLM's OpenAI-compatible routing using:
  - provider id in app: `opencode_zen`
  - model namespace: `openai/<zen-model>`
  - `api_base`: `https://opencode.ai/zen/v1`
- No new SDK dependency required.

## Recommended backend contract
- Add backend source-of-truth endpoint: `GET /api/chat/providers/`.
- Response fields:
  - `id`
  - `label`
  - `available_for_chat`
  - `needs_api_key`
  - `default_model`
  - `api_base`
- Return all LiteLLM runtime providers; mark non-mapped providers `available_for_chat=false` for display-only compliance.
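
An illustrative response body for the proposed endpoint (values are examples of the contract shape, not confirmed output):

```json
[
  {
    "id": "opencode_zen",
    "label": "OpenCode Zen",
    "available_for_chat": true,
    "needs_api_key": true,
    "default_model": "openai/gpt-5-nano",
    "api_base": "https://opencode.ai/zen/v1"
  },
  {
    "id": "bedrock",
    "label": "Bedrock",
    "available_for_chat": false,
    "needs_api_key": true,
    "default_model": null,
    "api_base": null
  }
]
```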

## Data/storage compatibility notes
- Existing `UserAPIKey(provider)` model supports adding `opencode_zen` without migration.
- Consistent provider ID usage across serializer validation, key lookup, and chat request payload is required.

## Risks
- Zen model names may evolve; keep default model configurable in backend mapping.
- Full provider list is large; UI should communicate unavailable-for-chat providers clearly.
303
.memory/research/opencode-zen-connection-debug.md
Normal file
@@ -0,0 +1,303 @@
# OpenCode Zen Connection Debug — Research Findings

**Date**: 2026-03-08
**Researchers**: researcher agent (root cause), explorer agent (code path trace)
**Status**: Complete — root causes identified, fix proposed

## Summary

The OpenCode Zen provider configuration in `backend/server/chat/llm_client.py` has **two critical mismatches** that cause connection/API errors:

1. **Invalid model ID**: `gpt-4o-mini` does not exist on OpenCode Zen
2. **Wrong endpoint for GPT models**: GPT models on Zen use `/responses` endpoint, not `/chat/completions`

An additional structural risk is that the backend runs under **Gunicorn WSGI** (not ASGI/uvicorn), but `stream_chat_completion` is an `async def` generator that is driven via `_async_to_sync_generator`, which creates a new event loop per call. This works but causes every tool iteration to open/close an event loop, which is inefficient and fragile under load.

## End-to-End Request Path

### 1. Frontend: `AITravelChat.svelte` → `sendMessage()`
- **File**: `frontend/src/lib/components/AITravelChat.svelte`, line 97
- POST body: `{ message: <text>, provider: selectedProvider }` (e.g. `"opencode_zen"`)
- Sends to: `POST /api/chat/conversations/<id>/send_message/`
- On `fetch` network failure: shows `$t('chat.connection_error')` = `"Connection error. Please try again."` (line 191)
- On HTTP error: tries `res.json()` → uses `err.error || $t('chat.connection_error')` (line 126)
- On SSE `parsed.error`: shows `parsed.error` inline in the chat (line 158)
- **Any exception from `litellm` is therefore masked as `"An error occurred while processing your request."` or `"Connection error. Please try again."`**

### 2. Proxy: `frontend/src/routes/api/[...path]/+server.ts` → `handleRequest()`
- Strips and re-generates CSRF token (lines 57-60)
- POSTs to `http://server:8000/api/chat/conversations/<id>/send_message/`
- Detects `content-type: text/event-stream` and streams body directly through (lines 94-98) — **no buffering**
- On any fetch error: returns `{ error: 'Internal Server Error' }` (line 109)

### 3. Backend: `chat/views.py` → `ChatViewSet.send_message()`
- Validates provider via `is_chat_provider_available()` (line 114) — passes for `opencode_zen`
- Saves user message to DB (line 120)
- Builds LLM messages list (line 131)
- Wraps `async event_stream()` in `_async_to_sync_generator()` (line 269)
- Returns `StreamingHttpResponse` with `text/event-stream` content type (line 268)

### 4. Backend: `chat/llm_client.py` → `stream_chat_completion()`
- Normalizes provider (line 208)
- Looks up `CHAT_PROVIDER_CONFIG["opencode_zen"]` (line 209)
- Fetches API key from `UserAPIKey.objects.get(user=user, provider="opencode_zen")` (line 154)
- Decrypts it via Fernet using `FIELD_ENCRYPTION_KEY` (line 102)
- Calls `litellm.acompletion(model="openai/gpt-4o-mini", api_key=<key>, api_base="https://opencode.ai/zen/v1", stream=True, tools=AGENT_TOOLS, tool_choice="auto")` (line 237)
- On **any exception**: logs and yields `data: {"error": "An error occurred..."}` (lines 274-276)

## Root Cause Analysis

### #1 CRITICAL: Invalid default model `gpt-4o-mini`
- **Location**: `backend/server/chat/llm_client.py:62`
- `CHAT_PROVIDER_CONFIG["opencode_zen"]["default_model"] = "openai/gpt-4o-mini"`
- `gpt-4o-mini` is an OpenAI-hosted model. The OpenCode Zen gateway at `https://opencode.ai/zen/v1` does not offer `gpt-4o-mini`.
- LiteLLM sends: `POST https://opencode.ai/zen/v1/chat/completions` with `model: gpt-4o-mini`
- Zen API returns HTTP 4xx (model not found or not available)
- Exception is caught generically at line 274 → yields masked error SSE → frontend shows generic message

### #2 SIGNIFICANT: Generic exception handler masks real errors
- **Location**: `backend/server/chat/llm_client.py:274-276`
- Bare `except Exception:` with logger.exception and a generic user message
- LiteLLM exceptions carry structured information: `litellm.exceptions.NotFoundError`, `AuthenticationError`, `BadRequestError`, etc.
- All of these show up to the user as `"An error occurred while processing your request. Please try again."`
- Prevents diagnosis without checking Docker logs

### #3 SIGNIFICANT: WSGI + async event loop per request
- **Location**: `backend/server/chat/views.py:66-76` (`_async_to_sync_generator`)
- Backend runs **Gunicorn WSGI** (from `supervisord.conf:11`); there is **no ASGI entry point** (`asgi.py` doesn't exist)
- `stream_chat_completion` is `async def` using `litellm.acompletion` (awaited)
- `_async_to_sync_generator` creates a fresh event loop via `asyncio.new_event_loop()` for each request
- For multi-tool-iteration responses this loop drives multiple sequential `await` calls
- This works but is fragile: if `litellm.acompletion` internally uses a singleton HTTP client that belongs to a different event loop, it will raise `RuntimeError: This event loop is already running` or connection errors on subsequent calls
- **httpx/aiohttp sessions in LiteLLM may not be compatible with per-call new event loops**
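
For reference, the per-call bridge pattern looks roughly like this (a sketch of the approach, not the project's exact code):

```python
import asyncio

def async_to_sync_generator(agen):
    # Drive an async generator from sync WSGI code by creating a dedicated
    # event loop for this call. This is what makes streaming work without
    # ASGI, and also why each request pays loop setup/teardown and why
    # shared HTTP clients bound to a different loop can break.
    loop = asyncio.new_event_loop()
    try:
        while True:
            try:
                yield loop.run_until_complete(agen.__anext__())
            except StopAsyncIteration:
                break
    finally:
        loop.run_until_complete(agen.aclose())
        loop.close()
```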

### #4 MINOR: `tool_choice: "auto"` sent unconditionally with tools
- **Location**: `backend/server/chat/llm_client.py:229`
- `"tool_choice": "auto" if tools else None` — None values in kwargs are passed to litellm
- Some OpenAI-compat endpoints (including potentially Zen models) reject `tool_choice: null` or unsupported parameters
- Fix: remove key entirely instead of setting to None

### #5 MINOR: API key lookup is synchronous in async context
- **Location**: `backend/server/chat/llm_client.py:217` and `views.py:144`
- `get_llm_api_key` calls `UserAPIKey.objects.get(...)` synchronously
- Called from within `async for chunk in stream_chat_completion(...)` in the async `event_stream()` generator
- Django ORM operations must use `sync_to_async` in async contexts; direct sync ORM calls can cause `SynchronousOnlyOperation` errors or deadlocks under ASGI
- Under WSGI+new-event-loop approach this is less likely to fail but is technically incorrect
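
The standard remedy is to push the blocking lookup onto a worker thread. A minimal sketch using stdlib `asyncio.to_thread` (real Django code would use `asgiref.sync.sync_to_async` for its thread-sensitivity guarantees; `get_api_key_sync` here is a stand-in for the ORM call):

```python
import asyncio

def get_api_key_sync(user_id):
    # Stand-in for UserAPIKey.objects.get(...).decrypt(); any blocking call.
    return f"key-{user_id}"

async def get_api_key(user_id):
    # Run the blocking lookup in a worker thread so the event loop
    # driving the SSE stream is never blocked.
    return await asyncio.to_thread(get_api_key_sync, user_id)
```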
|
||||
|
||||
## Recommended Fix (Ranked by Impact)
|
||||
|
||||
### Fix #1 (Primary): Correct the default model

```python
# backend/server/chat/llm_client.py:59-64
"opencode_zen": {
    "label": "OpenCode Zen",
    "needs_api_key": True,
    "default_model": "openai/gpt-5-nano",  # Free; confirmed to work via /chat/completions
    "api_base": "https://opencode.ai/zen/v1",
},
```

Confirmed working models (use `/chat/completions`, OpenAI-compat):

- `openai/gpt-5-nano` (free)
- `openai/kimi-k2.5` (confirmed by GitHub usage)
- `openai/glm-5` (GLM family)
- `openai/big-pickle` (free)

GPT-family models route through the `/responses` endpoint on Zen, which LiteLLM's openai-compat mode does NOT use — only the "OpenAI-compatible" models above reliably work on Zen with LiteLLM's `openai/` prefix + `/chat/completions`.

### Fix #2 (Secondary): Structured error surfacing

```python
# backend/server/chat/llm_client.py:274-276
except Exception as exc:
    logger.exception("LLM streaming error")
    # Extract structured detail if available
    status_code = getattr(exc, 'status_code', None)
    detail = getattr(exc, 'message', None) or str(exc)
    user_msg = (
        f"Provider error ({status_code}): {detail}"
        if status_code
        else "An error occurred while processing your request. Please try again."
    )
    yield f"data: {json.dumps({'error': user_msg})}\n\n"
```

### Fix #3 (Minor): Remove None from the tool_choice kwarg

```python
# backend/server/chat/llm_client.py:225-234
completion_kwargs = {
    "model": provider_config["default_model"],
    "messages": messages,
    "stream": True,
    "api_key": api_key,
}
if tools:
    completion_kwargs["tools"] = tools
    completion_kwargs["tool_choice"] = "auto"
if provider_config["api_base"]:
    completion_kwargs["api_base"] = provider_config["api_base"]
```

## Error Flow Diagram

```
User sends message (opencode_zen)
  → AITravelChat.svelte:sendMessage()
  → POST /api/chat/conversations/<id>/send_message/
  → +server.ts:handleRequest() [proxy, no mutation]
  → POST http://server:8000/api/chat/conversations/<id>/send_message/
  → views.py:ChatViewSet.send_message()
  → llm_client.py:stream_chat_completion()
  → litellm.acompletion(model="openai/gpt-4o-mini",   ← FAILS HERE
                        api_base="https://opencode.ai/zen/v1")
  → except Exception → yield data:{"error":"An error occurred..."}
  ← SSE: data:{"error":"An error occurred..."}
  ← StreamingHttpResponse(text/event-stream)
  ← streamed through +server.ts (SSE passthrough, no mutation)
  ← reader.read() → parsed.error set
  ← assistantMsg.content = "An error occurred..."   ← shown to user
```

If the network/DNS fails entirely (e.g. `https://opencode.ai` unreachable):

```
→ litellm.acompletion raises immediately
→ except Exception → yield data:{"error":"An error occurred..."}
— OR —
→ +server.ts fetch fails → json({error:"Internal Server Error"}, 500)
→ AITravelChat.svelte res.ok is false → res.json() → err.error || $t('chat.connection_error')
→ shows "Connection error. Please try again."
```

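Both flows emit the same frame shape, `data: <json>\n\n`. The frontend parses this in TypeScript; a generic parser sketch (illustrative only, not the Svelte code) shows how little structure is involved:

```python
# Generic sketch of parsing the SSE frames shown in the diagrams above.
# Each event is a "data: <json>" line terminated by a blank line.
import json

def parse_sse_events(raw: str) -> list[dict]:
    """Extract JSON payloads from a text/event-stream body."""
    events = []
    for block in raw.split("\n\n"):
        for line in block.splitlines():
            if line.startswith("data: "):
                events.append(json.loads(line[len("data: "):]))
    return events
```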
## File References

| File | Line(s) | Relevance |
|---|---|---|
| `backend/server/chat/llm_client.py` | 59-64 | `CHAT_PROVIDER_CONFIG["opencode_zen"]` — primary fix |
| `backend/server/chat/llm_client.py` | 150-157 | `get_llm_api_key()` — DB lookup for stored key |
| `backend/server/chat/llm_client.py` | 203-276 | `stream_chat_completion()` — full LiteLLM call + error handler |
| `backend/server/chat/llm_client.py` | 225-234 | `completion_kwargs` construction |
| `backend/server/chat/llm_client.py` | 274-276 | Generic `except Exception` (swallows all errors) |
| `backend/server/chat/views.py` | 103-274 | `send_message()` — SSE pipeline orchestration |
| `backend/server/chat/views.py` | 66-76 | `_async_to_sync_generator()` — WSGI/async bridge |
| `backend/server/integrations/models.py` | 78-112 | `UserAPIKey` — encrypted key storage |
| `frontend/src/lib/components/AITravelChat.svelte` | 97-195 | `sendMessage()` — SSE consumer + error display |
| `frontend/src/lib/components/AITravelChat.svelte` | 124-129 | HTTP error → `$t('chat.connection_error')` |
| `frontend/src/lib/components/AITravelChat.svelte` | 157-160 | SSE `parsed.error` → inline display |
| `frontend/src/lib/components/AITravelChat.svelte` | 190-192 | Outer catch → `$t('chat.connection_error')` |
| `frontend/src/routes/api/[...path]/+server.ts` | 34-110 | `handleRequest()` — proxy |
| `frontend/src/routes/api/[...path]/+server.ts` | 94-98 | SSE passthrough (no mutation) |
| `frontend/src/locales/en.json` | 46 | `chat.connection_error` = "Connection error. Please try again." |
| `backend/supervisord.conf` | 11 | Gunicorn WSGI startup (no ASGI) |

---

## Model Selection Implementation Map

**Date**: 2026-03-08

### Frontend Provider/Model Selection State (Current)

In `AITravelChat.svelte`:

- `selectedProvider` (line 29): `let selectedProvider = 'openai'` — bare string, no model tracking
- `providerCatalog` (line 30): `ChatProviderCatalogEntry[]` — already contains `default_model: string | null` per entry
- `chatProviders` (line 31): reactive filtered view of `providerCatalog` (available providers only)
- `loadProviderCatalog()` (line 37): populates the catalog from `GET /api/chat/providers/`
- `sendMessage()` (line 97): POST body at line 121 is `{ message: msgText, provider: selectedProvider }` — **no model field**
- Provider `<select>` (lines 290–298): in the top toolbar of the chat panel

### Request Payload Build Point

`AITravelChat.svelte`, lines 118–122:

```ts
const res = await fetch(`/api/chat/conversations/${conversation.id}/send_message/`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message: msgText, provider: selectedProvider }) // ← ADD model here
});
```

### Backend Request Intake Point

`chat/views.py`, `send_message()` (line 104):

- Line 113: `provider = (request.data.get("provider") or "openai").strip().lower()`
- Line 144: `stream_chat_completion(request.user, current_messages, provider, tools=AGENT_TOOLS)`
- **No model extraction**; the model comes only from `CHAT_PROVIDER_CONFIG[provider]["default_model"]`

### Backend Model Usage Point

`chat/llm_client.py`, `stream_chat_completion()` (line 203):

- Lines 225–226: `completion_kwargs = { "model": provider_config["default_model"], ... }`
- This is the **sole place the model is resolved** — no override capability exists yet

### Persistence Options Analysis

| Option | Files changed | Migration? | Risk |
|---|---|---|---|
| **`localStorage` (recommended)** | `AITravelChat.svelte` only | No | Lowest: no backend, no schema |
| `CustomUser` field (`chat_model_prefs` JSONField) | `users/models.py`, `users/serializers.py`, `users/views.py`, migration | **Yes** | Medium: schema change, serializer exposure |
| `UserAPIKey`-style model-prefs table | new `chat/models.py` + serializer + view + urls + migration | **Yes** | High: new endpoint, multi-file |
| `UserRecommendationPreferenceProfile` JSONField addition | `integrations/models.py`, serializer, migration | **Yes** | Medium: migration on integrations app |

**Selected**: `localStorage` — key `voyage_chat_model_prefs`, value `Record<provider_id, model_string>`.

### File-by-File Edit Plan

#### 1. `backend/server/chat/llm_client.py`

| Symbol | Change |
|---|---|
| `stream_chat_completion(user, messages, provider, tools=None)` | Add `model: str \| None = None` parameter |
| `completion_kwargs["model"]` (line 226) | Change to `model or provider_config["default_model"]` |
| (new) validation | If `model` is provided: assert it starts with the expected LiteLLM prefix or raise an SSE error |

#### 2. `backend/server/chat/views.py`

| Symbol | Change |
|---|---|
| `send_message()` (line 104) | Extract `model = (request.data.get("model") or "").strip() or None` |
| `stream_chat_completion(...)` call (line 144) | Pass `model=model` |
| (optional validation) | Return 400 if the model prefix doesn't match the provider |

#### 3. `frontend/src/lib/components/AITravelChat.svelte`

| Symbol | Change |
|---|---|
| (new) `let selectedModel: string` | Initialize from `loadModelPref(selectedProvider)` or `default_model` |
| (new) `$: selectedProviderEntry` | Reactive lookup of the current provider's catalog entry |
| (new) `$: selectedModel` reset | Reset on provider change; persist with `saveModelPref` |
| `sendMessage()` body (line 121) | Add `model: selectedModel || undefined` to the JSON body |
| (new) model `<input>` in toolbar | Placed after the provider `<select>`, `bind:value={selectedModel}`, placeholder = `default_model` |
| (new) `loadModelPref(provider)` | Read from `localStorage.getItem('voyage_chat_model_prefs')` |
| (new) `saveModelPref(provider, model)` | Write to `localStorage.setItem('voyage_chat_model_prefs', ...)` |

#### 4. `frontend/src/locales/en.json`

| Key | Value |
|---|---|
| `chat.model_label` | `"Model"` |
| `chat.model_placeholder` | `"Default model"` |

### Provider-Model Compatibility Validation

The critical constraint is **LiteLLM model-string routing**. LiteLLM uses the `provider/model-name` prefix to determine which SDK client to use:

- `openai/gpt-5-nano` → OpenAI client (with a custom `api_base` for Zen)
- `anthropic/claude-sonnet-4-20250514` → Anthropic client
- `groq/llama-3.3-70b-versatile` → Groq client

If a user types `anthropic/claude-opus` for the `openai` provider, LiteLLM uses the Anthropic SDK with OpenAI credentials → guaranteed failure.

**Recommended backend guard** in `send_message()`:

```python
if model:
    expected_prefix = provider_config["default_model"].split("/")[0]
    if not model.startswith(expected_prefix + "/"):
        return Response(
            {"error": f"Model must use '{expected_prefix}/' prefix for provider '{provider}'."},
            status=status.HTTP_400_BAD_REQUEST,
        )
```

Exception: `opencode_zen` and `openrouter` accept any prefix (they're routing gateways). The guard should skip the prefix check when `api_base` is set (custom gateway).

### Migration Requirement

**NO migration required** for the recommended localStorage approach.

---

## Cross-references

- See [Plan: OpenCode Zen connection error](../plans/opencode-zen-connection-error.md)
- See [Research: LiteLLM provider catalog](litellm-zen-provider-catalog.md)
- See [Knowledge: AI Chat](../knowledge.md#ai-chat-collections--recommendations)

198 .memory/research/provider-strategy.md Normal file
@@ -0,0 +1,198 @@

# Research: Multi-Provider Strategy for Voyage AI Chat

**Date**: 2026-03-09
**Researcher**: researcher agent
**Status**: Complete

## Summary

Investigated how OpenCode, OpenClaw-like projects, and LiteLLM-based production systems handle multi-provider model discovery, auth, rate-limit resilience, and tool-calling compatibility. Assessed whether replacing LiteLLM is warranted for Voyage.

**Bottom line**: Keep LiteLLM and harden it. Replacing LiteLLM would be a multi-week migration with negligible user-facing benefit. LiteLLM already solves the hard problems (100+ provider SDKs, streaming, tool-call translation). Voyage's issues are in the **integration layer**, not in LiteLLM itself.

---

## 1. Pattern Analysis: How Projects Handle Multi-Provider

### 1a. Dynamic Model Discovery

| Project | Approach | Notes |
|---|---|---|
| **OpenCode** | Static registry from `models.dev` (JSON database), merged with user config, filtered by env/auth presence | No runtime API calls to providers for discovery; curated model metadata (capabilities, cost, limits) baked in |
| **Ragflow** | Hardcoded `SupportedLiteLLMProvider` enum + per-provider model lists | Similar to Voyage's current approach |
| **daily_stock_analysis** | `litellm.Router` model_list config + `fallback_models` list from config file | Runtime fallback, not runtime discovery |
| **Onyx** | `LLMProvider` DB model + admin UI for model configuration | DB-backed, admin-managed |
| **LiteLLM Proxy** | YAML config `model_list` with deployment-level params | Static config, hot-reloadable |
| **Voyage (current)** | `CHAT_PROVIDER_CONFIG` dict + hardcoded `models()` per provider + OpenAI API `client.models.list()` for OpenAI only | Mixed: one provider does live discovery, the rest are hardcoded |

**Key insight**: No production project does universal runtime model discovery across all providers. OpenCode — the most sophisticated — uses a curated static database (`models.dev`) with provider/model metadata including capability flags (`toolcall`, `reasoning`, `streaming`). This is the right pattern for Voyage.

### 1b. Provider Auth Handling

| Project | Approach |
|---|---|
| **OpenCode** | Multi-source: env vars → `Auth.get()` (stored credentials) → config file → plugin loaders; per-provider custom auth (AWS chains, Google ADC, OAuth) |
| **LiteLLM Router** | `api_key` per deployment in model_list; env var fallback |
| **Cognee** | Rate limiter context manager wrapping LiteLLM calls |
| **Voyage (current)** | Per-user encrypted `UserAPIKey` DB model + instance-level `VOYAGE_AI_API_KEY` env fallback; key fetched per request |

**Voyage's approach is sound.** Per-user DB-stored keys with instance fallback matches the self-hosted deployment model. No change needed.

### 1c. Rate-Limit Fallback / Retry

| Project | Approach |
|---|---|
| **LiteLLM Router** | Built-in: `num_retries`, `fallbacks` (cross-model), `allowed_fails` + `cooldown_time`, `RetryPolicy` (per-exception-type retry counts), `AllowedFailsPolicy` |
| **daily_stock_analysis** | `litellm.Router` with a `fallback_models` list + multi-key support (rotate API keys on rate limit) |
| **Cognee** | `tenacity` retry decorator with `wait_exponential_jitter` + LiteLLM rate limiter |
| **Suna** | LiteLLM exception mapping → structured error processor |
| **Voyage (current)** | Zero retries, single attempt. `_safe_error_payload()` maps exceptions to user messages but does not retry. |

**This is Voyage's biggest gap.** Every other production system has retry logic. LiteLLM has this built in — Voyage just isn't using it.

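The retry-with-backoff pattern the table describes can be sketched with the stdlib alone (the production equivalents are LiteLLM's `num_retries` or a `tenacity` decorator; `fn` below is a stand-in for the LLM call):

```python
# Hedged sketch of retry with exponential backoff and jitter around a
# flaky call, the Cognee-style pattern noted above. Stdlib only.
import random
import time

def call_with_retries(fn, retries: int = 2, base_delay: float = 0.5):
    """Call fn(); on exception, retry up to `retries` times with jittered backoff."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                raise  # out of attempts: surface the original error
            # Exponential backoff (base, 2*base, ...) scaled by random jitter.
            time.sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.0))
```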
### 1d. Tool-Calling Compatibility

| Project | Approach |
|---|---|
| **OpenCode** | `capabilities.toolcall` boolean per model in the `models.dev` database; models without tool support are filtered from agentic use |
| **LiteLLM** | `litellm.supports_function_calling(model=...)` runtime check; `get_supported_openai_params(model=...)` for param filtering |
| **PraisonAI** | `litellm.supports_function_calling()` guard before tool dispatch |
| **open-interpreter** | Same `litellm.supports_function_calling()` guard |
| **Voyage (current)** | No tool-call capability check. `AGENT_TOOLS` always passed. Reasoning models excluded from the `opencode_zen` list by the critic gate (manual). |

**Actionable gap.** `litellm.supports_function_calling(model=...)` exists and should be used before passing the `tools` kwarg.

---

## 2. Architecture Options Comparison

| Option | Description | Effort | Risk | Benefit |
|---|---|---|---|---|
| **A. Keep LiteLLM, harden** | Add Router for retry/fallback, add a `supports_function_calling` guard, curate model lists with capability metadata | **Low** (1-2 sessions) | **Low** — incremental changes to existing working code | Retry resilience, tool-call safety, zero migration |
| **B. Hybrid: direct SDK for some** | Use `@ai-sdk/*` packages (like OpenCode) for primary providers, LiteLLM for the rest | **High** (1-2 weeks) | **High** — new TS→Python SDK mismatch, dual streaming paths, test-surface explosion | Finer control per provider; no real benefit for a Django backend |
| **C. Replace LiteLLM entirely** | Build a custom provider abstraction or adopt the Vercel AI SDK (TypeScript-only) | **Very High** (3-4 weeks) | **Very High** — rewrite streaming, tool-call translation, and error mapping for each provider | Only makes sense if moving to full-stack TypeScript |
| **D. LiteLLM Proxy (sidecar)** | Run LiteLLM as a separate proxy service, call it via its OpenAI-compatible API | **Medium** (2-3 days) | **Medium** — new Docker service, config management, latency overhead | Centralized config and a built-in admin UI, but overkill for single-user self-hosted |

---

## 3. Recommendation

### Immediate (this session / next session): Option A — Harden LiteLLM

**Specific code-level adaptations:**

#### 3a. Add `litellm.Router` for retry + fallback (highest impact)

Replace bare `litellm.acompletion()` with `litellm.Router.acompletion()`:

```python
# llm_client.py — new module-level router
import litellm
from litellm.router import RetryPolicy

_router = None

def _get_router():
    global _router
    if _router is None:
        _router = litellm.Router(
            model_list=[],  # empty — we use the router for retry/timeout only
            num_retries=2,
            timeout=60,
            retry_policy=RetryPolicy(
                AuthenticationErrorRetries=0,
                RateLimitErrorRetries=2,
                TimeoutErrorRetries=1,
                BadRequestErrorRetries=0,
            ),
        )
    return _router
```

**However**: the LiteLLM Router requires models to be pre-registered in `model_list`. For Voyage's dynamic per-user-key model, the simpler approach is:

```python
# In stream_chat_completion, add retry params to acompletion:
response = await litellm.acompletion(
    **completion_kwargs,
    num_retries=2,
    request_timeout=60,
)
```

LiteLLM's `acompletion()` accepts `num_retries` directly — no Router needed.

**Files**: `backend/server/chat/llm_client.py` line 418 (add `num_retries=2, request_timeout=60`)

#### 3b. Add a tool-call capability guard

```python
# In stream_chat_completion, before building completion_kwargs:
effective_model = model or provider_config["default_model"]
if tools and not litellm.supports_function_calling(model=effective_model):
    # Strip tools — the model doesn't support them
    tools = None
    logger.warning("Model %s does not support function calling; tools stripped", effective_model)
```

**Files**: `backend/server/chat/llm_client.py` around line 397

#### 3c. Curate model lists with tool-call metadata in the `models()` endpoint

Instead of returning bare string lists, return objects with capability info:

```python
# In ChatProviderCatalogViewSet.models():
if provider in ["opencode_zen"]:
    return Response({"models": [
        {"id": "openai/gpt-5-nano", "supports_tools": True},
        {"id": "openai/gpt-4o-mini", "supports_tools": True},
        {"id": "openai/gpt-4o", "supports_tools": True},
        {"id": "anthropic/claude-sonnet-4-20250514", "supports_tools": True},
        {"id": "anthropic/claude-3-5-haiku-20241022", "supports_tools": True},
    ]})
```

**Files**: `backend/server/chat/views/__init__.py` — `models()` action. The frontend `loadModelsForProvider()` would need a minor update to handle objects.

#### 3d. Fix the `day_suggestions.py` hardcoded model

Line 194 uses `model="gpt-4o-mini"`, which respects neither the provider config nor the user's selection:

```python
# day_suggestions.py lines 193-194
response = litellm.completion(
    model="gpt-4o-mini",  # BUG: ignores provider config
    ...
)
```

It should use the provider-config default or the user-selected model.

**Files**: `backend/server/chat/views/day_suggestions.py` line 194

### Long-term (future sessions)

1. **Adopt a `models.dev`-style curated database**: OpenCode's approach of maintaining a JSON/YAML model registry with capabilities, costs, and limits is superior to hardcoded lists. It could be a YAML file in `backend/server/chat/models.yaml` loaded at startup.

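Loading such a registry could look like the sketch below. The schema (per-model capability flags) and the `chat/models.yaml` location are proposals from this note, not an existing file; JSON is shown here only to keep the example stdlib-only.

```python
# Hedged sketch: a curated model registry loaded once at startup and indexed
# for capability lookups. Schema is an assumption modeled on models.dev.
import json

REGISTRY_JSON = """
{
  "opencode_zen": {
    "models": [
      {"id": "openai/gpt-5-nano", "toolcall": true, "reasoning": false},
      {"id": "openai/kimi-k2.5", "toolcall": true, "reasoning": false}
    ]
  }
}
"""

def load_registry(raw: str) -> dict:
    """Parse the registry and index models by id per provider."""
    data = json.loads(raw)
    return {
        provider: {m["id"]: m for m in cfg["models"]}
        for provider, cfg in data.items()
    }

def supports_tools(registry: dict, provider: str, model_id: str) -> bool:
    # Unknown providers/models default to False (safe for agentic use).
    return registry.get(provider, {}).get(model_id, {}).get("toolcall", False)
```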
2. **LiteLLM Proxy sidecar**: If Voyage gains a multi-user production deployment, running LiteLLM as a proxy sidecar gives centralized rate limiting, key management, and an admin dashboard. Not warranted for the current self-hosted single/few-user deployment.

3. **WSGI→ASGI migration**: Already documented as out of scope, but it remains the long-term fix for event-loop fragility (see [opencode-zen-connection-debug.md](opencode-zen-connection-debug.md#3-significant-wsgi--async-event-loop-per-request)).

---

## 4. Why NOT Replace LiteLLM

| Concern | Reality |
|---|---|
| "LiteLLM is too heavy" | It's a pip dependency (~40MB installed). No runtime sidecar. Same weight as Django itself. |
| "We could use provider SDKs directly" | Each provider has different streaming formats, tool-call schemas, and error types. LiteLLM normalizes all of this. Reimplementing costs weeks per provider. |
| "OpenCode doesn't use LiteLLM" | OpenCode is TypeScript + Vercel AI SDK, with ~20 bundled `@ai-sdk/*` provider packages. The Python equivalent IS LiteLLM. |
| "LiteLLM has bugs" | All of Voyage's issues are in our integration layer (no retries, no capability checks, hardcoded models), not in LiteLLM itself. |

---

## Cross-references

- See [Research: LiteLLM provider catalog](litellm-zen-provider-catalog.md)
- See [Research: OpenCode Zen connection debug](opencode-zen-connection-debug.md)
- See [Plan: Travel agent context + models](../plans/travel-agent-context-and-models.md)
- See [Decisions: Critic Gate](../decisions.md#critic-gate-travel-agent-context--models-follow-up)

21 .memory/sessions/continuity.md Normal file
@@ -0,0 +1,21 @@

# Session Continuity

## Last Session (2026-03-09)
- Completed the `chat-provider-fixes` change set with three workstreams:
  - `chat-loop-hardening`: invalid required-arg tool calls now terminate cleanly and are not replayed; assistant tool_call history trimmed consistently
  - `default-ai-settings`: Settings page saves default provider/model via `UserAISettings`; DB defaults are authoritative over localStorage; backend fallback uses the saved defaults
  - `suggestion-add-flow`: day suggestions use the resolved provider/model (not hardcoded OpenAI); the modal normalizes suggestion payloads for add-to-itinerary
- All three workstreams passed reviewer + tester validation
- Documentation updated for all three workstreams

## Active Work
- `chat-provider-fixes` plan complete — all workstreams implemented, reviewed, tested, documented
- See [plans/](../plans/) for other active feature plans
- Pre-release policy established — architecture-level changes allowed (see AGENTS.md)

## Known Follow-up Items (from tester findings)
- No automated test coverage for `UserAISettings` CRUD + precedence logic
- No automated test coverage for the `send_message` streaming loop (tool-error short-circuit, multi-tool partial success, `MAX_TOOL_ITERATIONS`)
- No automated test coverage for `DaySuggestionsView.post()`
- `get_weather` error `"dates must be a non-empty list"` does not trigger the tool-error short-circuit (mitigated by `MAX_TOOL_ITERATIONS`)
- LLM-generated name/location fields are not truncated to `max_length=200` before `LocationSerializer` (low risk)

1 .memory/system.md Normal file
@@ -0,0 +1 @@

Voyage is a self-hosted travel companion web app (fork of AdventureLog) built with SvelteKit 2 (TypeScript) frontend, Django REST Framework (Python) backend, PostgreSQL/PostGIS, Memcached, and Docker. It provides trip planning with collections/itineraries, AI-powered travel chat with multi-provider LLM support (via LiteLLM), location/lodging/transportation management, user preference learning, and collaborative trip sharing. The project is pre-release — architecture-level changes are allowed. See [knowledge/overview.md](knowledge/overview.md) for architecture and [decisions.md](decisions.md) for ADRs.