# OpenCode Zen Connection Debug: Research Findings

**Date**: 2026-03-08

**Researchers**: researcher agent (root cause), explorer agent (code path trace)

**Status**: Complete; root causes identified, fix proposed

## Summary

The OpenCode Zen provider configuration in `backend/server/chat/llm_client.py` has **two critical mismatches** that cause connection/API errors:

1. **Invalid model ID**: `gpt-4o-mini` does not exist on OpenCode Zen
2. **Wrong endpoint for GPT models**: GPT models on Zen use the `/responses` endpoint, not `/chat/completions`

An additional structural risk is that the backend runs under **Gunicorn WSGI** (not ASGI/uvicorn), but `stream_chat_completion` is an `async def` generator driven via `_async_to_sync_generator`, which creates a new event loop per call. This works, but every tool iteration opens and closes an event loop, which is inefficient and fragile under load.

## End-to-End Request Path

### 1. Frontend: `AITravelChat.svelte` → `sendMessage()`

- **File**: `frontend/src/lib/components/AITravelChat.svelte`, line 97
- POST body: `{ message: , provider: selectedProvider }` (e.g. `"opencode_zen"`)
- Sends to: `POST /api/chat/conversations//send_message/`
- On `fetch` network failure: shows `$t('chat.connection_error')` = `"Connection error. Please try again."` (line 191)
- On HTTP error: tries `res.json()` → uses `err.error || $t('chat.connection_error')` (line 126)
- On SSE `parsed.error`: shows `parsed.error` inline in the chat (line 158)
- **Any exception from `litellm` is therefore masked as `"An error occurred while processing your request."` or `"Connection error. Please try again."`**

### 2. Proxy: `frontend/src/routes/api/[...path]/+server.ts` → `handleRequest()`

- Strips and regenerates the CSRF token (lines 57-60)
- POSTs to `http://server:8000/api/chat/conversations//send_message/`
- Detects `content-type: text/event-stream` and streams the body directly through (lines 94-98), with **no buffering**
- On any fetch error: returns `{ error: 'Internal Server Error' }` (line 109)

### 3. Backend: `chat/views.py` → `ChatViewSet.send_message()`

- Validates provider via `is_chat_provider_available()` (line 114); passes for `opencode_zen`
- Saves user message to DB (line 120)
- Builds LLM messages list (line 131)
- Wraps `async event_stream()` in `_async_to_sync_generator()` (line 269)
- Returns `StreamingHttpResponse` with `text/event-stream` content type (line 268)

### 4. Backend: `chat/llm_client.py` → `stream_chat_completion()`

- Normalizes provider (line 208)
- Looks up `CHAT_PROVIDER_CONFIG["opencode_zen"]` (line 209)
- Fetches API key from `UserAPIKey.objects.get(user=user, provider="opencode_zen")` (line 154)
- Decrypts it via Fernet using `FIELD_ENCRYPTION_KEY` (line 102)
- Calls `litellm.acompletion(model="openai/gpt-4o-mini", api_key=, api_base="https://opencode.ai/zen/v1", stream=True, tools=AGENT_TOOLS, tool_choice="auto")` (line 237)
- On **any exception**: logs and yields `data: {"error": "An error occurred..."}` (lines 274-276)

## Root Cause Analysis

### #1 CRITICAL: Invalid default model `gpt-4o-mini`

- **Location**: `backend/server/chat/llm_client.py:62`
- `CHAT_PROVIDER_CONFIG["opencode_zen"]["default_model"] = "openai/gpt-4o-mini"`
- `gpt-4o-mini` is an OpenAI-hosted model. The OpenCode Zen gateway at `https://opencode.ai/zen/v1` does not offer `gpt-4o-mini`.
- LiteLLM sends: `POST https://opencode.ai/zen/v1/chat/completions` with `model: gpt-4o-mini`
- Zen API returns HTTP 4xx (model not found or not available)
- Exception is caught generically at line 274 → yields masked error SSE → frontend shows generic message

### #2 SIGNIFICANT: Generic exception handler masks real errors

- **Location**: `backend/server/chat/llm_client.py:274-276`
- Bare `except Exception:` with `logger.exception` and a generic user message
- LiteLLM exceptions carry structured information: `litellm.exceptions.NotFoundError`, `AuthenticationError`, `BadRequestError`, etc.
- All of these show up to the user as `"An error occurred while processing your request. Please try again."`
- Prevents diagnosis without checking Docker logs

### #3 SIGNIFICANT: WSGI + async event loop per request

- **Location**: `backend/server/chat/views.py:66-76` (`_async_to_sync_generator`)
- Backend runs **Gunicorn WSGI** (from `supervisord.conf:11`); there is **no ASGI entry point** (`asgi.py` doesn't exist)
- `stream_chat_completion` is `async def` using `litellm.acompletion` (awaited)
- `_async_to_sync_generator` creates a fresh event loop via `asyncio.new_event_loop()` for each request
- For multi-tool-iteration responses this loop drives multiple sequential `await` calls
- This works but is fragile: if `litellm.acompletion` internally uses a singleton HTTP client that belongs to a different event loop, it can raise a `RuntimeError` (for example `Event loop is closed`) or connection errors on subsequent calls
- **httpx/aiohttp sessions in LiteLLM may not be compatible with per-call new event loops**

### #4 MINOR: `tool_choice` key is always sent, as `None` when no tools are given

- **Location**: `backend/server/chat/llm_client.py:229`
- `"tool_choice": "auto" if tools else None`: `None` values in kwargs are passed to litellm
- Some OpenAI-compat endpoints (including potentially Zen models) reject `tool_choice: null` or unsupported parameters
- Fix: remove the key entirely instead of setting it to `None`

### #5 MINOR: API key lookup is synchronous in async context

- **Location**: `backend/server/chat/llm_client.py:217` and `views.py:144`
- `get_llm_api_key` calls `UserAPIKey.objects.get(...)` synchronously
- Called from within `async for chunk in stream_chat_completion(...)` in the async `event_stream()` generator
- Django ORM operations must use `sync_to_async` in async contexts; direct sync ORM calls can cause `SynchronousOnlyOperation` errors or deadlocks under ASGI
- Under the WSGI + new-event-loop approach this is less likely to fail, but it is technically incorrect

## Recommended Fix (Ranked by Impact)

### Fix #1 (Primary): Correct the default model

```python
# backend/server/chat/llm_client.py:59-64
"opencode_zen": {
    "label": "OpenCode Zen",
    "needs_api_key": True,
    "default_model": "openai/gpt-5-nano",  # Free; confirmed to work via /chat/completions
    "api_base": "https://opencode.ai/zen/v1",
},
```

Confirmed working models (use `/chat/completions`, OpenAI-compat):

- `openai/gpt-5-nano` (free)
- `openai/kimi-k2.5` (confirmed by GitHub usage)
- `openai/glm-5` (GLM family)
- `openai/big-pickle` (free)

GPT family models route through the `/responses` endpoint on Zen, which LiteLLM's openai-compat mode does NOT use. Only the above "OpenAI-compatible" models on Zen reliably work with LiteLLM's `openai/` prefix + `/chat/completions`.

### Fix #2 (Secondary): Structured error surfacing

```python
# backend/server/chat/llm_client.py:274-276
except Exception as exc:
    logger.exception("LLM streaming error")
    # Extract structured detail if available
    status_code = getattr(exc, "status_code", None)
    detail = getattr(exc, "message", None) or str(exc)
    if status_code:
        user_msg = f"Provider error ({status_code}): {detail}"
    else:
        user_msg = "An error occurred while processing your request. Please try again."
    yield f"data: {json.dumps({'error': user_msg})}\n\n"
```

### Fix #3 (Minor): Remove None from tool_choice kwarg

```python
# backend/server/chat/llm_client.py:225-234
completion_kwargs = {
    "model": provider_config["default_model"],
    "messages": messages,
    "stream": True,
    "api_key": api_key,
}
if tools:
    # Only send tool parameters when tools exist; never pass tool_choice=None
    completion_kwargs["tools"] = tools
    completion_kwargs["tool_choice"] = "auto"
if provider_config["api_base"]:
    completion_kwargs["api_base"] = provider_config["api_base"]
```

## Error Flow Diagram

```
User sends message (opencode_zen)
  → AITravelChat.svelte:sendMessage()
    → POST /api/chat/conversations//send_message/
      → +server.ts:handleRequest() [proxy, no mutation]
        → POST http://server:8000/api/chat/conversations//send_message/
          → views.py:ChatViewSet.send_message()
            → llm_client.py:stream_chat_completion()
              → litellm.acompletion(model="openai/gpt-4o-mini",   ← FAILS HERE
                                    api_base="https://opencode.ai/zen/v1")
              → except Exception → yield data:{"error":"An error occurred..."}
            ← SSE: data:{"error":"An error occurred..."}
          ← StreamingHttpResponse(text/event-stream)
        ← streamed through
      ← streamed through
    ← reader.read() → parsed.error set
  ← assistantMsg.content = "An error occurred..."
← shown to user
```

If the network/DNS fails entirely (e.g. `https://opencode.ai` unreachable):

```
→ litellm.acompletion raises immediately
→ except Exception → yield data:{"error":"An error occurred..."}

OR

→ +server.ts fetch fails → json({error:"Internal Server Error"}, 500)
→ AITravelChat.svelte res.ok is false → res.json() → err.error || $t('chat.connection_error')
→ shows err.error ("Internal Server Error"); "Connection error. Please try again." only if err.error is empty
```

## File References

| File | Line(s) | Relevance |
|---|---|---|
| `backend/server/chat/llm_client.py` | 59-64 | `CHAT_PROVIDER_CONFIG["opencode_zen"]`: primary fix |
| `backend/server/chat/llm_client.py` | 150-157 | `get_llm_api_key()`: DB lookup for stored key |
| `backend/server/chat/llm_client.py` | 203-276 | `stream_chat_completion()`: full LiteLLM call + error handler |
| `backend/server/chat/llm_client.py` | 225-234 | `completion_kwargs` construction |
| `backend/server/chat/llm_client.py` | 274-276 | Generic `except Exception` (swallows all errors) |
| `backend/server/chat/views.py` | 103-274 | `send_message()`: SSE pipeline orchestration |
| `backend/server/chat/views.py` | 66-76 | `_async_to_sync_generator()`: WSGI/async bridge |
| `backend/server/integrations/models.py` | 78-112 | `UserAPIKey`: encrypted key storage |
| `frontend/src/lib/components/AITravelChat.svelte` | 97-195 | `sendMessage()`: SSE consumer + error display |
| `frontend/src/lib/components/AITravelChat.svelte` | 124-129 | HTTP error → `$t('chat.connection_error')` |
| `frontend/src/lib/components/AITravelChat.svelte` | 157-160 | SSE `parsed.error` → inline display |
| `frontend/src/lib/components/AITravelChat.svelte` | 190-192 | Outer catch → `$t('chat.connection_error')` |
| `frontend/src/routes/api/[...path]/+server.ts` | 34-110 | `handleRequest()`: proxy |
| `frontend/src/routes/api/[...path]/+server.ts` | 94-98 | SSE passthrough (no mutation) |
| `frontend/src/locales/en.json` | 46 | `chat.connection_error` = "Connection error. Please try again." |
| `backend/supervisord.conf` | 11 | Gunicorn WSGI startup (no ASGI) |

---

## Model Selection Implementation Map

**Date**: 2026-03-08

### Frontend Provider/Model Selection State (Current)

In `AITravelChat.svelte`:

- `selectedProvider` (line 29): `let selectedProvider = 'openai'`, a bare string with no model tracking
- `providerCatalog` (line 30): `ChatProviderCatalogEntry[]`, which already contains `default_model: string | null` per entry
- `chatProviders` (line 31): reactive filtered view of `providerCatalog` (available only)
- `loadProviderCatalog()` (line 37): populates catalog from `GET /api/chat/providers/`
- `sendMessage()` (line 97): POST body at line 121 is `{ message: msgText, provider: selectedProvider }`, with **no model field**
- Provider `` in toolbar | Placed after provider `