16 KiB
title, type, permalink
| title | type | permalink |
|---|---|---|
| opencode-zen-connection-debug | note | voyage/research/opencode-zen-connection-debug |
OpenCode Zen Connection Debug — Research Findings
Date: 2026-03-08 Researchers: researcher agent (root cause), explorer agent (code path trace) Status: Complete — root causes identified, fix proposed
Summary
The OpenCode Zen provider configuration in backend/server/chat/llm_client.py has two critical mismatches that cause connection/API errors:
- Invalid model ID:
gpt-4o-minidoes not exist on OpenCode Zen - Wrong endpoint for GPT models: GPT models on Zen use
/responsesendpoint, not/chat/completions
An additional structural risk is that the backend runs under Gunicorn WSGI (not ASGI/uvicorn), but stream_chat_completion is an async def generator that is driven via _async_to_sync_generator which creates a new event loop per call. This works but causes every tool iteration to open/close an event loop, which is inefficient and fragile under load.
End-to-End Request Path
1. Frontend: AITravelChat.svelte → sendMessage()
- File:
frontend/src/lib/components/AITravelChat.svelte, line 97 - POST body:
{ message: <text>, provider: selectedProvider }(e.g."opencode_zen") - Sends to:
POST /api/chat/conversations/<id>/send_message/ - On
fetchnetwork failure: shows$t('chat.connection_error')="Connection error. Please try again."(line 191) - On HTTP error: tries
res.json()→ useserr.error || $t('chat.connection_error')(line 126) - On SSE
parsed.error: showsparsed.errorinline in the chat (line 158) - Any exception from
litellmis therefore masked as"An error occurred while processing your request."or"Connection error. Please try again."
2. Proxy: frontend/src/routes/api/[...path]/+server.ts → handleRequest()
- Strips and re-generates CSRF token (line 57-60)
- POSTs to
http://server:8000/api/chat/conversations/<id>/send_message/ - Detects
content-type: text/event-streamand streams body directly through (lines 94-98) — no buffering - On any fetch error: returns
{ error: 'Internal Server Error' }(line 109)
3. Backend: chat/views.py → ChatViewSet.send_message()
- Validates provider via
is_chat_provider_available()(line 114) — passes foropencode_zen - Saves user message to DB (line 120)
- Builds LLM messages list (line 131)
- Wraps
async event_stream()in_async_to_sync_generator()(line 269) - Returns
StreamingHttpResponsewithtext/event-streamcontent type (line 268)
4. Backend: chat/llm_client.py → stream_chat_completion()
- Normalizes provider (line 208)
- Looks up
CHAT_PROVIDER_CONFIG["opencode_zen"](line 209) - Fetches API key from
UserAPIKey.objects.get(user=user, provider="opencode_zen")(line 154) - Decrypts it via Fernet using
FIELD_ENCRYPTION_KEY(line 102) - Calls
litellm.acompletion(model="openai/gpt-4o-mini", api_key=<key>, api_base="https://opencode.ai/zen/v1", stream=True, tools=AGENT_TOOLS, tool_choice="auto")(line 237) - On any exception: logs and yields
data: {"error": "An error occurred..."}(lines 274-276)
Root Cause Analysis
#1 CRITICAL: Invalid default model gpt-4o-mini
- Location:
backend/server/chat/llm_client.py:62 CHAT_PROVIDER_CONFIG["opencode_zen"]["default_model"] = "openai/gpt-4o-mini"gpt-4o-miniis an OpenAI-hosted model. The OpenCode Zen gateway athttps://opencode.ai/zen/v1does not offergpt-4o-mini.- LiteLLM sends:
POST https://opencode.ai/zen/v1/chat/completionswithmodel: gpt-4o-mini - Zen API returns HTTP 4xx (model not found or not available)
- Exception is caught generically at line 274 → yields masked error SSE → frontend shows generic message
#2 SIGNIFICANT: Generic exception handler masks real errors
- Location:
backend/server/chat/llm_client.py:274-276 - Bare
except Exception:with logger.exception and a generic user message - LiteLLM exceptions carry structured information:
litellm.exceptions.NotFoundError,AuthenticationError,BadRequestError, etc. - All of these show up to the user as
"An error occurred while processing your request. Please try again." - Prevents diagnosis without checking Docker logs
#3 SIGNIFICANT: WSGI + async event loop per request
- Location:
backend/server/chat/views.py:66-76(_async_to_sync_generator) - Backend runs Gunicorn WSGI (from
supervisord.conf:11); there is no ASGI entry point (asgi.pydoesn't exist) stream_chat_completionisasync defusinglitellm.acompletion(awaited)_async_to_sync_generatorcreates a fresh event loop viaasyncio.new_event_loop()for each request- For multi-tool-iteration responses this loop drives multiple sequential
awaitcalls - This works but is fragile: if
litellm.acompletioninternally uses a singleton HTTP client that belongs to a different event loop, it will raiseRuntimeError: This event loop is already runningor connection errors on subsequent calls - httpx/aiohttp sessions in LiteLLM may not be compatible with per-call new event loops
#4 MINOR: tool_choice: "auto" sent unconditionally with tools
- Location:
backend/server/chat/llm_client.py:229 "tool_choice": "auto" if tools else None— None values in kwargs are passed to litellm- Some OpenAI-compat endpoints (including potentially Zen models) reject
tool_choice: nullor unsupported parameters - Fix: remove key entirely instead of setting to None
#5 MINOR: API key lookup is synchronous in async context
- Location:
backend/server/chat/llm_client.py:217andviews.py:144 get_llm_api_keycallsUserAPIKey.objects.get(...)synchronously- Called from within
async for chunk in stream_chat_completion(...)in the asyncevent_stream()generator - Django ORM operations must use
sync_to_asyncin async contexts; direct sync ORM calls can causeSynchronousOnlyOperationerrors or deadlocks under ASGI - Under WSGI+new-event-loop approach this is less likely to fail but is technically incorrect
Recommended Fix (Ranked by Impact)
Fix #1 (Primary): Correct the default model
# backend/server/chat/llm_client.py:59-64
"opencode_zen": {
"label": "OpenCode Zen",
"needs_api_key": True,
"default_model": "openai/gpt-5-nano", # Free; confirmed to work via /chat/completions
"api_base": "https://opencode.ai/zen/v1",
},
Confirmed working models (use /chat/completions, OpenAI-compat):
openai/gpt-5-nano(free)openai/kimi-k2.5(confirmed by GitHub usage)openai/glm-5(GLM family)openai/big-pickle(free)
GPT family models route through /responses endpoint on Zen, which LiteLLM's openai-compat mode does NOT use — only the above "OpenAI-compatible" models on Zen reliably work with LiteLLM's openai/ prefix + /chat/completions.
Fix #2 (Secondary): Structured error surfacing
# backend/server/chat/llm_client.py:274-276
except Exception as exc:
logger.exception("LLM streaming error")
# Extract structured detail if available
status_code = getattr(exc, 'status_code', None)
detail = getattr(exc, 'message', None) or str(exc)
user_msg = f"Provider error ({status_code}): {detail}" if status_code else "An error occurred while processing your request. Please try again."
yield f"data: {json.dumps({'error': user_msg})}\n\n"
Fix #3 (Minor): Remove None from tool_choice kwarg
# backend/server/chat/llm_client.py:225-234
completion_kwargs = {
"model": provider_config["default_model"],
"messages": messages,
"stream": True,
"api_key": api_key,
}
if tools:
completion_kwargs["tools"] = tools
completion_kwargs["tool_choice"] = "auto"
if provider_config["api_base"]:
completion_kwargs["api_base"] = provider_config["api_base"]
Error Flow Diagram
User sends message (opencode_zen)
→ AITravelChat.svelte:sendMessage()
→ POST /api/chat/conversations/<id>/send_message/
→ +server.ts:handleRequest() [proxy, no mutation]
→ POST http://server:8000/api/chat/conversations/<id>/send_message/
→ views.py:ChatViewSet.send_message()
→ llm_client.py:stream_chat_completion()
→ litellm.acompletion(model="openai/gpt-4o-mini", ← FAILS HERE
api_base="https://opencode.ai/zen/v1")
→ except Exception → yield data:{"error":"An error occurred..."}
← SSE: data:{"error":"An error occurred..."}
← StreamingHttpResponse(text/event-stream)
← streamed through
← streamed through
← reader.read() → parsed.error set
← assistantMsg.content = "An error occurred..." ← shown to user
If the network/DNS fails entirely (e.g. https://opencode.ai unreachable):
→ litellm.acompletion raises immediately
→ except Exception → yield data:{"error":"An error occurred..."}
— OR —
→ +server.ts fetch fails → json({error:"Internal Server Error"}, 500)
→ AITravelChat.svelte res.ok is false → res.json() → err.error || $t('chat.connection_error')
→ shows "Connection error. Please try again."
File References
| File | Line(s) | Relevance |
|---|---|---|
backend/server/chat/llm_client.py |
59-64 | CHAT_PROVIDER_CONFIG["opencode_zen"] — primary fix |
backend/server/chat/llm_client.py |
150-157 | get_llm_api_key() — DB lookup for stored key |
backend/server/chat/llm_client.py |
203-276 | stream_chat_completion() — full LiteLLM call + error handler |
backend/server/chat/llm_client.py |
225-234 | completion_kwargs construction |
backend/server/chat/llm_client.py |
274-276 | Generic except Exception (swallows all errors) |
backend/server/chat/views.py |
103-274 | send_message() — SSE pipeline orchestration |
backend/server/chat/views.py |
66-76 | _async_to_sync_generator() — WSGI/async bridge |
backend/server/integrations/models.py |
78-112 | UserAPIKey — encrypted key storage |
frontend/src/lib/components/AITravelChat.svelte |
97-195 | sendMessage() — SSE consumer + error display |
frontend/src/lib/components/AITravelChat.svelte |
124-129 | HTTP error → $t('chat.connection_error') |
frontend/src/lib/components/AITravelChat.svelte |
157-160 | SSE parsed.error → inline display |
frontend/src/lib/components/AITravelChat.svelte |
190-192 | Outer catch → $t('chat.connection_error') |
frontend/src/routes/api/[...path]/+server.ts |
34-110 | handleRequest() — proxy |
frontend/src/routes/api/[...path]/+server.ts |
94-98 | SSE passthrough (no mutation) |
frontend/src/locales/en.json |
46 | chat.connection_error = "Connection error. Please try again." |
backend/supervisord.conf |
11 | Gunicorn WSGI startup (no ASGI) |
Model Selection Implementation Map
Date: 2026-03-08
Frontend Provider/Model Selection State (Current)
In AITravelChat.svelte:
selectedProvider(line 29):let selectedProvider = 'openai'— bare string, no model trackingproviderCatalog(line 30):ChatProviderCatalogEntry[]— already containsdefault_model: string | nullper entrychatProviders(line 31): reactive filtered view ofproviderCatalog(available only)loadProviderCatalog()(line 37): populates catalog fromGET /api/chat/providers/sendMessage()(line 97): POST body at line 121 is{ message: msgText, provider: selectedProvider }— no model field- Provider
<select>(lines 290–298): in the top toolbar of the chat panel
Request Payload Build Point
AITravelChat.svelte, line 118–122:
const res = await fetch(`/api/chat/conversations/${conversation.id}/send_message/`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ message: msgText, provider: selectedProvider }) // ← ADD model here
});
Backend Request Intake Point
chat/views.py, send_message() (line 104):
- Line 113:
provider = (request.data.get("provider") or "openai").strip().lower() - Line 144:
stream_chat_completion(request.user, current_messages, provider, tools=AGENT_TOOLS) - No model extraction; model comes only from
CHAT_PROVIDER_CONFIG[provider]["default_model"]
Backend Model Usage Point
chat/llm_client.py, stream_chat_completion() (line 203):
- Line 225–226:
completion_kwargs = { "model": provider_config["default_model"], ... } - This is the sole place model is resolved — no override capability exists yet
Persistence Options Analysis
| Option | Files changed | Migration? | Risk |
|---|---|---|---|
localStorage (recommended) |
AITravelChat.svelte only for persistence |
No | Lowest: no backend, no schema |
CustomUser field (chat_model_prefs JSONField) |
users/models.py, users/serializers.py, users/views.py, migration |
Yes | Medium: schema change, serializer exposure |
UserAPIKey-style new model prefs table |
new chat/models.py + serializer + view + urls + migration |
Yes | High: new endpoint, multi-file |
UserRecommendationPreferenceProfile JSONField addition |
integrations/models.py, serializer, migration |
Yes | Medium: migration on integrations app |
Selected: localStorage — key voyage_chat_model_prefs, value Record<provider_id, model_string>.
File-by-File Edit Plan
1. backend/server/chat/llm_client.py
| Symbol | Change |
|---|---|
stream_chat_completion(user, messages, provider, tools=None) |
Add model: str | None = None parameter |
completion_kwargs["model"] (line 226) |
Change to model or provider_config["default_model"] |
| (new) validation | If model provided: assert it starts with expected LiteLLM prefix or raise SSE error |
2. backend/server/chat/views.py
| Symbol | Change |
|---|---|
send_message() (line 104) |
Extract model = (request.data.get("model") or "").strip() or None |
stream_chat_completion(...) call (line 144) |
Pass model=model |
| (optional validation) | Return 400 if model prefix doesn't match provider |
3. frontend/src/lib/components/AITravelChat.svelte
| Symbol | Change |
|---|---|
(new) let selectedModel: string |
Initialize from loadModelPref(selectedProvider) or default_model |
(new) $: selectedProviderEntry |
Reactive lookup of current provider's catalog entry |
(new) $: selectedModel reset |
Reset on provider change; persist with saveModelPref |
sendMessage() body (line 121) |
Add `model: selectedModel |
(new) model <input> in toolbar |
Placed after provider <select>, bind:value={selectedModel}, placeholder = default_model |
(new) loadModelPref(provider) |
Read from localStorage.getItem('voyage_chat_model_prefs') |
(new) saveModelPref(provider, model) |
Write to localStorage.setItem('voyage_chat_model_prefs', ...) |
4. frontend/src/locales/en.json
| Key | Value |
|---|---|
chat.model_label |
"Model" |
chat.model_placeholder |
"Default model" |
Provider-Model Compatibility Validation
The critical constraint is LiteLLM model-string routing. LiteLLM uses the provider/model-name prefix to determine which SDK client to use:
openai/gpt-5-nano→ OpenAI client (with customapi_basefor Zen)anthropic/claude-sonnet-4-20250514→ Anthropic clientgroq/llama-3.3-70b-versatile→ Groq client
If user types anthropic/claude-opus for openai provider, LiteLLM uses Anthropic SDK with OpenAI credentials → guaranteed failure.
Recommended backend guard in send_message():
if model:
expected_prefix = provider_config["default_model"].split("/")[0]
if not model.startswith(expected_prefix + "/"):
return Response(
{"error": f"Model must use '{expected_prefix}/' prefix for provider '{provider}'."},
status=status.HTTP_400_BAD_REQUEST,
)
Exception: opencode_zen and openrouter accept any prefix (they're routing gateways). Guard should skip prefix check when api_base is set (custom gateway).
Migration Requirement
NO migration required for the recommended localStorage approach.