docs: update docs and AGENTS.md with chat tool loop fix patterns

2026-03-10 18:40:29 +00:00
parent 0ca73a417d
commit c4b8f291f2
53 changed files with 801 additions and 73 deletions
--- a/Architecture.md
+++ b/Architecture.md
@@ -0,0 +1,59 @@
+---
+title: Chat Tool Error Handling Architecture
+type: note
+permalink: voyage/knowledge/chat-tool-error-handling-architecture
+tags:
+- chat
+- tools
+- error-handling
+- architecture
+- pattern
+---
+
+# Chat Tool Error Handling Architecture
+
+## Overview
+
+The chat agent tool loop classifies failed tool calls into three categories, each with its own retry and surfacing behavior.
+
+## Error Classification
+
+### 1. Required-parameter validation errors
+- [pattern] Detected by `_is_required_param_tool_error()`, which regex-matches `"... is required"` patterns in the tool result's `error` field
+- [convention] Short-circuited immediately with a user-visible error — never replayed into LLM history
+- [pattern] `search_places` missing `location` has a special path: `_is_search_places_location_retry_candidate_error()` triggers deterministic context-retry (trip destination → first itinerary stop → user clarification) before surfacing
+
+### 2. Execution failures (new in chat-tool-loop-fix)
+- [pattern] Any `error`-bearing tool result dict that does NOT match the required-param pattern is classified as an execution failure by `_is_execution_failure_tool_error()`
+- [convention] Execution failures are NEVER replayed into LLM context — they are excluded from `successful_tool_calls`, `successful_tool_messages`, and `successful_tool_chat_entries`
+- [pattern] `tool_iterations` increments only after at least one successful tool call in a round
+- [pattern] A round in which every tool call fails increments `all_failure_rounds`, which is capped at `MAX_ALL_FAILURE_ROUNDS` (3)
+- [pattern] Permanent failures (`retryable: false` in tool result, e.g. `web_search` ImportError) set `all_failure_rounds = MAX_ALL_FAILURE_ROUNDS` for immediate stop
+- [convention] Execution failures emit a `tool_execution_error` SSE event with sanitized text via `_build_tool_execution_error_event()`
+
+### 3. Geocoding failures in search_places
+- [pattern] `Could not geocode location: ...` errors are detected by `_is_search_places_location_retry_candidate_error()` (same path as missing-location)
+- [convention] Eligible for the existing context-retry fallback before being treated as a terminal failure
+
+## Error Sanitization
+
+- [convention] `_safe_error_payload()` maps LiteLLM exceptions to sanitized user-safe categories — never forwards raw `exc.message`
+- [convention] `execute_tool()` catch-all returns `{"error": "Tool execution failed"}` (hardcoded) — never raw `str(exc)`
+- [decision] `_build_tool_execution_error_event()` wraps sanitized tool error text in a user-safe sentence for SSE emission and DB persistence
+
+## Frontend Tool-Result Deduplication
+
+- [pattern] Three-layer dedup by `tool_call_id`:
+  1. `rebuildConversationMessages()` sets `tool_results: undefined` on all assistant messages, then re-derives exclusively from persisted `role=tool` sibling rows — discards any server-side pre-populated `tool_results`
+  2. `appendToolResultDedup()` deduplicates during both the rebuild walk and live SSE ingestion
+  3. `uniqueToolResultsByCallId()` at render time provides a final safety net
+
+## Key Files
+
+- Backend classification/loop: `backend/server/chat/views/__init__.py`
+- Tool execution + sanitization: `backend/server/chat/agent_tools.py`
+- Frontend dedup: `frontend/src/lib/components/AITravelChat.svelte`
+- Tests: `backend/server/chat/tests.py` (32 total chat tests)
+
+## Relations
+- related_to [[assistant-add-flow-fixes]]
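
The classification rules in the note's "Error Classification" section can be illustrated with a minimal Python sketch. This is not the code in `backend/server/chat/views/__init__.py`: the regex, the geocoding prefix check, and the helpers `classify_tool_result()` and `next_all_failure_rounds()` are assumptions made for illustration. Only `MAX_ALL_FAILURE_ROUNDS = 3`, the `retryable: false` convention, and the quoted error strings come from the note.

```python
import re
from typing import Any

MAX_ALL_FAILURE_ROUNDS = 3  # cap stated in the note

# Assumed patterns; the real regexes are not shown in the note.
_REQUIRED_PARAM_RE = re.compile(r"\bis required\b", re.IGNORECASE)
_GEOCODE_PREFIX = "Could not geocode location"


def _error_text(result: Any) -> str:
    """Return the `error` value of a tool result dict as text, or '' if absent."""
    if isinstance(result, dict):
        error = result.get("error")
        return str(error) if error else ""
    return ""


def classify_tool_result(tool_name: str, result: Any) -> str:
    """Hypothetical helper: label one tool result with a category from the note."""
    error = _error_text(result)
    if not error:
        return "success"
    missing_param = bool(_REQUIRED_PARAM_RE.search(error))
    # search_places gets the context-retry path for a missing location
    # and for geocoding failures.
    if tool_name == "search_places" and (
        error.startswith(_GEOCODE_PREFIX)
        or (missing_param and "location" in error.lower())
    ):
        return "location_retry_candidate"
    if missing_param:
        return "required_param"
    return "execution_failure"


def next_all_failure_rounds(current: int, results: list[tuple[str, Any]]) -> int:
    """Hypothetical helper: update the all-failure counter after one round."""
    labels = [classify_tool_result(name, res) for name, res in results]
    for (_, res), label in zip(results, labels):
        # A permanent failure jumps straight to the cap so the loop stops.
        if label == "execution_failure" and res.get("retryable") is False:
            return MAX_ALL_FAILURE_ROUNDS
    if labels and all(label == "execution_failure" for label in labels):
        return min(current + 1, MAX_ALL_FAILURE_ROUNDS)
    # Assumption: a mixed or successful round leaves the counter unchanged.
    return current
```

In this sketch a caller would stop the loop once the counter reaches `MAX_ALL_FAILURE_ROUNDS`, and would keep only results labelled `"success"` when building the history replayed to the LLM.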