docs: update docs and AGENTS.md with chat tool loop fix patterns
This commit is contained in:
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: CONTRIBUTING
|
||||
type: note
|
||||
permalink: voyage/contributing
|
||||
---
|
||||
|
||||
# Contributing to Voyage
|
||||
|
||||
We’re excited to have you contribute to Voyage! To ensure that this community remains welcoming and productive for all users and developers, please follow this simple Code of Conduct.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: README
|
||||
type: note
|
||||
permalink: voyage/readme
|
||||
---
|
||||
|
||||
<div align="center">
|
||||
|
||||
<img src="brand/voyage.png" alt="logo" width="200" height="auto" />
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: README
|
||||
type: note
|
||||
permalink: voyage/cdn/readme
|
||||
---
|
||||
|
||||
This folder contains the scripts to generate Voyage CDN files.
|
||||
|
||||
Special thanks to [@larsl-net](https://github.com/larsl-net) for the GeoJSON generation script.
|
||||
19
decisions/Retrospective - Chat Tool Failure Handling.md
Normal file
19
decisions/Retrospective - Chat Tool Failure Handling.md
Normal file
@@ -0,0 +1,19 @@
|
||||
---
|
||||
title: Retrospective - Chat Tool Failure Handling
|
||||
type: note
|
||||
permalink: voyage/decisions/retrospective-chat-tool-failure-handling
|
||||
tags:
|
||||
- pattern
|
||||
- lesson
|
||||
- chat
|
||||
- retrospective
|
||||
---
|
||||
|
||||
# Retrospective - Chat Tool Failure Handling
|
||||
|
||||
## Retrospective: chat tool failure handling
|
||||
- [pattern] Separating successful-tool iterations from bounded all-failure rounds prevents external tool outages from exhausting the main tool-call budget. #pattern
|
||||
- [pattern] Classify chat tool failures into required-parameter errors, retryable execution failures, and permanent execution failures so each path can use the right UX and retry behavior. #pattern
|
||||
- [risk] Mixed batches with both successful and failed tool calls still rely on the global success iteration cap as a backstop; acceptable pre-release, but worth revisiting if multi-tool itineraries remain noisy. #risk
|
||||
|
||||
confidence=high; last_validated=2026-03-10; volatility=medium; review_after_days=60; validation_count=1; contradiction_count=0
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: architecture
|
||||
type: note
|
||||
permalink: voyage/docs/architecture
|
||||
---
|
||||
|
||||
# Voyage Architecture
|
||||
|
||||
> **Status**: Stub — to be expanded as architecture evolves.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: development_timeline
|
||||
type: note
|
||||
permalink: voyage/docs/docs/changelogs/development-timeline
|
||||
---
|
||||
|
||||
# Voyage: Development Timeline & Origin Story
|
||||
|
||||
By: Sean Morley, Founder & Lead Developer
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: v0-10-0
|
||||
type: note
|
||||
permalink: voyage/docs/docs/changelogs/v0-10-0
|
||||
---
|
||||
|
||||
# Voyage v0.10.0 - Trip Maps, Google Maps Integration & Quick Deploy Script
|
||||
|
||||
Released 06-10-2025
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: v0-11-0
|
||||
type: note
|
||||
permalink: voyage/docs/docs/changelogs/v0-11-0
|
||||
---
|
||||
|
||||
# Voyage v0.11.0 - Strava + Wanderer Integration, New UI, and More
|
||||
|
||||
Released 09-02-2025
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: v0-12-0
|
||||
type: note
|
||||
permalink: voyage/docs/docs/changelogs/v0-12-0
|
||||
---
|
||||
|
||||
# Voyage v0.12.0 - Trip Planning Improvements, Itineraries & Budgets
|
||||
|
||||
Released 02-03-2026
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: v0-7-0
|
||||
type: note
|
||||
permalink: voyage/docs/docs/changelogs/v0-7-0
|
||||
---
|
||||
|
||||
# Voyage v0.7.0
|
||||
|
||||
Released 10-15-2024
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: v0-7-1
|
||||
type: note
|
||||
permalink: voyage/docs/docs/changelogs/v0-7-1
|
||||
---
|
||||
|
||||
# Voyage v0.7.1
|
||||
|
||||
Released 11-13-2024
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: v0-8-0
|
||||
type: note
|
||||
permalink: voyage/docs/docs/changelogs/v0-8-0
|
||||
---
|
||||
|
||||
# Voyage v0.8.0 - Immich Integration, Calendar and Customization
|
||||
|
||||
Released 01-08-2025
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: v0-9-0
|
||||
type: note
|
||||
permalink: voyage/docs/docs/changelogs/v0-9-0
|
||||
---
|
||||
|
||||
# Voyage v0.9.0 - Smart Recommendations, Attachments, and Maps
|
||||
|
||||
Released 03-19-2025
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: advanced_configuration
|
||||
type: note
|
||||
permalink: voyage/docs/docs/configuration/advanced-configuration
|
||||
---
|
||||
|
||||
# Advanced Configuration
|
||||
|
||||
In addition to the primary configuration variables listed above, there are several optional environment variables that can be set to further customize your Voyage instance. These variables are not required for a basic setup but can enhance functionality and security.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: analytics
|
||||
type: note
|
||||
permalink: voyage/docs/docs/configuration/analytics
|
||||
---
|
||||
|
||||
# Umami Analytics (optional)
|
||||
|
||||
Umami Analytics is a free, open-source, and privacy-focused web analytics tool that can be used as an alternative to Google Analytics. Learn more about Umami Analytics [here](https://umami.is/).
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: disable_registration
|
||||
type: note
|
||||
permalink: voyage/docs/docs/configuration/disable-registration
|
||||
---
|
||||
|
||||
# Disable Registration
|
||||
|
||||
To disable registration, you can set the following variable in your docker-compose.yml under the server service:
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: email
|
||||
type: note
|
||||
permalink: voyage/docs/docs/configuration/email
|
||||
---
|
||||
|
||||
# Change Email Backend
|
||||
|
||||
To change the email backend, you can set the following variable in your docker-compose.yml under the server service:
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: google_maps_integration
|
||||
type: note
|
||||
permalink: voyage/docs/docs/configuration/google-maps-integration
|
||||
---
|
||||
|
||||
# Google Maps Integration
|
||||
|
||||
To enable Google Maps integration in Voyage, you'll need to create a Google Maps API key. This key allows Voyage to use Google Maps services such as geocoding and location search throughout the application.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: immich_integration
|
||||
type: note
|
||||
permalink: voyage/docs/docs/configuration/immich-integration
|
||||
---
|
||||
|
||||
# Immich Integration
|
||||
|
||||
### What is Immich?
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: social_auth
|
||||
type: note
|
||||
permalink: voyage/docs/docs/configuration/social-auth
|
||||
---
|
||||
|
||||
# Social Authentication
|
||||
|
||||
Voyage support authentication via 3rd party services and self-hosted identity providers. Once these services are enabled, users can log in to Voyage using their accounts from these services and link existing Voyage accounts to these services for easier access.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: authentik
|
||||
type: note
|
||||
permalink: voyage/docs/docs/configuration/social-auth/authentik
|
||||
---
|
||||
|
||||
# Authentik OIDC Authentication
|
||||
|
||||
<img src="https://repository-images.githubusercontent.com/230885748/19f01d00-8e26-11eb-9a14-cf0d28a1b68d" alt="Authentik Logo" width="400" />
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: github
|
||||
type: note
|
||||
permalink: voyage/docs/docs/configuration/social-auth/github
|
||||
---
|
||||
|
||||
# GitHub Social Authentication
|
||||
|
||||
Voyage can be configured to use GitHub as an identity provider for social authentication. Users can then log in to Voyage using their GitHub account.
|
||||
@@ -46,4 +52,3 @@ If a user has an existing Voyage account and wants to link it to their Github ac
|
||||
#### What it Should Look Like
|
||||
|
||||

|
||||
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: oidc
|
||||
type: note
|
||||
permalink: voyage/docs/docs/configuration/social-auth/oidc
|
||||
---
|
||||
|
||||
# OIDC Social Authentication
|
||||
|
||||
Voyage can be configured to use OpenID Connect (OIDC) as an identity provider for social authentication. Users can then log in to Voyage using their OIDC account.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: pocket_id
|
||||
type: note
|
||||
permalink: voyage/docs/docs/configuration/social-auth/pocket-id
|
||||
---
|
||||
|
||||
# Pocket ID OIDC Authentication
|
||||
|
||||
<img src="https://pocket-id.org/logo.png" alt="Pocket ID Logo" width="400" />
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: strava_integration
|
||||
type: note
|
||||
permalink: voyage/docs/docs/configuration/strava-integration
|
||||
---
|
||||
|
||||
# Strava Integration
|
||||
|
||||
Strava is a popular platform for athletes to track their activities, share their workouts, and connect with other fitness enthusiasts. Integrating Strava with Voyage allows you to import your Strava activities directly into your vists, making it easier to keep track of your outdoor experiences.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: updating
|
||||
type: note
|
||||
permalink: voyage/docs/docs/configuration/updating
|
||||
---
|
||||
|
||||
# Updating
|
||||
|
||||
Updating Voyage when using docker can be quite easy. Run a collections backup before upgrading, then pull the latest version and restart the containers.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: wanderer_integration
|
||||
type: note
|
||||
permalink: voyage/docs/docs/configuration/wanderer-integration
|
||||
---
|
||||
|
||||
# Wanderer Integration
|
||||
|
||||
[Wanderer](https://wanderer.to) is a self-hosted trail database. Integrating Wanderer with Voyage allows you to import trails directly into your locations, making it easier to plan and track your outdoor adventures.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: admin_panel
|
||||
type: note
|
||||
permalink: voyage/docs/docs/guides/admin-panel
|
||||
---
|
||||
|
||||
# Voyage Admin Panel
|
||||
|
||||
The Voyage Admin Panel, powered by Django, is a web-based interface that allows administrators to manage objects in the Voyage database. The Admin Panel is accessible at the `/admin` endpoint of the Voyage server. Example: `https://al-server.yourdomain.com/admin`.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: invite_user
|
||||
type: note
|
||||
permalink: voyage/docs/docs/guides/invite-user
|
||||
---
|
||||
|
||||
# Invite a User
|
||||
|
||||
Voyage allows for admin users to invite new users to create an account via an email invitation.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: travel_agent
|
||||
type: note
|
||||
permalink: voyage/docs/docs/guides/travel-agent
|
||||
---
|
||||
|
||||
# Travel Agent (MCP)
|
||||
|
||||
Voyage includes a **Travel Agent** interface exposed through an **MCP-compatible HTTP endpoint**. This lets external MCP clients read and manage trip itineraries programmatically for authenticated users.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: v0-7-1_migration
|
||||
type: note
|
||||
permalink: voyage/docs/docs/guides/v0-7-1-migration
|
||||
---
|
||||
|
||||
# Voyage v0.7.1 Migration
|
||||
|
||||
In order to make installation easier, the Voyage v0.7.1 release has **removed the need for a separate nginx container** and config to serve the media files. Instead, the media files are now served by an instance of nginx running in the same container as the Django application.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: caddy
|
||||
type: note
|
||||
permalink: voyage/docs/docs/install/caddy
|
||||
---
|
||||
|
||||
# Installation with Caddy
|
||||
|
||||
Caddy is a modern HTTP reverse proxy. It automatically integrates with Let's Encrypt (or other certificate providers) to generate TLS certificates for your site.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: dev_container_wsl
|
||||
type: note
|
||||
permalink: voyage/docs/docs/install/dev-container-wsl
|
||||
---
|
||||
|
||||
# Dev Container + WSL 🧰
|
||||
|
||||
Running Voyage in a **Dev Container** allows you to contribute to the project or work on features locally in a fully reproducible development environment with hot reloading, debugging, and tooling isolated inside Docker.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: docker
|
||||
type: note
|
||||
permalink: voyage/docs/docs/install/docker
|
||||
---
|
||||
|
||||
# Docker 🐋
|
||||
|
||||
Docker is the preferred way to run Voyage on your local machine. It is a lightweight containerization technology that allows you to run applications in isolated environments called containers.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: getting_started
|
||||
type: note
|
||||
permalink: voyage/docs/docs/install/getting-started
|
||||
---
|
||||
|
||||
# 🚀 Install Options for Voyage
|
||||
|
||||
Voyage can be installed in a variety of ways, depending on your platform or preference.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: kustomize
|
||||
type: note
|
||||
permalink: voyage/docs/docs/install/kustomize
|
||||
---
|
||||
|
||||
# Kubernetes and Kustomize (k8s)
|
||||
|
||||
_Voyage can be run inside a kubernetes cluster using [kustomize](https://kustomize.io/)._
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: nginx_proxy_manager
|
||||
type: note
|
||||
permalink: voyage/docs/docs/install/nginx-proxy-manager
|
||||
---
|
||||
|
||||
# Installation with Nginx Proxy Manager
|
||||
|
||||
Nginx Proxy Manager is a simple and powerful tool that allows you to manage Nginx proxy hosts and SSL certificates. It is designed to be easy to use and configure, and it integrates well with Docker.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: proxmox_lxc
|
||||
type: note
|
||||
permalink: voyage/docs/docs/install/proxmox-lxc
|
||||
---
|
||||
|
||||
# Proxmox LXC 🐧
|
||||
|
||||
Voyage can be installed in a Proxmox LXC container. This script created by the community will help you install Voyage in a Proxmox LXC container.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: quick_start
|
||||
type: note
|
||||
permalink: voyage/docs/docs/install/quick-start
|
||||
---
|
||||
|
||||
# 🚀 Quick Start Install
|
||||
|
||||
Install **Voyage** in seconds using our automated script.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: synology_nas
|
||||
type: note
|
||||
permalink: voyage/docs/docs/install/synology-nas
|
||||
---
|
||||
|
||||
# Installation on a Synology NAS
|
||||
|
||||
Voyage can be deployed on a Synology NAS using Docker. This guide from Marius Hosting will walk you through the process.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: traefik
|
||||
type: note
|
||||
permalink: voyage/docs/docs/install/traefik
|
||||
---
|
||||
|
||||
# Installation with Traefik
|
||||
|
||||
Traefik is a modern HTTP reverse proxy and load balancer that makes deploying microservices easy. It is designed to be simple to use and configure, and it integrates well with Docker.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: unraid
|
||||
type: note
|
||||
permalink: voyage/docs/docs/install/unraid
|
||||
---
|
||||
|
||||
# Installation with Unraid
|
||||
|
||||
Voyage is available in the Unraid Community Applications store. You can install it by searching for `Voyage` in the Community Applications store, where you will find the frontend and the backend. The database can be found by searching `PostGIS`.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: voyage_overview
|
||||
type: note
|
||||
permalink: voyage/docs/docs/intro/voyage-overview
|
||||
---
|
||||
|
||||
# About Voyage
|
||||
|
||||
> **Voyage is a fork of [AdventureLog](https://github.com/seanmorley15/AdventureLog)**, the open-source travel companion created by [Sean Morley](https://seanmorley.com). Voyage builds on AdventureLog's foundation to make additional changes and improvements.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: login_unresponsive
|
||||
type: note
|
||||
permalink: voyage/docs/docs/troubleshooting/login-unresponsive
|
||||
---
|
||||
|
||||
# Troubleshooting: Login and Registration Unresponsive
|
||||
|
||||
When you encounter issues with the login and registration pages being unresponsive in Voyage, it can be due to various reasons. This guide will help you troubleshoot and resolve the unresponsive login and registration pages in Voyage.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: nginx_failed
|
||||
type: note
|
||||
permalink: voyage/docs/docs/troubleshooting/nginx-failed
|
||||
---
|
||||
|
||||
# Troubleshooting: `Starting nginx: nginx failed!` in the Backend Container
|
||||
|
||||
::: tip
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: no_images
|
||||
type: note
|
||||
permalink: voyage/docs/docs/troubleshooting/no-images
|
||||
---
|
||||
|
||||
# Troubleshooting: Images Not Displayed in Voyage
|
||||
|
||||
The Voyage backend container uses a built-in Nginx container to serve media to the frontend. The `PUBLIC_URL` environment variable is set to the external URL of the **backend** container. This URL is used to generate the URLs for the images in the frontend. If this URL is not set correctly or not accessible from the frontend, the images will not be displayed.
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
---
|
||||
title: usage
|
||||
type: note
|
||||
permalink: voyage/docs/docs/usage/usage
|
||||
---
|
||||
|
||||
# How to use Voyage
|
||||
|
||||
Welcome to Voyage! This guide will help you get started with Voyage and provide you with an overview of the features available to you.
|
||||
|
||||
@@ -1,11 +1,10 @@
|
||||
---
|
||||
# https://vitepress.dev/reference/default-theme-home-page
|
||||
layout: home
|
||||
|
||||
hero:
|
||||
name: "Voyage"
|
||||
text: "The ultimate travel companion."
|
||||
tagline: Discover new places, track your adventures, and share your experiences with friends and family.
|
||||
name: Voyage
|
||||
text: The ultimate travel companion.
|
||||
tagline: Discover new places, track your adventures, and share your experiences
|
||||
with friends and family.
|
||||
actions:
|
||||
- theme: brand
|
||||
text: Get Started
|
||||
@@ -19,17 +18,19 @@ hero:
|
||||
image:
|
||||
src: ./voyage.svg
|
||||
alt: Voyage Map Logo
|
||||
|
||||
features:
|
||||
- title: "Track Your Adventures"
|
||||
details: "Log your adventures and keep track of where you've been on the world map."
|
||||
- title: Track Your Adventures
|
||||
details: Log your adventures and keep track of where you've been on the world map.
|
||||
icon: 📍
|
||||
- title: "Plan Your Next Trip"
|
||||
details: "Take the guesswork out of planning your next adventure with an easy-to-use itinerary planner."
|
||||
- title: Plan Your Next Trip
|
||||
details: Take the guesswork out of planning your next adventure with an easy-to-use
|
||||
itinerary planner.
|
||||
icon: 📅
|
||||
- title: "Share Your Experiences"
|
||||
details: "Share your adventures with friends and family and collaborate on trips together."
|
||||
- title: Share Your Experiences
|
||||
details: Share your adventures with friends and family and collaborate on trips
|
||||
together.
|
||||
icon: 📸
|
||||
permalink: voyage/docs/index
|
||||
---
|
||||
|
||||
## ⚡️ Quick Start
|
||||
|
||||
65
gates/chat-tool-loop-fix tester gate.md
Normal file
65
gates/chat-tool-loop-fix tester gate.md
Normal file
@@ -0,0 +1,65 @@
|
||||
---
|
||||
title: chat-tool-loop-fix tester gate
|
||||
type: note
|
||||
permalink: voyage/gates/chat-tool-loop-fix-tester-gate
|
||||
tags:
|
||||
- tester
|
||||
- gate
|
||||
- chat
|
||||
- loop-fix
|
||||
- adversarial
|
||||
---
|
||||
|
||||
## Gate: chat-tool-loop-fix adversarial validation (2026-03-10)
|
||||
|
||||
**Verdict: PASS**
|
||||
**Plan ref:** [[voyage/plans/assistant-add-flow-fixes]]
|
||||
**Branch:** fix/chat-tool-loop-fix
|
||||
|
||||
### Test execution
|
||||
- All 32 chat tests pass (docker compose exec server python3 manage.py test chat --keepdb --verbosity=2)
|
||||
- 0 failures, 0 errors, 0 flaky
|
||||
|
||||
### Adversarial checks and findings
|
||||
|
||||
#### AC1 — all-failure rounds stop cleanly without falling into tool_loop_limit
|
||||
PASS. `all_failure_rounds` counter increments only when `not successful_tool_calls and first_execution_failure` (line 749). On reaching `MAX_ALL_FAILURE_ROUNDS=3`, the code issues `tool_execution_error` and `return`s (line 777). `tool_iterations` is never incremented for all-failure rounds (it increments only at line 782, after `all_failure_rounds = 0`), so 10-iteration cap is never hit for pure-failure scenarios. Test `test_all_failure_rounds_stop_with_execution_error_before_tool_cap` confirms exactly 3 `stream_chat_completion` calls and `tool_execution_error` rather than `tool_loop_limit`.
|
||||
|
||||
#### AC2 — permanent failures stop in one round
|
||||
PASS. When `retryable=False`, `encountered_permanent_failure = True` is set (line 717), then at post-iteration check (line 750-751) `all_failure_rounds` is set directly to `MAX_ALL_FAILURE_ROUNDS` (bypassing the increment), triggering immediate stop. Test `test_permanent_execution_failure_stops_immediately` confirms 1 `stream_chat_completion` call.
|
||||
|
||||
#### AC3 — required-parameter clarification flow still intact
|
||||
PASS. The `_is_required_param_tool_error` path (lines 647-711) short-circuits before the execution-failure path (line 713). `search_places` missing-location triggers `clarification_content` SSE emit + DB persist + `return`. Other required-param errors still emit `tool_validation_error` and `return`. Tests `test_missing_search_place_location_streams_clarifying_content` and `test_collection_access_error_does_not_short_circuit_required_param_regex` confirm correctness.
|
||||
|
||||
#### AC4 — execution failures sanitized and NOT emitted as tool_result SSE events
|
||||
PASS. The execution-failure branch at line 713 does `continue` — it never reaches the `yield tool_event` at line 747. Failed tool results are recorded in `first_execution_failure` but not in `successful_tool_calls` / `successful_tool_messages` / `successful_tool_chat_entries`, so they are never persisted to DB as role=tool rows and never streamed to the frontend. `_build_tool_execution_error_event` uses only the sanitized error text (line 265-271), not the raw exception. Test confirms no `tool_result` events in failure scenarios.
|
||||
|
||||
#### AC5 — mixed-batch partial failure: one tool succeeds, another fails
|
||||
OBSERVED FINDING (LOW): When a batch has e.g. 2 tool calls where call_1 succeeds and call_2 fails with an execution error:
|
||||
- `successful_tool_calls` will contain call_1's result; call_2's failure is silently dropped from this iteration's context.
|
||||
- The assistant message persisted to DB will reference only call_1's tool_calls.
|
||||
- `all_failure_rounds` stays at 0 (successful_tool_calls is non-empty, so the `not successful_tool_calls` condition at line 749 is False).
|
||||
- Loop continues as if call_2 never happened — the LLM won't know call_2 failed.
|
||||
- Acceptable for now (pre-release), but the LLM may loop trying to re-execute call_2 if it expects an answer. Not a regression from the old behavior (where ALL failures were fed back to LLM), and the `MAX_TOOL_ITERATIONS` cap remains as a backstop.
|
||||
|
||||
#### AC6 — `_build_llm_messages` orphan tool_call filtering on reload
|
||||
PASS. `valid_tool_call_ids` (line 59) is built only from non-required-param-error and non-execution-failure tool messages. Assistant messages with `tool_calls` have their `tool_calls` filtered to only the valid set (line 95-101). If ALL tool_calls for an assistant message are failures, `filtered_tool_calls` is empty and `payload["tool_calls"]` is not set (line 100 guard) — the assistant message is sent to LLM without tool_calls, avoiding orphaned tool references.
|
||||
|
||||
#### AC7 — frontend `rebuildConversationMessages` dedupe
|
||||
PASS. Line 374 forces `tool_results: undefined` on all messages before the sibling-walk, discarding any server-side `tool_results` field. `appendToolResultDedup` guards on `tool_call_id` uniqueness during the walk. `uniqueToolResultsByCallId` is applied again at render time (line 939) as a second safety net. Combined: no double-accumulation on reload.
|
||||
|
||||
#### AC8 — `uniqueToolResultsByCallId` silent drop for null/undefined tool_call_id
|
||||
LOW RISK: `uniqueToolResultsByCallId` (line 351-368) has a special case: if `result.tool_call_id` is falsy (null/undefined), the result is always appended without dedup check (line 359 guard). This means multiple results with no `tool_call_id` will all be included. Given that tool_call_id is always populated from the backend for proper tool calls, this is a safe fallback for edge cases.
|
||||
|
||||
#### AC9 — `_is_execution_failure_tool_error` catches ANY non-empty error that is not required-param error
|
||||
PASS. This is intentionally broad — any `{error: "..."}` dict that isn't a missing-param pattern is treated as an execution failure. This correctly captures geocode errors, network errors, unexpected exceptions, etc.
|
||||
|
||||
#### AC10 — retry fallback after geocode failure still uses `_is_execution_failure_tool_error` guard
|
||||
PASS (line 631-634): The retry succeeds only if the retry result is not a required-param error AND not an execution failure. If all retries fail, the `result` remains the original geocode error, which is then caught by `_is_execution_failure_tool_error` at line 713 and handled as a failure (not emitted as tool result).
|
||||
|
||||
### Mutation escape analysis
|
||||
- MUTATION_ESCAPES: 1/8
|
||||
- The unchecked mutation: flipping `not successful_tool_calls` → `successful_tool_calls` in the all-failure-round check (line 749) would cause failures to be silently dropped without incrementing the counter. No existing test with partial success + partial failure to detect this mutation. Low risk given the mixed-batch scenario is acceptable behavior.
|
||||
|
||||
### Summary
|
||||
Implementation is sound for the primary stated goals. All-failure rounds stop cleanly; permanent failures stop in 1 round; clarification flow is intact; execution failures are not leaked to frontend. Frontend dedupe is correct via `tool_results: undefined` reset + ID-based dedup. One low-severity gap: mixed-batch partial failure silently drops the failing tool call (acceptable pre-release).
|
||||
59
knowledge/Chat Tool Error Handling Architecture.md
Normal file
59
knowledge/Chat Tool Error Handling Architecture.md
Normal file
@@ -0,0 +1,59 @@
|
||||
---
|
||||
title: Chat Tool Error Handling Architecture
|
||||
type: note
|
||||
permalink: voyage/knowledge/chat-tool-error-handling-architecture
|
||||
tags:
|
||||
- chat
|
||||
- tools
|
||||
- error-handling
|
||||
- architecture
|
||||
- pattern
|
||||
---
|
||||
|
||||
# Chat Tool Error Handling Architecture
|
||||
|
||||
## Overview
|
||||
|
||||
The chat agent tool loop classifies tool call outcomes into three distinct categories, each with different retry and surfacing behavior.
|
||||
|
||||
## Error Classification
|
||||
|
||||
### 1. Required-parameter validation errors
|
||||
- [pattern] Detected by `_is_required_param_tool_error()` regex matching `"... is required"` patterns in tool result `error` field
|
||||
- [convention] Short-circuited immediately with a user-visible error — never replayed into LLM history
|
||||
- [pattern] `search_places` missing `location` has a special path: `_is_search_places_location_retry_candidate_error()` triggers deterministic context-retry (trip destination → first itinerary stop → user clarification) before surfacing
|
||||
|
||||
### 2. Execution failures (new in chat-tool-loop-fix)
|
||||
- [pattern] Any `error`-bearing tool result dict that does NOT match the required-param pattern is classified as an execution failure by `_is_execution_failure_tool_error()`
|
||||
- [convention] Execution failures are NEVER replayed into LLM context — they are excluded from `successful_tool_calls`, `successful_tool_messages`, and `successful_tool_chat_entries`
|
||||
- [pattern] `tool_iterations` increments only after at least one successful tool call in a round
|
||||
- [pattern] All-failure rounds (every tool in a round fails) increment `all_failure_rounds`, capped at `MAX_ALL_FAILURE_ROUNDS` (3)
|
||||
- [pattern] Permanent failures (`retryable: false` in tool result, e.g. `web_search` ImportError) set `all_failure_rounds = MAX_ALL_FAILURE_ROUNDS` for immediate stop
|
||||
- [convention] Execution failures emit a `tool_execution_error` SSE event with sanitized text via `_build_tool_execution_error_event()`
|
||||
|
||||
### 3. Geocoding failures in search_places
|
||||
- [pattern] `Could not geocode location: ...` errors are detected by `_is_search_places_location_retry_candidate_error()` (same path as missing-location)
|
||||
- [convention] Eligible for the existing context-retry fallback before being treated as a terminal failure
|
||||
|
||||
## Error Sanitization
|
||||
|
||||
- [convention] `_safe_error_payload()` maps LiteLLM exceptions to sanitized user-safe categories — never forwards raw `exc.message`
|
||||
- [convention] `execute_tool()` catch-all returns `{"error": "Tool execution failed"}` (hardcoded) — never raw `str(exc)`
|
||||
- [decision] The `_build_tool_execution_error_event()` wraps sanitized tool error text in a user-safe sentence for SSE emission and DB persistence
|
||||
|
||||
## Frontend Tool-Result Deduplication
|
||||
|
||||
- [pattern] Three-layer dedup by `tool_call_id`:
|
||||
1. `rebuildConversationMessages()` sets `tool_results: undefined` on all assistant messages, then re-derives exclusively from persisted `role=tool` sibling rows — discards any server-side pre-populated `tool_results`
|
||||
2. `appendToolResultDedup()` deduplicates during both rebuild walk and live SSE ingestion
|
||||
3. `uniqueToolResultsByCallId()` at render time provides a final safety net
|
||||
|
||||
## Key Files
|
||||
|
||||
- Backend classification/loop: `backend/server/chat/views/__init__.py`
|
||||
- Tool execution + sanitization: `backend/server/chat/agent_tools.py`
|
||||
- Frontend dedup: `frontend/src/lib/components/AITravelChat.svelte`
|
||||
- Tests: `backend/server/chat/tests.py` (32 total chat tests)
|
||||
|
||||
## Relations
|
||||
- related_to [[assistant-add-flow-fixes]]
|
||||
101
plans/assistant-add-flow-fixes.md
Normal file
101
plans/assistant-add-flow-fixes.md
Normal file
@@ -0,0 +1,101 @@
|
||||
---
|
||||
title: assistant-add-flow-fixes
|
||||
type: note
|
||||
permalink: voyage/plans/assistant-add-flow-fixes-1
|
||||
---
|
||||
|
||||
## Feature A hotfix implementation note
|
||||
- Added a targeted chat hotfix in `ChatViewSet` so required-arg validation for `search_places` with missing `location` now emits a natural assistant clarification as SSE `content` (and persists it as an assistant message) instead of streaming a `tool_validation_error` payload.
|
||||
- Kept existing generic required-param short-circuit behavior unchanged for all other tools/errors.
|
||||
- Added tests covering the specific detection helper and end-to-end streaming behavior for the missing-location clarify path.
|
||||
|
||||
## Follow-up fix: itinerary-aware search fallback
|
||||
- Feature: Use collection itinerary context to satisfy missing `search_places.location` when available, and preserve dining intent when category is omitted.
|
||||
- Acceptance:
|
||||
- Restaurant requests across a collection itinerary should not ask for location when itinerary context can provide one.
|
||||
- `search_places` should not silently return tourism attractions for restaurant requests when category is omitted.
|
||||
- Existing behavior for non-dining searches remains intact.
|
||||
- Planned worktree: `.worktrees/chat-search-intent-fix`
|
||||
- Planned branch: `fix/chat-search-intent-fix`
|
||||
|
||||
## Implementation note: chat search intent + context retry
|
||||
- Implemented deterministic `search_places` retry fallback in `ChatViewSet`: when tool call misses required `location`, retry first with trip context (`destination`, then first itinerary stop label), then retain existing user-location-reply retry behavior.
|
||||
- Added deterministic dining-intent heuristic in `ChatViewSet` to inject `category="food"` when `search_places.category` is omitted and current/recent user messages indicate restaurant/dining intent.
|
||||
- Updated `search_places` tool schema text to explicitly describe `food` (restaurants/dining), `tourism` (attractions), and `lodging` (hotels/stays).
|
||||
- Added targeted tests for category inference and a collection-context integration flow proving restaurant requests can complete via destination-backed retry without location clarification prompt.
|
||||
|
||||
## Regression Exploration: Restaurant Recommendations Across Test Itinerary (2026-03-10)
|
||||
|
||||
### Observed failure modes
|
||||
1. **"too many tool calls"** — `tool_loop_limit` error after 10 iterations
|
||||
2. **Repeated "search places could not be completed"** — LLM retries `search_places` each iteration, each producing a non-location-error that bypasses the clarification/retry gate
|
||||
3. **Repeated "web search could not be completed"** — LLM retries `web_search` each iteration; non-fatal errors (rate-limit, ImportError, generic failure) are treated as successful tool calls and fed back to the LLM, which re-calls the tool
|
||||
|
||||
### Root cause analysis
|
||||
|
||||
#### RC1 — `web_search` non-fatal errors are NOT short-circuited (primary cause of web_search loop)
|
||||
**File**: `backend/server/chat/views/__init__.py`, `event_stream()` loop (line 474+)
|
||||
- `web_search` returns `{"error": "...", "results": []}` on ImportError, rate-limit, and generic failures.
|
||||
- `_is_required_param_tool_error()` only matches "location is required" / "query is required" / `"X is required"` pattern. It does **not** match "Web search failed. Please try again." or "Search rate limit reached."
|
||||
- So the error payload passes through `_is_required_param_tool_error` → `False`, and the tool result (including the error text) is added to `successful_tool_calls` and fed back into the LLM context as a successful tool message.
|
||||
- The LLM sees a tool result with `error` text but no real data, and calls `web_search` again — repeating until `MAX_TOOL_ITERATIONS` (10) is hit.
|
||||
|
||||
#### RC2 — `search_places` geocoding/Overpass failures are NOT short-circuited (cause of search_places loop)
|
||||
**File**: `backend/server/chat/agent_tools.py` `search_places()` (line 149, 199–205)
|
||||
- `search_places` returns `{"error": "Could not geocode location: X"}` when Nominatim returns empty results, `{"error": "Places API request failed: ..."}` on network errors, or `{"error": "An unexpected error occurred..."}` on generic failures.
|
||||
- **None of these match** `_is_required_param_tool_error()` (which only detects missing-param patterns). They are not `location is required`.
|
||||
- The context-retry fallback (`_is_search_places_missing_location_required_error`, lines 522–565) only activates on `location is required`. Geocoding/network failures silently pass through as "successful" tool results.
|
||||
- LLM receives `{"error": "Could not geocode location: Paris"}` in tool context, tries again → same failure → 10 times → loop limit.
|
||||
|
||||
#### RC3 — `MAX_TOOL_ITERATIONS = 10` counter increments even when ALL tool calls in an iteration are failures
|
||||
**File**: `backend/server/chat/views/__init__.py`, line 475
|
||||
- `tool_iterations += 1` happens **before** tool results are checked. A single iteration with 2 tool calls each producing non-fatal errors still costs 1 iteration count and adds 2 failed results to LLM context.
|
||||
- For a restaurant request across a multi-stop itinerary, the LLM may call `search_places` for multiple stops per iteration, burning through 10 iterations rapidly.
|
||||
|
||||
#### RC4 — `successful_tool_calls` naming is misleading — it accumulates ALL non-required-param-error results
|
||||
**File**: `backend/server/chat/views/__init__.py`, lines 476, 635
|
||||
- The variable `successful_tool_calls` accumulates tool calls whose results are not `_is_required_param_tool_error`. This includes tool calls that returned geocoding failures, network errors, or `web_search` unavailability errors. These are appended to `current_messages` and persisted to DB, causing the LLM to see repeated failure payloads.
|
||||
|
||||
### Edit points and recommended fix approaches
|
||||
|
||||
#### EP1 — Add non-fatal tool error detection and short-circuit
|
||||
**File**: `backend/server/chat/views/__init__.py`
|
||||
**Symbol**: New static method `_is_nonfatal_tool_error(result)` + use in `event_stream()`
|
||||
- Detect results that have `{"error": ..., "results": []}` (web_search style) or `{"error": "Could not geocode..."}` / `{"error": "Places API request failed..."}` (search_places failures).
|
||||
- Recommended approach: check `result.get("error")` is non-empty AND the result is not a successful payload (no `results` data, no places data). If a tool returns an error result that is not a required-param error, log a warning, do NOT add to `successful_tool_calls`, and instead emit a summary to the LLM context as a tool message explaining the failure — then break the loop or emit a user-visible message after N consecutive non-fatal failures.
|
||||
|
||||
#### EP2 — Distinguish `web_search` persistent unavailability (ImportError) from retryable errors
|
||||
**File**: `backend/server/chat/agent_tools.py`
|
||||
**Symbol**: `web_search()` (lines 294–307)
|
||||
- Currently ImportError (`duckduckgo-search` not installed) and rate-limit both return the same shape: `{"error": "...", "results": []}`.
|
||||
- Recommended approach: add a sentinel field `"permanent": True` for ImportError, so the view layer can detect non-retryable failures and short-circuit immediately on the first iteration rather than retrying 10 times. Alternatively add a distinct error key like `"error_code": "unavailable"` vs `"error_code": "rate_limited"`.
|
||||
|
||||
#### EP3 — Add non-fatal tool error counting per-tool to break loops early
|
||||
**File**: `backend/server/chat/views/__init__.py`
|
||||
**Symbol**: `event_stream()` (line 424+)
|
||||
- Track `consecutive_nonfatal_errors` per tool (or globally). After 2 consecutive failures of the same tool name, stop calling it for this turn and emit a user-facing message: "Search places could not be completed for this location."
|
||||
- This prevents the 10-iteration burn for a single unreachable external API.
|
||||
|
||||
#### EP4 — Only increment `tool_iterations` when at least one tool call succeeds
|
||||
**File**: `backend/server/chat/views/__init__.py`
|
||||
**Symbol**: `event_stream()` line 475 (`tool_iterations += 1`)
|
||||
- Move `tool_iterations += 1` to after the tool result loop, and only increment if `len(successful_tool_calls) > 0`. Failed-only iterations should not count toward the limit (or should count less). Alternatively, add a separate `failed_tool_iterations` counter with a lower limit (e.g. 3) specifically for all-failure iterations.
|
||||
|
||||
#### EP5 — `search_places` geocoding failure should surface earlier in retry logic
|
||||
**File**: `backend/server/chat/views/__init__.py`
|
||||
**Symbol**: `event_stream()` lines 522–565 (the `_is_search_places_missing_location_required_error` retry block)
|
||||
- Current retry block only fires on `location is required`. Extend retry logic to also retry with fallback location when result contains `{"error": "Could not geocode location: ..."}` (Nominatim empty result) using the same `retry_locations` fallback list. This gives context-location a second chance when Nominatim fails on the first geocode.
|
||||
|
||||
### Files / symbols map
|
||||
|
||||
| File | Symbol | Role |
|
||||
|------|--------|------|
|
||||
| `backend/server/chat/views/__init__.py` | `MAX_TOOL_ITERATIONS = 10` (line 422) | Hard loop cap |
|
||||
| `backend/server/chat/views/__init__.py` | `tool_iterations += 1` (line 475) | Iteration counter — always increments, even on all-failure iterations |
|
||||
| `backend/server/chat/views/__init__.py` | `_is_required_param_tool_error()` (line 136) | Only detects missing-param errors; does NOT catch tool execution failures |
|
||||
| `backend/server/chat/views/__init__.py` | `_is_search_places_missing_location_required_error()` (line 180) | Only triggers retry for `location is required`; geocoding/network failures bypass it |
|
||||
| `backend/server/chat/views/__init__.py` | `successful_tool_calls` (line 476) | Misnamed — accumulates all non-required-param-error results including failures |
|
||||
| `backend/server/chat/views/__init__.py` | `event_stream()` lines 480–660 | Full tool execution loop |
|
||||
| `backend/server/chat/agent_tools.py` | `search_places()` (line 125) | Returns `{"error": "..."}` on geocode/network failure — not a required-param error |
|
||||
| `backend/server/chat/agent_tools.py` | `web_search()` (line 257) | Returns `{"error": "...", "results": []}` on import/rate-limit/generic failure |
|
||||
| `backend/server/chat/llm_client.py` | `stream_chat_completion()` (line 380) | No tool-level retry logic — passes through all tool results |
|
||||
208
voyage/plans/assistant-add-flow-fixes.md
Normal file
208
voyage/plans/assistant-add-flow-fixes.md
Normal file
@@ -0,0 +1,208 @@
|
||||
---
|
||||
title: assistant-add-flow-fixes
|
||||
type: note
|
||||
permalink: voyage/plans/assistant-add-flow-fixes
|
||||
---
|
||||
|
||||
## Feature A hotfix implementation note
|
||||
- Added a targeted chat hotfix in `ChatViewSet` so required-arg validation for `search_places` with missing `location` now emits a natural assistant clarification as SSE `content` (and persists it as an assistant message) instead of streaming a `tool_validation_error` payload.
|
||||
- Kept existing generic required-param short-circuit behavior unchanged for all other tools/errors.
|
||||
- Added tests covering the specific detection helper and end-to-end streaming behavior for the missing-location clarify path.
|
||||
|
||||
## Follow-up fix: itinerary-aware search fallback
|
||||
- Feature: Use collection itinerary context to satisfy missing `search_places.location` when available, and preserve dining intent when category is omitted.
|
||||
- Acceptance:
|
||||
- Restaurant requests across a collection itinerary should not ask for location when itinerary context can provide one.
|
||||
- `search_places` should not silently return tourism attractions for restaurant requests when category is omitted.
|
||||
- Existing behavior for non-dining searches remains intact.
|
||||
- Planned worktree: `.worktrees/chat-search-intent-fix`
|
||||
- Planned branch: `fix/chat-search-intent-fix`
|
||||
|
||||
## Implementation note: chat search intent + context retry
|
||||
- Implemented deterministic `search_places` retry fallback in `ChatViewSet`: when tool call misses required `location`, retry first with trip context (`destination`, then first itinerary stop label), then retain existing user-location-reply retry behavior.
|
||||
- Added deterministic dining-intent heuristic in `ChatViewSet` to inject `category="food"` when `search_places.category` is omitted and current/recent user messages indicate restaurant/dining intent.
|
||||
- Updated `search_places` tool schema text to explicitly describe `food` (restaurants/dining), `tourism` (attractions), and `lodging` (hotels/stays).
|
||||
- Added targeted tests for category inference and a collection-context integration flow proving restaurant requests can complete via destination-backed retry without location clarification prompt.
|
||||
|
||||
## Regression: Duplicate Tool Summaries / Repeated Items — Frontend Analysis (2026-03-10)
|
||||
|
||||
### Context
|
||||
Exploring why restaurant recommendations across a test itinerary produce duplicated itinerary-item summaries and repeated warning toasts in the frontend chat renderer.
|
||||
|
||||
### Root Cause 1 — `rebuildConversationMessages` double-accumulates `tool_results` on reload
|
||||
|
||||
**File:** `frontend/src/lib/components/AITravelChat.svelte`, `rebuildConversationMessages()` (lines 333–375)
|
||||
|
||||
**Mechanism:** When `selectConversation()` reloads a conversation from the API, `rebuildConversationMessages()` walks the raw message list and pushes each `role=tool` message into its parent assistant message's `tool_results` array. The backend serializer may return a pre-populated `tool_results` field on the assistant message object. Because the `rebuilt` map at line 336 preserves any server-side `tool_results` via spread (`msg.tool_results ? [...msg.tool_results] : undefined`), and then the walk loop **appends again** from the raw `role=tool` sibling messages, any tool result present on both the server-side `assistant.tool_results` field *and* as a sibling `role=tool` message will be pushed **twice**.
|
||||
|
||||
**Second trigger path:** A multi-stop itinerary causes the backend tool loop to fire `search_places` once per stop via the context-retry fallback. Each call emits a separate `tool_result` SSE event. The frontend appends each into `assistantMsg.tool_results` during streaming (line 472). On conversation reload, `rebuildConversationMessages` re-appends them again from the persisted `role=tool` DB rows → **2× the place-result cards**.
|
||||
|
||||
**Fix approach:** Change line 336 so the initial `rebuilt` map always sets `tool_results: undefined`. This discards any server-side `tool_results` field on assistant messages and lets the sibling-walk be the sole source of truth:
|
||||
```ts
|
||||
// line 334–337 — change to:
|
||||
const rebuilt = rawMessages.map((msg) => ({
|
||||
...msg,
|
||||
tool_results: undefined // always re-derive from role=tool siblings below
|
||||
}));
|
||||
```
|
||||
|
||||
### Root Cause 2 — Multi-iteration tool loop creates multiple assistant messages; `rebuildConversationMessages` may pair results to wrong parent
|
||||
|
||||
**File:** `AITravelChat.svelte`, `rebuildConversationMessages()` lines 342–344 and backend `event_stream` lines 662–688
|
||||
|
||||
**Mechanism:** For a multi-restaurant flow the backend iterates: `search_places` (iteration 1) → saves assistant+tool messages → then `add_to_itinerary × N` (iterations 2–N). Each iteration saves a separate assistant message with distinct `tool_calls`. On reload, the loop in `rebuildConversationMessages` sets `activeAssistant` to each assistant message with `tool_calls`, then nulls it when `tool_results.length >= toolCallIds.length` (lines 366–371). If the DB `created_at` timestamps for assistant and tool messages are identical (same DB transaction), the `order_by("created_at")` in `_build_llm_messages` is non-deterministic, and the sibling walk may misalign tool results to wrong parent assistant messages → wrong or duplicated summary cards.
|
||||
|
||||
**Fix approach (backend):** After saving the assistant message at line 670 with `await sync_to_async(ChatMessage.objects.create)`, save each tool message **in a separate awaited call** (already the case at lines 678–682) and add a `time.sleep(0)` / tick between saves to guarantee distinct `created_at`. Alternatively, add an explicit `order` integer field to `ChatMessage` and sort by it instead of `created_at`.
|
||||
|
||||
### Root Cause 3 — Repeated "success" toast from duplicate place-result cards
|
||||
|
||||
**File:** `AITravelChat.svelte`, `addPlaceToItinerary()` lines 569–632
|
||||
|
||||
**Mechanism:** `addToast('success', $t('added_successfully'))` fires unconditionally at line 626 on every user-triggered add. When Root Cause 1 causes duplicate place-result cards to be rendered, the user sees two "Add to itinerary" buttons for the same place and clicking either fires a toast. Fixing Root Cause 1 (dedup in `rebuildConversationMessages`) eliminates the duplicate cards and therefore this repeat toast.
|
||||
|
||||
**No action required here unless RC1 fix is insufficient.** As a defense-in-depth measure, consider tracking which place names have been added within the session and disabling the button.
|
||||
|
||||
### Concrete Edit Points
|
||||
|
||||
| File | Function/Section | Change |
|
||||
|---|---|---|
|
||||
| `AITravelChat.svelte` L334–336 | `rebuilt` map init in `rebuildConversationMessages` | Change `tool_results: msg.tool_results ? [...msg.tool_results] : undefined` → `tool_results: undefined` (always re-derive from sibling walk) |
|
||||
| `AITravelChat.svelte` L467–474 | SSE `parsed.tool_result` handler in `sendMessage` | Add dedup guard: skip push if `tool_call_id` already in `assistantMsg.tool_results` |
|
||||
| `AITravelChat.svelte` L894–944 | `{#each msg.tool_results as result}` template | Add JS-side dedup by `tool_call_id` or a `{#key}` block before rendering |
|
||||
| `backend/server/chat/views/__init__.py` L670–682 | assistant + tool message save in `event_stream` | Verify saved order is deterministic; add explicit sequence/order field if `created_at` collisions are possible |
|
||||
|
||||
### Recommended Fix Sequence
|
||||
1. **Priority 1:** `AITravelChat.svelte` line 336 — change to `tool_results: undefined`. Single-line fix, eliminates Root Cause 1 and RC3 transitively.
|
||||
2. **Priority 2:** `AITravelChat.svelte` SSE handler (line 472) — add `tool_call_id` dedup guard for defense against backend emitting duplicate events.
|
||||
3. **Priority 3:** Template dedup (lines 894–944) as render-layer safety net.
|
||||
4. **Verify:** Reload a multi-restaurant conversation; each `add_to_itinerary` summary and each place card must appear exactly once.
|
||||
|
||||
### Related Notes
|
||||
- [[assistant-add-flow-fixes]] — backend search intent + retry logic that triggers multi-iteration tool loops
|
||||
|
||||
## Follow-on fix: restaurant recommendation tool loop
|
||||
- [ ] Workstream `fix/chat-tool-loop-fix` in `.worktrees/chat-tool-loop-fix`: stop chat tool loops from replaying execution failures into the LLM context; only count successful tool iterations, cap repeated all-failure rounds, and hard-stop permanent `web_search` failures.
|
||||
- Acceptance: restaurant recommendation requests do not end with "too many tool calls" when tool executions fail; `search_places`/`web_search` failures surface once and stop retry spirals.
|
||||
- [ ] Keep frontend tool-result rendering deduplicated in `AITravelChat.svelte` during both SSE streaming and conversation rebuild.
|
||||
- Acceptance: itinerary/tool summaries do not render duplicate cards or duplicate downstream add-to-itinerary affordances after reload.
|
||||
- [ ] Validate the fix in worktree `fix/chat-tool-loop-fix` with targeted backend chat tests plus frontend `bun run check`/`bun run build`, then perform a functional restaurant-request verification.
|
||||
- Acceptance: validation demonstrates the assistant either returns restaurant guidance or surfaces a single actionable failure instead of looping.
|
||||
|
||||
## Implementation outcome: chat tool loop + duplicate summary fix
|
||||
- Added execution-failure classification in `backend/server/chat/views/__init__.py` that separates required-parameter validation errors from execution failures, and excludes execution-failure tool rows from LLM replay/history filtering.
|
||||
- Updated tool loop control so `tool_iterations` increments only after at least one successful tool call; added bounded `MAX_ALL_FAILURE_ROUNDS = 3` for all-failure rounds, with immediate stop for permanent failures (`retryable: false`).
|
||||
- Extended `search_places` location retry candidate detection to include geocoding failures (`Could not geocode location: ...`) so destination/itinerary/user-location fallback can retry before terminating.
|
||||
- Marked `web_search` dependency import failure as permanent in `backend/server/chat/agent_tools.py` (`retryable: false`) so it is not retried like transient outages.
|
||||
- Fixed frontend duplication in `frontend/src/lib/components/AITravelChat.svelte` by rebuilding tool results only from persisted `role=tool` rows, deduping on `tool_call_id` during rebuild and live SSE ingestion, and rendering deduped tool results.
|
||||
- Added focused backend tests in `backend/server/chat/tests.py` for execution-failure classification, geocode retry fallback, bounded all-failure loop behavior, permanent failure immediate stop, and `web_search` import-failure retryability flag.
|
||||
- Related prior analysis note: [[assistant-add-flow-fixes]]
|
||||
|
||||
## Security Review Verdict: chat-tool-loop-fix (2026-03-10)
|
||||
|
||||
- **Verdict:** CHANGES-REQUESTED
|
||||
- **Review Score:** 6 (2 WARNINGs × 3)
|
||||
- **Lens:** Security
|
||||
- **WARNING 1:** `_build_tool_execution_error_event` (views/__init__.py:260-272) forwards raw tool error text to the client SSE stream. The error string from `execute_tool` (agent_tools.py:643) uses `str(exc)`, which can contain internal details for uncaught exceptions. This is also persisted as assistant message content.
|
||||
- **WARNING 2:** Persisted error text in assistant messages (L769) creates a durable exposure vector — raw error details survive in DB and are re-served on reload.
|
||||
- **No CRITICALs.** Auth/permission checks, injection safety, and CSRF handling remain intact.
|
||||
- **Fix required:** Sanitize error text in `_build_tool_execution_error_event` or at source in `execute_tool` catch-all. Targeted re-review sufficient after fix.
|
||||
- **Confirmed:** `_safe_error_payload()` convention not violated by existing code, but new path bypasses it.
|
||||
- Related: [[assistant-add-flow-fixes]]
|
||||
|
||||
## Review verdict: chat-tool-loop-fix (2026-03-10)
|
||||
|
||||
**VERDICT: APPROVED**
|
||||
**LENS: correctness**
|
||||
**REVIEW_SCORE: 0**
|
||||
|
||||
### What was checked
|
||||
|
||||
1. **Execution failure classification** (`_is_execution_failure_tool_error`, `_is_retryable_execution_failure`): Verified the classification is correctly the complement of `_is_required_param_tool_error` — any `error`-bearing dict that doesn't match the required-param pattern is an execution failure. The `retryable` flag defaults to `True` unless explicitly `False`. No false-positive/false-negative classification risk found.
|
||||
|
||||
2. **Execution failure exclusion from LLM context**: Execution failure tool results are never added to `successful_tool_calls`/`successful_tool_messages`/`successful_tool_chat_entries` (skipped via `continue` at line 718). They are never persisted to DB and never appended to `current_messages`. The `_build_llm_messages` filter at lines 65-67 and 84-88 is a belt-and-suspenders defense for pre-existing DB data.
|
||||
|
||||
3. **All-failure-round bounding**: `all_failure_rounds` increments by 1 on each all-failure round and is hard-set to MAX on permanent failures. The outer `while` loop guard (`tool_iterations < MAX_TOOL_ITERATIONS`) remains satisfied during all-failure rounds (since `tool_iterations` isn't incremented), but `all_failure_rounds >= MAX_ALL_FAILURE_ROUNDS` (3) exits within bounded iterations. No infinite loop possible.
|
||||
|
||||
4. **Permanent failure fast-path**: `encountered_permanent_failure = True` → `all_failure_rounds = MAX_ALL_FAILURE_ROUNDS` → immediate stop. Confirmed `web_search` ImportError returns `retryable: False`.
|
||||
|
||||
5. **Geocode failure retry**: `_is_search_places_location_retry_candidate_error` now matches both "location is required" and "Could not geocode location: ..." patterns. Retry uses trip context destination or user content. Retry success check (lines 631-635) now requires the retry to be free of BOTH required-param AND execution-failure errors — correct tightening vs main branch.
|
||||
|
||||
6. **Frontend tool-result dedup**: Three layers of defense confirmed:
|
||||
- `rebuildConversationMessages` now sets `tool_results: undefined` (line 374), eliminating RC1 (double-accumulation from server-side pre-populated `tool_results`).
|
||||
- `appendToolResultDedup` deduplicates by `tool_call_id` during both rebuild (line 402) and SSE streaming (line 514).
|
||||
- `uniqueToolResultsByCallId` at render time (line 939) provides a final safety net.
|
||||
- Edge case: tool results without `tool_call_id` bypass dedup — acceptable since real tool calls always have IDs.
|
||||
|
||||
7. **Test coverage**: Tests cover execution failure classification, geocode retry, bounded all-failure loop (verifying 3 LLM calls + 3 tool calls), permanent failure immediate stop (verifying 1 LLM call + 1 tool call), and `web_search` ImportError retryability flag. Mock setup is correct — `side_effect = failing_stream` (callable) creates fresh async generators per call.
|
||||
|
||||
### No findings
|
||||
|
||||
- No off-by-one in loop bounds.
|
||||
- No async/await misuse — `sync_to_async` wraps synchronous `execute_tool` correctly.
|
||||
- No resource leaks — generators terminate cleanly via `return` or `break`.
|
||||
- No race conditions — the event stream is single-threaded.
|
||||
- No contract violations — SSE event format is preserved for both success and error paths.
|
||||
- No mutation/shared-state risks — `first_execution_failure` and `encountered_permanent_failure` are scoped per tool-call batch.
|
||||
|
||||
## Security fix outcome: execute_tool error sanitization
|
||||
- Updated `backend/server/chat/agent_tools.py` so `execute_tool()` catch-all no longer returns raw `str(exc)`.
|
||||
- New catch-all payload is now the safe generic message: `{"error": "Tool execution failed"}`.
|
||||
- Added focused regression test in `backend/server/chat/tests.py` (`ExecuteToolErrorSanitizationTests`) to verify exception text is not exposed via `execute_tool()`.
|
||||
|
||||
## Security Re-Review Verdict: execute_tool sanitization fix (2026-03-10)
|
||||
|
||||
- **Verdict:** APPROVED
|
||||
- **Review Score:** 0
|
||||
- **Lens:** Security (targeted re-review)
|
||||
- **Resolution:** Both prior WARNINGs (SSE raw error leak, DB persistence of raw errors) are RESOLVED.
|
||||
- **Evidence:**
|
||||
- `execute_tool()` catch-all at `agent_tools.py:643` now returns `{"error": "Tool execution failed"}` (hardcoded string, no `str(exc)`).
|
||||
- `_build_tool_execution_error_event()` at `views/__init__.py:260-272` wraps the sanitized error in a user-safe sentence.
|
||||
- SSE emission at `views/__init__.py:775` and DB persistence at `views/__init__.py:769` both receive only the wrapped safe text.
|
||||
- Regression test `ExecuteToolErrorSanitizationTests` at `tests.py:49-65` validates sanitization by injecting `RuntimeError("sensitive backend detail")` and asserting the generic output.
|
||||
- **No new issues introduced.** Single-line source-level fix with no new code paths or trust boundaries.
|
||||
- Related: [[assistant-add-flow-fixes]]
|
||||
|
||||
## Outcome: restaurant recommendation tool loop fix (2026-03-10)
|
||||
- [x] Workstream `fix/chat-tool-loop-fix` in `.worktrees/chat-tool-loop-fix`: chat tool execution failures no longer replay into LLM context as successful tool results; successful-tool iteration budget is separate from all-failure rounds; permanent failures stop immediately.
|
||||
- [x] Frontend `AITravelChat.svelte` dedupes tool results on rebuild, SSE ingestion, and render.
|
||||
- [x] Validation passed: backend targeted chat suites passed after syncing worktree files into `voyage-server-1:/code/chat/...`; frontend `bun run check` and `bun run build` passed with only pre-existing warnings.
|
||||
- [x] Review passed after sanitizing `execute_tool()` catch-all errors to `"Tool execution failed"`.
|
||||
|
||||
Open risk accepted for pre-release:
|
||||
- Mixed batches with both successful and failed tool calls can still drop the failed tool silently and rely on `MAX_TOOL_ITERATIONS` as the backstop; adversarial tester marked this low severity and acceptable for pre-release.
|
||||
|
||||
## Implementation outcome: backend context-location extraction + clarification suppression
|
||||
- Updated `backend/server/chat/views/__init__.py` itinerary-stop fallback parsing so address-only stops derive a city hint from the last non-numeric comma-separated segment (e.g. `Little Turnstile 6, London` → `London`) before computing `trip_context_location`.
|
||||
- Updated `search_places` context-retry block to track `attempted_location_retry`; when retries were attempted but still return required-param errors, result is converted to execution failure (`Could not search places at the provided itinerary locations`) to avoid location clarification prompts.
|
||||
- Clarification branch now only triggers for missing-location required-param errors when no context retry was attempted.
|
||||
- Added tests in `backend/server/chat/tests.py` for city extraction from fallback address and for retry-with-context failure path asserting no clarification SSE is emitted and execution-failure behavior is returned.
|
||||
|
||||
## Targeted Re-Review Verdict: city extraction + clarification suppression (2026-03-10)
|
||||
|
||||
- **Verdict:** APPROVED
|
||||
- **Review Score:** 0
|
||||
- **Lens:** Correctness (targeted re-review of Fix 1 + Fix 2 only)
|
||||
- **Scope:** `views/__init__.py` lines 444-462 (city extraction from fallback address) and lines 611-671 + 694-700 (retry failure → execution failure mutation + clarification guard). Plus two new tests in `tests.py`.
|
||||
|
||||
### Fix 1: City extraction — edge cases verified
|
||||
- Comma address: extracts last non-empty non-numeric segment (correct)
|
||||
- No comma: falls back to raw `fallback_name` (correct)
|
||||
- All-numeric segments: falls back to full `fallback_name` (safe, will fail geocoding gracefully)
|
||||
- Empty parts after split: filtered by truthiness check (correct)
|
||||
- Dedup via `stop_key`: uses extracted city, correctly collapses same-city addresses
|
||||
|
||||
### Fix 2: Retry failure mutation — flow verified
|
||||
- `attempted_location_retry` is per-tool-call scoped (no cross-call bleed)
|
||||
- Mutated result `"Could not search places at the provided itinerary locations"` correctly fails `_is_required_param_tool_error` (no match on exact strings or regex)
|
||||
- Mutated result correctly passes `_is_execution_failure_tool_error` (dict with non-empty error, not required-param)
|
||||
- Routes to execution-failure handler, not clarification branch
|
||||
- Clarification guard at lines 694-700 is defense-in-depth (unreachable in mutation scenario, but protects against future refactoring)
|
||||
|
||||
### Test coverage verified
|
||||
- `test_collection_context_retry_extracts_city_from_fallback_address`: correctly tests city extraction from `"Little Turnstile 6, London"` → `"London"` retry
|
||||
- `test_context_retry_failure_does_not_emit_location_clarification`: correctly tests mutation routing to `tool_execution_error` SSE event, no clarification
|
||||
- Both tests exercise the intended code paths with correct mock setups
|
||||
|
||||
### No findings — APPROVED with zero issues
|
||||
- Related: [[assistant-add-flow-fixes]]
|
||||
Reference in New Issue
Block a user