voyage/.memory/research/auto-learn-preference-signals.md

---
title: auto-learn-preference-signals
type: note
permalink: voyage/research/auto-learn-preference-signals
---

# Research: Auto-Learn User Preference Signals

## Purpose
Map all existing user data that could be aggregated into an automatic preference profile, without requiring manual input.

## Signal Inventory

### 1. Location.category (FK → Category)
- **Model**: `adventures/models.py:Category` — per-user custom categories (name, display_name, icon)
- **Signal**: Top categories by count → dominant interest type (e.g. "hiking", "dining", "cultural")
- **Query**: `Location.objects.filter(user=user).values('category__name').annotate(cnt=Count('id')).order_by('-cnt')`
- **Strength**: HIGH — user-created categories are deliberate choices
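
As a rough sketch of how the count rows from the query above could be reduced to dominant interests, the helper below keeps only categories whose share of the user's categorized locations clears a threshold. The name `dominant_categories` and the 20% default are illustrative assumptions, not existing code.

```python
def dominant_categories(rows, min_share=0.2):
    """Return [(name, share)] for categories at or above min_share, largest first.

    rows: dicts shaped like the .values('category__name').annotate(cnt=...)
    output. Rows without a category name are skipped.
    """
    named = [r for r in rows if r.get("category__name")]
    total = sum(r["cnt"] for r in named)
    if total == 0:
        return []
    shares = [(r["category__name"], r["cnt"] / total) for r in named]
    shares.sort(key=lambda pair: pair[1], reverse=True)
    return [(name, share) for name, share in shares if share >= min_share]
```

Using a share rather than a raw count keeps the signal comparable between users with ten locations and users with a thousand.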

### 2. Location.tags (ArrayField)
- **Model**: `adventures/models.py:Location.tags` — `ArrayField(CharField(max_length=100))`
- **Signal**: Most frequent tags across all user locations → interest keywords
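- **Query**: `Location.objects.filter(user=user).values_list('tags', flat=True).distinct()` (used in `tags_view.py`); this yields distinct tag arrays, one per location, so ranking tags by frequency requires dropping `.distinct()` and flattening and counting the arrays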
- **Strength**: MEDIUM-HIGH — tags are free-text user input
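
A minimal sketch of the flatten-and-count step, assuming the input is the per-location tag arrays from `values_list('tags', flat=True)` fetched without `.distinct()` so repeat counts survive. The helper name `tag_frequencies` and the lowercase normalization are assumptions made for illustration.

```python
from collections import Counter


def tag_frequencies(tag_lists, top_n=10):
    """Count normalized tags across all locations and return the top_n."""
    counts = Counter()
    for tags in tag_lists:
        # A location may have no tags at all (None or an empty list).
        for tag in tags or []:
            normalized = tag.strip().lower()
            if normalized:
                counts[normalized] += 1
    return counts.most_common(top_n)
```

Because tags are free text, some normalization (here only case and whitespace) is likely needed before counts are meaningful.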

### 3. Location.rating (FloatField)
- **Model**: `adventures/models.py:Location.rating`
- **Signal**: Average rating and the set of high-rated locations → positive sentiment for place types; restricting to locations that are both visited and high-rated → strongest preferences
- **Query**: `Location.objects.filter(user=user).aggregate(avg_rating=Avg('rating'))` or breakdown by category
- **Strength**: HIGH for positive signals (rating ≥ 4.0); weak when ratings are rarely filled in
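
The per-category breakdown could look roughly like the sketch below, which works on plain rows such as `.values('category__name', 'rating')` might return. The 4.0 threshold comes from the strength note above; the helper name and the `min_rated` guard against single-rating categories are assumptions.

```python
from collections import defaultdict


def high_rated_categories(rows, threshold=4.0, min_rated=2):
    """Return {category: average rating} for categories averaging >= threshold."""
    ratings = defaultdict(list)
    for row in rows:
        # Unrated or uncategorized locations carry no signal here.
        if row["rating"] is not None and row["category__name"]:
            ratings[row["category__name"]].append(row["rating"])
    result = {}
    for name, values in ratings.items():
        if len(values) < min_rated:
            continue  # too few ratings to trust the average
        average = sum(values) / len(values)
        if average >= threshold:
            result[name] = round(average, 2)
    return result
```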

### 4. Location.description / Visit.notes (TextField)
- **Model**: `adventures/models.py:Location.description`, `Visit.notes`
- **Signal**: Free-text content for NLP keyword extraction (budget, adventure, luxury, cuisine words)
- **Query**: `Location.objects.filter(user=user).values_list('description', flat=True)`
- **Strength**: LOW (requires NLP to extract structured signals; many fields blank)

### 5. Lodging.type (LODGING_TYPES enum)
- **Model**: `adventures/models.py:Lodging.type` — choices: hotel, hostel, resort, bnb, campground, cabin, apartment, house, villa, motel
- **Signal**: Most frequently used lodging type → travel style indicator (e.g. "hostel" → budget; "resort/villa" → luxury; "campground/cabin" → outdoor)
- **Query**: `Lodging.objects.filter(user=user).values('type').annotate(cnt=Count('id')).order_by('-cnt')`
- **Strength**: HIGH — directly maps to trip_style field
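
One possible way to turn the lodging counts into a `trip_style` suggestion is sketched below. The mapping covers only the examples named in the signal above (hostel, resort/villa, campground/cabin); how the remaining lodging types should map is left open, and the function name is hypothetical.

```python
from collections import Counter

# Only the pairings stated in this note; other lodging types are left unmapped.
LODGING_STYLE = {
    "hostel": "budget",
    "resort": "luxury",
    "villa": "luxury",
    "campground": "outdoor",
    "cabin": "outdoor",
}


def infer_trip_style(rows):
    """Pick the style with the most stays, or None if nothing maps.

    rows: dicts shaped like the .values('type').annotate(cnt=...) output.
    """
    totals = Counter()
    for row in rows:
        style = LODGING_STYLE.get(row["type"])
        if style:
            totals[style] += row["cnt"]
    if not totals:
        return None
    return totals.most_common(1)[0][0]
```

Returning `None` for users who only book unmapped types avoids inventing a style from neutral data.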

### 6. Lodging.rating (FloatField)
- **Signal**: Combined with lodging type, identifies preferred accommodation standards
- **Strength**: MEDIUM

### 7. Transportation.type (TRANSPORTATION_TYPES enum)
- **Model**: `adventures/models.py:Transportation.type` — choices: car, plane, train, bus, boat, bike, walking
- **Signal**: Primary transport mode → mobility preference (e.g. mostly walking/bike → slow travel; lots of planes → frequent flyer)
- **Query**: `Transportation.objects.filter(user=user).values('type').annotate(cnt=Count('id')).order_by('-cnt')`
- **Strength**: MEDIUM

### 8. Activity.sport_type (SPORT_TYPE_CHOICES)
- **Model**: `adventures/models.py:Activity.sport_type` — 60+ choices mapped to 10 SPORT_CATEGORIES in `utils/sports_types.py`
- **Signal**: Activity categories user is active in → physical/adventure interests
- **Categories**: running, walking_hiking, cycling, water_sports, winter_sports, fitness_gym, racket_sports, climbing_adventure, team_sports
- **Query**: Already aggregated in `stats_view.py:_get_activity_stats_by_category()` — uses `Activity.objects.filter(user=user).values('sport_type').annotate(count=Count('id'))`
- **Strength**: HIGH — objective behavioral data from Strava/Wanderer imports

### 9. VisitedRegion / VisitedCity (worldtravel)
- **Model**: `worldtravel/models.py` — `VisitedRegion(user, region)` and `VisitedCity(user, city)` with country/subregion
- **Signal**: Countries/regions visited → geographic preferences (beach vs. mountain vs. city; EU vs. Asia etc.)
- **Query**: `VisitedRegion.objects.filter(user=user).select_related('region__country')` → country distribution
- **Strength**: MEDIUM-HIGH — "where has this user historically traveled?" informs destination type

### 10. Collection metadata
- **Model**: `adventures/models.py:Collection` — name, description, start/end dates
- **Signal**: Collection names/descriptions may contain destination/theme hints; trip duration (end_date − start_date) → travel pace; trip frequency (count, spacing) → travel cadence
- **Query**: `Collection.objects.filter(user=user).values('name', 'description', 'start_date', 'end_date')`
- **Strength**: LOW-MEDIUM (descriptions often blank; names are free-text)
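
The duration and cadence signals can be sketched as below, assuming rows with `start_date` and `end_date` as `datetime.date` values or `None`. The helper name `trip_pace` and the choice of medians over means are assumptions for illustration.

```python
from datetime import date
from statistics import median


def trip_pace(trips):
    """Summarize trip length and spacing from dated collections.

    Undated collections are ignored; returns None if none are dated.
    """
    dated = sorted(
        (t["start_date"], t["end_date"])
        for t in trips
        if t["start_date"] and t["end_date"]
    )
    if not dated:
        return None
    # Duration follows the note's definition: end_date minus start_date.
    durations = [(end - start).days for start, end in dated]
    # Gap between the end of one trip and the start of the next.
    gaps = [(dated[i + 1][0] - dated[i][1]).days for i in range(len(dated) - 1)]
    return {
        "trip_count": len(dated),
        "median_days": median(durations),
        "median_gap_days": median(gaps) if gaps else None,
    }


# Illustrative data only.
example = trip_pace([
    {"start_date": date(2024, 1, 1), "end_date": date(2024, 1, 8)},
    {"start_date": date(2024, 1, 18), "end_date": date(2024, 1, 21)},
    {"start_date": None, "end_date": None},
])
```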

### 11. Location.price / Lodging.price (MoneyField)
- **Signal**: Average spend across locations/lodging → budget tier
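- **Query**: `Location.objects.filter(user=user).aggregate(avg_price=Avg('price'))` (requires the djmoney amount field; an average is only meaningful within one currency, so group by `price_currency` if a user's records mix currencies)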
- **Strength**: MEDIUM — but many records may have no price set
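
A small sketch of a currency-aware average, assuming rows shaped like `.values('price', 'price_currency')` where `price` may be `None`. The helper name is hypothetical, and converting between currencies is deliberately out of scope.

```python
from collections import defaultdict
from decimal import Decimal


def average_price_by_currency(rows):
    """Return {currency: average price}, skipping records with no price."""
    totals = defaultdict(lambda: [Decimal("0"), 0])  # currency -> [sum, count]
    for row in rows:
        if row["price"] is None:
            continue
        bucket = totals[row["price_currency"]]
        bucket[0] += Decimal(str(row["price"]))
        bucket[1] += 1
    return {currency: total / count for currency, (total, count) in totals.items()}
```

Keeping currencies separate avoids averaging, say, yen and euros into one number before any budget tier is assigned.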

### 12. Location geographic clustering (lat/lon)
- **Signal**: Country/region distribution of visited locations → geographic affinity
- **Already tracked**: `Location.country`, `Location.region`, `Location.city` (FK, auto-geocoded)
- **Query**: `Location.objects.filter(user=user).values('country__name').annotate(cnt=Count('id')).order_by('-cnt')`
- **Strength**: HIGH

### 13. UserAchievement types
- **Model**: `achievements/models.py:UserAchievement` — types: `adventure_count`, `country_count`
- **Signal**: Milestone count → engagement level (casual vs. power user); high `country_count` → variety-seeker
- **Strength**: LOW-MEDIUM (only 2 types currently)

### 14. ChatMessage content (user role)
- **Model**: `chat/models.py:ChatMessage` — `role`, `content`
- **Signal**: User messages in travel conversations → intent signals ("I love hiking", "looking for cheap food", "family-friendly")
- **Query**: `ChatMessage.objects.filter(conversation__user=user, role='user').values_list('content', flat=True)`
- **Strength**: MEDIUM — requires NLP; could be rich but noisy
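
Before reaching for a full NLP pipeline, a keyword pass over user messages might be enough to test whether this signal is useful. The sketch below is one such baseline; the `INTENT_KEYWORDS` lists and intent names are invented for illustration and would need tuning against real conversations.

```python
import re
from collections import Counter

# Hypothetical keyword lists, seeded from the example phrases in this note.
INTENT_KEYWORDS = {
    "outdoor": {"hiking", "camping", "trail"},
    "budget": {"cheap", "budget", "affordable"},
    "family": {"family-friendly", "kids"},
}


def count_intents(messages):
    """Count how many messages mention at least one keyword per intent."""
    counts = Counter()
    for text in messages:
        # Lowercase word tokens; hyphens are kept so "family-friendly" matches.
        words = set(re.findall(r"[a-z][a-z\-]*", text.lower()))
        for intent, keywords in INTENT_KEYWORDS.items():
            if words & keywords:
                counts[intent] += 1
    return counts
```

Counting messages rather than keyword hits keeps one long, repetitive message from dominating the profile.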

## Aggregation Patterns Already in Codebase

| Pattern | Location | Reusability |
|---|---|---|
| Activity stats by category | `stats_view.py:_get_activity_stats_by_category()` | Direct reuse |
| All-tags union | `tags_view.py:ActivityTypesView.types()` | Direct reuse |
| VisitedRegion/City counts | `stats_view.py:counts()` | Direct reuse |
| Multi-user preference merge | `llm_client.py:get_aggregated_preferences()` | Partial reuse |
| Category-filtered location count | `serializers.py:location_count` | Pattern reference |
| Location queryset scoping | `location_view.py:get_queryset()` | Standard pattern |

## Proposed Auto-Profile Fields from Signals

| Target Field | Primary Signals | Secondary Signals |
|---|---|---|
| `cuisines` | Location.tags (cuisine words), Location.category (dining) | Location.description NLP |
| `interests` | Activity.sport_type categories, Location.category top-N | Location.tags frequency, VisitedRegion types |
| `trip_style` | Lodging.type top (luxury/budget/outdoor), Transportation.type, Activity sport categories | Location.rating Avg, price signals |
| `notes` | (not auto-derived — keep manual only) | — |

## Where to Implement

**New function target**: `integrations/views/recommendation_profile_view.py` or a new `integrations/utils/auto_profile.py`

**Suggested function signature**:
```python
def build_auto_preference_profile(user) -> dict:
    """
    Returns {cuisines, interests, trip_style} inferred from user's travel history.
    Fields are non-destructive suggestions, not overrides of manual input.
    """
```
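
The non-destructive rule in the docstring could be enforced with a merge step like the sketch below, which only fills fields the user has left empty. It assumes profiles can be represented as plain dicts; the helper name `merge_suggestions` is hypothetical, and `notes` is excluded because it stays manual.

```python
def merge_suggestions(manual, suggested, fields=("cuisines", "interests", "trip_style")):
    """Return a copy of manual with empty fields filled from suggested.

    Manually entered values always win; fields outside `fields` are untouched.
    """
    merged = dict(manual)
    for field in fields:
        if not merged.get(field) and suggested.get(field):
            merged[field] = suggested[field]
    return merged
```

Working on a copy leaves the stored profile unchanged until the caller decides to save the merged result.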

**New API endpoint target**: `POST /api/integrations/recommendation-preferences/auto-learn/`
**ViewSet action**: `@action(detail=False, methods=['post'], url_path='auto-learn')` on `UserRecommendationPreferenceProfileViewSet`

## Integration Point
`get_system_prompt()` in `chat/llm_client.py` already consumes `UserRecommendationPreferenceProfile`, so auto-learned values flow directly into the AI context with no changes needed there.

See also:
- [knowledge.md: User Recommendation Preference Profile](../knowledge.md#user-recommendation-preference-profile)
- [plans/ai-travel-agent-redesign.md: WS2](../plans/ai-travel-agent-redesign.md#ws2-user-preference-learning)