
## Memory System (Dual: Global + Per-Project)

Memory is split into two complementary systems:

1. **Global Memory (basic-memory)** — cross-project knowledge via MCP server. Stores reusable patterns, conventions, tech knowledge, domain concepts, and lessons learned. Lives in `~/basic-memory/`, accessed via MCP tools.
2. **Project Memory (`.memory/`)** — per-project state committed to git. Stores plans, gates, sessions, project-specific decisions, research, and architecture. Lives in the project's `.memory/` directory.

### Global Memory (basic-memory)

[basic-memory](https://github.com/basicmachines-co/basic-memory) is an MCP server that provides persistent knowledge through structured markdown files indexed in SQLite with semantic search.

**What goes in global memory:**
- Reusable coding patterns (error handling, testing, logging)
- Technology knowledge (how libraries/frameworks/tools work)
- Convention preferences (coding style decisions that span projects)
- Domain concepts that apply across projects
- Cross-project lessons learned and retrospectives
- SME guidance that isn't project-specific

**MCP tools (available to all agents):**
- `write_note(title, content, folder, tags)` — create/update a knowledge note
- `read_note(identifier)` — read a specific note by title or permalink
- `search_notes(query)` — semantic + full-text search across all notes
- `build_context(url, depth)` — follow knowledge graph relations for deep context
- `recent_activity(type)` — find recently added/updated notes

**Note format:**
```markdown
---
title: Go Error Handling Patterns
permalink: go-error-handling-patterns
tags:
- go
- patterns
- error-handling
---
# Go Error Handling Patterns

## Observations
- [pattern] Use sentinel errors for expected error conditions #go
- [convention] Wrap errors with fmt.Errorf("context: %w", err) #go

## Relations
- related_to [[Go Testing Patterns]]
```

**Usage rules:**
- At session start, use `search_notes` or `build_context` to find relevant global knowledge before starting work.
- After completing work with reusable lessons, use `write_note` to record them globally.
- Use WikiLinks `[[Topic]]` to create relations between notes.
- Use tags for categorization: `#pattern`, `#convention`, `#sme`, `#lesson`, etc.
- Use observation categories: `[pattern]`, `[convention]`, `[decision]`, `[lesson]`, `[risk]`, `[tool]`.
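
The note layout and tagging rules above can be sketched as a small renderer. This is an illustration of the file layout only: `render_note` is a hypothetical helper, not part of basic-memory, and the permalink derivation (lowercased title with hyphens) is an assumption inferred from the example note. How much of this basic-memory generates on its own when `write_note` is called is not covered here.

```python
def render_note(title, tags, observations, relations):
    """Render a note in the frontmatter + Observations + Relations layout.

    observations: list of (category, text) pairs, e.g. ("pattern", "Use sentinel errors #go")
    relations: list of (relation_type, target_title) pairs, e.g. ("related_to", "Go Testing Patterns")
    """
    # Assumption: the permalink is the lowercased title with whitespace turned into hyphens.
    permalink = "-".join(title.lower().split())
    lines = ["---", f"title: {title}", f"permalink: {permalink}", "tags:"]
    lines += [f"- {tag}" for tag in tags]
    lines += ["---", f"# {title}", "", "## Observations"]
    # Each observation carries its category in square brackets.
    lines += [f"- [{category}] {text}" for category, text in observations]
    lines += ["", "## Relations"]
    # Relations point at other notes through WikiLinks.
    lines += [f"- {relation} [[{target}]]" for relation, target in relations]
    return "\n".join(lines) + "\n"
```

Calling it with the title `Go Error Handling Patterns`, the three tags, the two observations, and the single `related_to` relation from the example yields the same layout as the note shown under "Note format".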

### Project Memory (`.memory/`)

Per-project state, committed to git. This is the source of truth for active project work.

**Directory structure:**

```text
.memory/
├── manifest.yaml           # Index: all files with descriptions + groups
├── system.md               # One-paragraph project overview
│
├── knowledge/              # Project-specific knowledge
│   ├── overview.md         # THIS project's architecture
│   └── tech-stack.md       # THIS project's technologies
│
├── decisions.md            # Project-specific Architecture Decision Records
│
├── plans/                  # Active plans (one per feature)
│   └── <feature>.md
│
├── research/               # Project-specific research findings
│   └── <topic>.md
│
├── gates/                  # Quality gate records
│   └── <feature>.md        # Review + test outcomes
│
└── sessions/               # Session continuity
    └── continuity.md       # Rolling session notes
```

**Workflow: global context → project context → work → update both**

1. **Session start:** Query basic-memory (`search_notes`) for relevant global context, then read `.memory/manifest.yaml` and relevant project files.
2. **Before each task:** Check global memory for reusable patterns/guidance, then read relevant `.memory/` files for project state.
3. **After each task:** Update `.memory/` for project state. If the work produced reusable knowledge, also `write_note` to basic-memory.
4. **Quality gates:** Record reviewer/tester outcomes in `.memory/gates/<feature>.md` (project-specific).

**Manifest schema:**

```yaml
name: <project-name>
version: 1
categories:
  - path: <relative-path>
    description: <one-line description>
    group: <knowledge|decisions|plans|research|gates|sessions>
```

**Recording discipline:** Only record outcomes, decisions, and discoveries — never phase transitions, status changes, or ceremony checkpoints. If an entry would only say "we started phase X", don't add it. Memory files preserve *knowledge*, not *activity logs*.

**Read discipline:**
- Read `manifest.yaml` first to discover what's available
- Read only the `.memory/` files relevant to the current task
- **Skip redundant reads** when you have already confirmed this session that `.memory/` has no relevant content in that domain
- **Do not immediately re-read content you just wrote**
- Treat `.memory/` as a **tool**, not a ritual

**Linking is required.** When recording related knowledge across files, add markdown cross-references (for example: `See [Decision: Auth](decisions.md#auth-approach)`). Cross-reference global memory notes with `memory://permalink` URLs when relevant.

**Manifest maintenance:** When creating new `.memory/` files, add entries to `manifest.yaml` with path, description, and group. The librarian agent verifies manifest accuracy.
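
A manifest accuracy check along these lines might look like the sketch below. It is not the librarian agent's actual implementation. `check_manifest` is a hypothetical helper that takes an already-parsed manifest (a plain dict), and it assumes each `path` is relative to the `.memory/` directory and that only `*.md` files need entries.

```python
from pathlib import Path

# Group names taken from the manifest schema above.
VALID_GROUPS = {"knowledge", "decisions", "plans", "research", "gates", "sessions"}

def check_manifest(memory_dir, manifest):
    """Return a list of problems; an empty list means the manifest looks consistent."""
    root = Path(memory_dir)
    problems = []
    listed = set()
    for entry in manifest.get("categories", []):
        path = entry.get("path", "")
        listed.add(path)
        if entry.get("group") not in VALID_GROUPS:
            problems.append(f"{path}: unknown group {entry.get('group')!r}")
        if not entry.get("description"):
            problems.append(f"{path}: missing description")
        # Assumption: paths are relative to the .memory/ directory.
        if not (root / path).is_file():
            problems.append(f"{path}: listed but not found on disk")
    # Assumption: every markdown file under .memory/ should have an entry.
    for file in sorted(root.rglob("*.md")):
        rel = file.relative_to(root).as_posix()
        if rel not in listed:
            problems.append(f"{rel}: on disk but missing from manifest")
    return problems
```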

### When to Use Which

| Knowledge type | Where to store | Why |
|---|---|---|
| Reusable pattern/convention | basic-memory (`write_note`) | Benefits all projects |
| SME guidance (general) | basic-memory (`write_note`) | Reusable across consultations |
| Project architecture | `.memory/knowledge/` | Specific to this project |
| Active plans & gates | `.memory/plans/`, `.memory/gates/` | Project lifecycle state |
| Session continuity | `.memory/sessions/` | Project-scoped session tracking |
| Project decisions (ADRs) | `.memory/decisions.md` | Specific to this project |
| Project research | `.memory/research/` | Tied to project context |
| Tech knowledge (general) | basic-memory (`write_note`) | Reusable reference |
| Lessons learned | basic-memory (`write_note`) | Cross-project value |

## Cross-Tool Instruction Files

Use symlinks to share this instruction file across all agentic coding tools:

```text
project/
├── AGENTS.md                          # Real file (edit this one)
├── CLAUDE.md -> AGENTS.md
├── .cursorrules -> AGENTS.md
└── .github/
    └── copilot-instructions.md -> ../AGENTS.md
```

**Rules:**
- Edit `AGENTS.md` — changes propagate automatically via symlinks
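- Never edit through the symlinked names (some editors and tools replace a symlink with a regular file on save, which silently breaks the link and lets the copies drift apart)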
- Symlinks are committed to git (git tracks them natively)

**Content of this file:**
- Project overview and purpose
- Tech stack and architecture
- Coding conventions and patterns
- Build/test/lint commands
- Project structure overview

**Do NOT duplicate `.memory/` contents** — instruction files describe how to work with the project, not active plans, research, or decisions.

**When initializing a project:**
1. Create `AGENTS.md` with project basics
2. Create symlinks: `ln -s AGENTS.md CLAUDE.md`, `ln -s AGENTS.md .cursorrules`, and (after `mkdir -p .github`, since `ln` does not create parent directories) `ln -s ../AGENTS.md .github/copilot-instructions.md`
3. Commit the real file and symlinks to git

**When joining an existing project:**
- Read `AGENTS.md` (or any of the symlinked files) to understand the project
- If instruction file is missing, create it and the symlinks
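
The symlink setup can also be scripted so that it is safe to re-run on an existing project. The sketch below is one possible approach, with `ensure_instruction_links` as a hypothetical helper name. It assumes a platform where unprivileged symlinks work (Linux or macOS), and it deliberately skips any path that already exists rather than overwriting it.

```python
import os
from pathlib import Path

# Link name -> target, taken from the layout above. Each target is relative
# to the directory containing the link, which is why the Copilot link
# points at "../AGENTS.md".
LINKS = {
    "CLAUDE.md": "AGENTS.md",
    ".cursorrules": "AGENTS.md",
    ".github/copilot-instructions.md": "../AGENTS.md",
}

def ensure_instruction_links(project_dir):
    """Create any missing symlinks to AGENTS.md; return the link names created."""
    root = Path(project_dir)
    if not (root / "AGENTS.md").is_file():
        raise FileNotFoundError("AGENTS.md must exist before linking to it")
    created = []
    for name, target in LINKS.items():
        link = root / name
        if link.is_symlink() or link.exists():
            continue  # Leave existing files and links untouched.
        link.parent.mkdir(parents=True, exist_ok=True)
        os.symlink(target, link)
        created.append(name)
    return created
```

Running it twice in a row creates the three links the first time and nothing the second time.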

## Session Continuity

- Treat `.memory/` files as the persistent project tracking system for work across sessions.
- At session start, query basic-memory (`search_notes`) for relevant cross-project knowledge, then identify prior in-progress work items and pending decisions in `.memory/` before doing new implementation.
- After implementation, update `.memory/` files with what changed, why it changed, and what remains next.
- If the work produced reusable knowledge (patterns, conventions, lessons learned), also record it in basic-memory (`write_note`) for cross-project benefit.

## Clarification Rule

- If requirements are genuinely unclear, materially ambiguous, or have multiple valid interpretations that would lead to **materially different implementations**, use the `question` tool to clarify before committing to an implementation path.
- **Do not ask for clarification when the user's intent is obvious.** If the user explicitly states what they want (e.g., "update X and also update Y"), do not ask "should I do both?" — proceed with the stated request.
- Implementation-level decisions (naming, file organization, approach) are the agent's job, not the user's. Only escalate decisions that affect **user-visible behavior or scope**.

## Agent Roster

| Agent | Role | Model |
|---|---|---|
| `lead` | Primary orchestrator that decomposes work, delegates, and synthesizes outcomes. | `github-copilot/claude-opus-4` (global default) |
| `coder` | Implementation-focused coding agent for reliable code changes. | `github-copilot/gpt-5.3-codex` |
| `reviewer` | Read-only code/source review; writes `.memory/*` for verdict records. | `github-copilot/claude-opus-4.6` |
| `tester` | Validation agent for standard + adversarial testing; writes `.memory/*` for test outcomes. | `github-copilot/claude-sonnet-4.6` |
| `explorer` | Fast read-only codebase mapper; writes `.memory/*` for discovery records. | `github-copilot/claude-sonnet-4.6` |
| `researcher` | Deep technical investigator; writes `.memory/*` for research findings. | `github-copilot/claude-opus-4.6` |
| `librarian` | Documentation coverage and accuracy specialist. | `github-copilot/claude-opus-4.6` |
| `critic` | Pre-implementation gate and blocker sounding board; writes `.memory/*` for verdicts. | `github-copilot/claude-opus-4.6` |
| `sme` | Subject-matter expert for domain-specific consultation; writes `.memory/*` for guidance cache. | `github-copilot/claude-opus-4.6` |
| `designer` | UI/UX specialist for interaction and visual guidance; writes `.memory/*` for design decisions. | `github-copilot/claude-sonnet-4.6` |

`lead`, `coder`, and `librarian` have full edit access (`coder` uses it for implementation). Every other agent is read-only for code and source, with `permission.edit: allow` scoped to `.memory/*` so that it can write its own records.

## Parallelization

- **Always parallelize independent work.** Any tool calls that do not depend on each other's output must be issued in the same message as parallel calls — never sequentially. This applies to bash commands, file reads, and subagent delegations alike.
- Before issuing a sequence of calls, ask: *"Does call B require the result of call A?"* If not, send them together.
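
As an analogy for why this matters, the sketch below compares issuing three independent calls one after another with issuing them together. It is only an illustration: an agent parallelizes by putting the tool calls in the same message, not by managing threads, and `fake_tool_call` is a hypothetical stand-in that just sleeps.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_tool_call(name, seconds):
    """Stand-in for an independent tool call (file read, bash command, delegation)."""
    time.sleep(seconds)
    return f"{name} done"

def run_sequential(calls):
    """Run each call only after the previous one has finished."""
    return [fake_tool_call(name, seconds) for name, seconds in calls]

def run_parallel(calls):
    """Issue every call at once; results come back in the order the calls were given."""
    with ThreadPoolExecutor(max_workers=max(1, len(calls))) as pool:
        futures = [pool.submit(fake_tool_call, name, seconds) for name, seconds in calls]
        return [future.result() for future in futures]
```

With three calls of 0.2 seconds each, `run_sequential` takes roughly the sum of the delays while `run_parallel` takes roughly the longest single delay, and both return the same results.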

## Human Checkpoint Triggers

When implementing features, the Lead must stop and request explicit user approval before dispatching coder work in these situations:

1. **Security-sensitive design**: Any feature involving encryption, auth flows, secret storage, token management, or permission model changes.
2. **Architectural ambiguity**: Multiple valid approaches with materially different tradeoffs that aren't resolvable from codebase conventions alone.
3. **Vision-dependent features**: Features where the user's intended UX or behavior model isn't fully specified by the request.
4. **New external dependencies**: Adding a service, SDK, or infrastructure component not already in the project.
5. **Data model changes with migration impact**: Schema changes affecting existing production data.

The checkpoint must present the specific decision, 2-3 concrete options with tradeoffs, a recommendation, and a safe default. Implementation-level decisions (naming, file organization, code patterns) are NOT checkpoints — only user-visible behavior and architectural choices qualify.
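
The required shape of a checkpoint can be sketched as a small data structure with a completeness check. The class and field names are hypothetical. The rule that the recommendation and the safe default must each name one of the listed options is an assumption made for this sketch, not something stated above.

```python
from dataclasses import dataclass

@dataclass
class Option:
    name: str
    tradeoffs: str

@dataclass
class Checkpoint:
    decision: str
    options: list
    recommendation: str
    safe_default: str

    def problems(self):
        """Return what is missing; an empty list means the checkpoint is ready to present."""
        found = []
        if not self.decision.strip():
            found.append("decision is empty")
        if not 2 <= len(self.options) <= 3:
            found.append("need 2-3 options")
        if any(not option.tradeoffs.strip() for option in self.options):
            found.append("every option needs tradeoffs")
        names = {option.name for option in self.options}
        # Assumption: recommendation and safe default refer to listed options.
        if self.recommendation not in names:
            found.append("recommendation must be one of the options")
        if self.safe_default not in names:
            found.append("safe default must be one of the options")
        return found
```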

## Functional Verification (Implement → Verify → Iterate)

**Static analysis is not verification.** Type checks (`bun run check`, `tsc`), linters (`eslint`, `ruff`), and framework system checks (`python manage.py check`) confirm code is syntactically and structurally valid. They do NOT confirm the feature works. A feature that type-checks perfectly can be completely non-functional.

**Every implemented feature MUST be functionally verified before being marked complete.** "Functionally verified" means demonstrating that the feature actually works end-to-end — not just that it compiles.

### What Counts as Functional Verification

Functional verification must exercise the **actual behavior path** a user would trigger:

- **API endpoints**: Make real HTTP requests (`curl`, `httpie`, or the app's test client) and verify response status, shape, and data correctness. Check both success and error paths.
- **Frontend components**: Verify the component renders, interacts correctly, and communicates with the backend. Use the browser (Playwright) or run the app's frontend test suite.
- **Database/model changes**: Verify migrations run, data can be created/read/updated/deleted through the ORM or API, and constraints are enforced.
- **Integration points**: When a feature spans frontend ↔ backend, verify the full round-trip: UI action → API call → database → response → UI update.
- **Configuration/settings**: Verify the setting is actually read and affects behavior — not just that the config key exists.
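
For the API endpoint case, a functional check that makes real HTTP requests and inspects both the success and the error path could look like the sketch below. The endpoint, its paths, and its payloads are hypothetical: `DemoHandler` is a throwaway local server standing in for the application under test so that the example is self-contained.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical endpoint: GET /items/1 returns an item, anything else is a 404."""

    def do_GET(self):
        if self.path == "/items/1":
            status, payload = 200, {"id": 1, "name": "widget"}
        else:
            status, payload = 404, {"error": "not found"}
        body = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # Keep the demo quiet.

# Bypass any proxy settings from the environment for local requests.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))

def fetch(url):
    """Make a real HTTP request and return (status, parsed JSON body)."""
    try:
        with OPENER.open(url, timeout=5) as response:
            return response.status, json.load(response)
    except urllib.error.HTTPError as error:
        return error.code, json.load(error)

def verify_endpoint():
    """Exercise the success path and the error path against a live server."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # Port 0 picks a free port.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = f"http://127.0.0.1:{server.server_port}"
    try:
        ok = fetch(f"{base}/items/1")
        missing = fetch(f"{base}/items/999")
    finally:
        server.shutdown()
        server.server_close()
    return ok, missing
```

The point of the sketch is that the check asserts on status, response shape, and data for both paths, which a type check or a successful build cannot do.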

### What Does NOT Count as Functional Verification

These are useful but insufficient on their own:

- ❌ `bun run check` / `tsc --noEmit` (type checking)
- ❌ `bun run lint` / `eslint` / `ruff` (linting)
- ❌ `python manage.py check` (Django system checks)
- ❌ `bun run build` succeeding (build pipeline)
- ❌ Reading the code and concluding "this looks correct"
- ❌ Verifying file existence or import structure

### The Iterate-Until-Working Cycle

When functional verification reveals a problem:

1. **Diagnose** the root cause (not just the symptom).
2. **Fix** via coder dispatch with the specific failure context.
3. **Re-verify** the same functional test that failed.
4. **Repeat** until the feature demonstrably works.

A feature is "done" when it passes functional verification, not when the coder returns without errors. The lead agent must never mark a task complete based solely on a clean coder return — the verification step is mandatory.
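
The cycle above can be sketched as a loop around two callbacks. `verify` and `fix` are hypothetical placeholders for the functional test and the coder dispatch. The `max_rounds` cap is an assumption added so the sketch cannot loop forever; the text above says only to repeat until the feature works.

```python
def iterate_until_working(verify, fix, max_rounds=5):
    """Re-run the same check after every fix until it passes.

    verify() returns None on success or a description of the failure.
    fix(failure) receives that description as the failure context.
    Returns the number of fix rounds that were needed.
    """
    rounds = 0
    failure = verify()
    while failure is not None:
        if rounds == max_rounds:
            # Assumption: stop and escalate instead of retrying indefinitely.
            raise RuntimeError(f"still failing after {max_rounds} fix rounds: {failure}")
        fix(failure)        # Dispatch the fix with the specific failure context.
        rounds += 1
        failure = verify()  # Re-verify with the same functional test.
    return rounds
```

A clean return from `fix` never ends the loop on its own; only a passing `verify` does.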

### Verification Scope by Change Type

| Change type | Minimum verification |
|---|---|
| New API endpoint | HTTP request with expected response verified |
| New UI feature | Browser-based or test-suite verification of render + interaction |
| Full-stack feature | End-to-end: UI → API → DB → response → UI update |
| Data model change | Migration runs + CRUD operations verified through API or ORM |
| Bug fix | Reproduce the bug scenario, verify it no longer occurs |
| Config/settings | Verify the setting changes observable behavior |
| Refactor (no behavior change) | Existing tests pass + spot-check one behavior path |

## Mandatory Quality Pipeline

**The reviewer and tester agents exist to be used, not to decorate the roster.** Every non-trivial feature must go through the quality pipeline. Skipping reviewers or testers to "save time" creates broken features that cost far more time to debug later.

### Minimum Quality Requirements

- **Every feature gets a reviewer pass.** No exceptions for "simple" features: in a past session, even apparently simple features (like provider selection) had critical bugs that a reviewer would have caught.
- **Every feature with user-facing behavior gets a tester pass.** The tester agent must be dispatched for any feature that a user would interact with. The tester validates functional behavior, not just code structure.
- **Features cannot be batch-validated.** Each feature gets its own review → test cycle. "I'll review all 6 workstreams at the end" is not acceptable — bugs compound and become harder to diagnose.

### The Lead Must Not Skip the Pipeline Under Time Pressure

Even when there are many features to implement, the quality pipeline is non-negotiable. It is better to ship 3 working features than 6 broken ones. If scope must be reduced to maintain quality, reduce scope — do not reduce quality.

## Requirement Understanding Verification

Before implementing a feature, the lead must verify its understanding of what the user actually wants — especially for features involving:

- **User-facing behavior models** (e.g., "the app should learn from my data" vs. "the user manually inputs preferences")
- **Implicit expectations** (e.g., "show available providers" implies showing which ones are *configured*, not just listing all possible providers)
- **Domain-specific concepts** (e.g., in a travel app, "preferences" might mean auto-learned travel patterns, not a settings form)

When in doubt, ask. A 30-second clarification prevents hours of rework on a fundamentally misunderstood feature.

This complements the Clarification Rule above — that rule covers *ambiguous requirements*; this rule covers *requirements that seem clear but may be misunderstood*. The test: "If I'm wrong about what this means, would I build something completely different?" If yes, verify.