dotfiles/.config/opencode/AGENTS.md
alex 457fb2068b feat: integrate basic-memory MCP for dual memory system
Add basic-memory as a global cross-project knowledge store alongside
the existing per-project .memory/ system. Agents can now persist
reusable patterns, conventions, and lessons learned across all projects
via MCP tools (write_note, search_notes, build_context).

Changes:
- opencode.jsonc: add basic-memory MCP server config
- AGENTS.md: rewrite Project Memory section for dual system with
  routing table (global vs per-project)
- agents/lead.md: integrate basic-memory into phase transitions,
  CONSULT, PHASE-WRAP, escalation, and knowledge freshness
- agents/sme.md: dual caching strategy (basic-memory for cross-project
  guidance, .memory/ for project-specific)
2026-03-09 17:33:04 +00:00


Memory System (Dual: Global + Per-Project)

Memory is split into two complementary systems:

  1. Global Memory (basic-memory) — cross-project knowledge via MCP server. Stores reusable patterns, conventions, tech knowledge, domain concepts, and lessons learned. Lives in ~/basic-memory/, accessed via MCP tools.
  2. Project Memory (.memory/) — per-project state committed to git. Stores plans, gates, sessions, project-specific decisions, research, and architecture. Lives in the project's .memory/ directory.

Global Memory (basic-memory)

basic-memory is an MCP server that provides persistent knowledge through structured markdown files indexed in SQLite with semantic search.

What goes in global memory:

  • Reusable coding patterns (error handling, testing, logging)
  • Technology knowledge (how libraries/frameworks/tools work)
  • Convention preferences (coding style decisions that span projects)
  • Domain concepts that apply across projects
  • Cross-project lessons learned and retrospectives
  • SME guidance that isn't project-specific

MCP tools (available to all agents):

  • write_note(title, content, folder, tags) — create/update a knowledge note
  • read_note(identifier) — read a specific note by title or permalink
  • search_notes(query) — semantic + full-text search across all notes
  • build_context(url, depth) — follow knowledge graph relations for deep context
  • recent_activity(type) — find recently added/updated notes
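
As an illustration of the intended call flow (not the real basic-memory client), here is a minimal Python sketch with in-memory stand-ins for write_note and search_notes; the function bodies and storage are hypothetical, only the write-then-search sequence mirrors the usage described above:

```python
# In-memory stand-ins for the MCP tools -- a sketch of the call flow,
# NOT the actual basic-memory API or its storage model.
notes = {}  # permalink -> note dict

def write_note(title, content, folder, tags):
    """Create/update a knowledge note, keyed by a slug-style permalink."""
    permalink = title.lower().replace(" ", "-")
    notes[permalink] = {"title": title, "content": content,
                        "folder": folder, "tags": tags}
    return permalink

def search_notes(query):
    """Naive full-text search over titles and bodies (stand-in only)."""
    q = query.lower()
    return [n for n in notes.values()
            if q in n["title"].lower() or q in n["content"].lower()]

# After completing work: record a reusable lesson globally.
write_note("Go Error Handling Patterns",
           "[pattern] Use sentinel errors for expected conditions #go",
           folder="patterns", tags=["go", "patterns", "error-handling"])

# Next session start: search global memory before doing new work.
hits = search_notes("error handling")
```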

Note format:

---
title: Go Error Handling Patterns
permalink: go-error-handling-patterns
tags:
- go
- patterns
- error-handling
---
# Go Error Handling Patterns

## Observations
- [pattern] Use sentinel errors for expected error conditions #go
- [convention] Wrap errors with fmt.Errorf("context: %w", err) #go

## Relations
- related_to [[Go Testing Patterns]]

Usage rules:

  • At session start, use search_notes or build_context to find relevant global knowledge before starting work.
  • After completing work with reusable lessons, use write_note to record them globally.
  • Use WikiLinks [[Topic]] to create relations between notes.
  • Use tags for categorization: #pattern, #convention, #sme, #lesson, etc.
  • Use observation categories: [pattern], [convention], [decision], [lesson], [risk], [tool].

Project Memory (.memory/)

Per-project state, committed to git. This is the source of truth for active project work.

Directory structure:

.memory/
├── manifest.yaml           # Index: all files with descriptions + groups
├── system.md               # One-paragraph project overview
│
├── knowledge/              # Project-specific knowledge
│   ├── overview.md         # THIS project's architecture
│   └── tech-stack.md       # THIS project's technologies
│
├── decisions.md            # Project-specific Architecture Decision Records
│
├── plans/                  # Active plans (one per feature)
│   └── <feature>.md
│
├── research/               # Project-specific research findings
│   └── <topic>.md
│
├── gates/                  # Quality gate records
│   └── <feature>.md        # Review + test outcomes
│
└── sessions/               # Session continuity
    └── continuity.md       # Rolling session notes
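
The layout above can be scaffolded in a few lines. This sketch assumes it runs from the project root and creates only the files and directories named in the tree:

```python
from pathlib import Path

# Scaffold the .memory/ layout shown above (run from the project root).
root = Path(".memory")
for sub in ["knowledge", "plans", "research", "gates", "sessions"]:
    (root / sub).mkdir(parents=True, exist_ok=True)

# Top-level files and the rolling session notes file.
(root / "manifest.yaml").touch()
(root / "system.md").touch()
(root / "decisions.md").touch()
(root / "sessions" / "continuity.md").touch()
```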

Workflow: global context → project context → work → update both

  1. Session start: Query basic-memory (search_notes) for relevant global context, then read .memory/manifest.yaml and relevant project files.
  2. Before each task: Check global memory for reusable patterns/guidance, then read relevant .memory/ files for project state.
  3. After each task: Update .memory/ for project state. If the work produced reusable knowledge, also write_note to basic-memory.
  4. Quality gates: Record reviewer/tester outcomes in .memory/gates/<feature>.md (project-specific).

Manifest schema:

name: <project-name>
version: 1
categories:
  - path: <relative-path>
    description: <one-line description>
    group: <knowledge|decisions|plans|research|gates|sessions>
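
A filled-in manifest might look like the following; the project name and file entries are hypothetical examples that follow the schema above:

```yaml
name: travelapp
version: 1
categories:
  - path: knowledge/overview.md
    description: High-level architecture of the travelapp backend
    group: knowledge
  - path: plans/provider-selection.md
    description: Active plan for the provider selection feature
    group: plans
  - path: gates/provider-selection.md
    description: Review and test outcomes for provider selection
    group: gates
```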

Recording discipline: Only record outcomes, decisions, and discoveries — never phase transitions, status changes, or ceremony checkpoints. If an entry would only say "we started phase X", don't add it. Memory files preserve knowledge, not activity logs.

Read discipline:

  • Read manifest.yaml first to discover what's available
  • Read only the .memory/ files relevant to the current task
  • Skip a read when you have already confirmed this session that .memory/ has no relevant content in that domain
  • Do not immediately re-read content you just wrote
  • Treat .memory/ as a tool, not a ritual

Linking is required. When recording related knowledge across files, add markdown cross-references (for example: See [Decision: Auth](decisions.md#auth-approach)). Cross-reference global memory notes with memory://permalink URLs when relevant.

Manifest maintenance: When creating new .memory/ files, add entries to manifest.yaml with path, description, and group. The librarian agent verifies manifest accuracy.

When to Use Which

| Knowledge type | Where to store | Why |
|---|---|---|
| Reusable pattern/convention | basic-memory (write_note) | Benefits all projects |
| SME guidance (general) | basic-memory (write_note) | Reusable across consultations |
| Project architecture | .memory/knowledge/ | Specific to this project |
| Active plans & gates | .memory/plans/, .memory/gates/ | Project lifecycle state |
| Session continuity | .memory/sessions/ | Project-scoped session tracking |
| Project decisions (ADRs) | .memory/decisions.md | Specific to this project |
| Project research | .memory/research/ | Tied to project context |
| Tech knowledge (general) | basic-memory (write_note) | Reusable reference |
| Lessons learned | basic-memory (write_note) | Cross-project value |

Cross-Tool Instruction Files

Use symlinks to share this instruction file across all agentic coding tools:

project/
├── AGENTS.md                          # Real file (edit this one)
├── CLAUDE.md -> AGENTS.md
├── .cursorrules -> AGENTS.md
└── .github/
    └── copilot-instructions.md -> ../AGENTS.md

Rules:

  • Edit AGENTS.md — changes propagate automatically via symlinks
  • Never edit symlinked files directly (changes would be lost)
  • Symlinks are committed to git (git tracks them natively)

Content of this file:

  • Project overview and purpose
  • Tech stack and architecture
  • Coding conventions and patterns
  • Build/test/lint commands
  • Project structure overview

Do NOT duplicate .memory/ contents — instruction files describe how to work with the project, not active plans, research, or decisions.

When initializing a project:

  1. Create AGENTS.md with project basics
  2. Create symlinks: ln -s AGENTS.md CLAUDE.md, ln -s AGENTS.md .cursorrules, ln -s ../AGENTS.md .github/copilot-instructions.md
  3. Commit the real file and symlinks to git
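
Where a setup script is preferred over manual commands, the same links can be created idempotently. This Python sketch mirrors the ln -s steps above (run from the project root) and is safe to re-run:

```python
import os
from pathlib import Path

# Ensure the real file and the .github directory exist first.
Path("AGENTS.md").touch(exist_ok=True)
Path(".github").mkdir(exist_ok=True)

# Symlink targets are relative to the link's own directory,
# hence the ../ for the .github entry.
links = {
    "CLAUDE.md": "AGENTS.md",
    ".cursorrules": "AGENTS.md",
    ".github/copilot-instructions.md": "../AGENTS.md",
}
for link, target in links.items():
    if not os.path.lexists(link):  # skip links (or files) already present
        os.symlink(target, link)
```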

When joining an existing project:

  • Read AGENTS.md (or any of the symlinked files) to understand the project
  • If instruction file is missing, create it and the symlinks

Session Continuity

  • Treat .memory/ files as the persistent project tracking system for work across sessions.
  • At session start, query basic-memory (search_notes) for relevant cross-project knowledge, then identify prior in-progress work items and pending decisions in .memory/ before doing new implementation.
  • After implementation, update .memory/ files with what changed, why it changed, and what remains next.
  • If the work produced reusable knowledge (patterns, conventions, lessons learned), also record it in basic-memory (write_note) for cross-project benefit.

Clarification Rule

  • If requirements are genuinely unclear, materially ambiguous, or have multiple valid interpretations that would lead to materially different implementations, use the question tool to clarify before committing to an implementation path.
  • Do not ask for clarification when the user's intent is obvious. If the user explicitly states what they want (e.g., "update X and also update Y"), do not ask "should I do both?" — proceed with the stated request.
  • Implementation-level decisions (naming, file organization, approach) are the agent's job, not the user's. Only escalate decisions that affect user-visible behavior or scope.

Agent Roster

| Agent | Role | Model |
|---|---|---|
| lead | Primary orchestrator that decomposes work, delegates, and synthesizes outcomes. | github-copilot/claude-opus-4 (global default) |
| coder | Implementation-focused coding agent for reliable code changes. | github-copilot/gpt-5.3-codex |
| reviewer | Read-only code/source review; writes .memory/* for verdict records. | github-copilot/claude-opus-4.6 |
| tester | Validation agent for standard + adversarial testing; writes .memory/* for test outcomes. | github-copilot/claude-sonnet-4.6 |
| explorer | Fast read-only codebase mapper; writes .memory/* for discovery records. | github-copilot/claude-sonnet-4.6 |
| researcher | Deep technical investigator; writes .memory/* for research findings. | github-copilot/claude-opus-4.6 |
| librarian | Documentation coverage and accuracy specialist. | github-copilot/claude-opus-4.6 |
| critic | Pre-implementation gate and blocker sounding board; writes .memory/* for verdicts. | github-copilot/claude-opus-4.6 |
| sme | Subject-matter expert for domain-specific consultation; writes .memory/* for guidance cache. | github-copilot/claude-opus-4.6 |
| designer | UI/UX specialist for interaction and visual guidance; writes .memory/* for design decisions. | github-copilot/claude-sonnet-4.6 |

All agents except lead, coder, and librarian are read-only with respect to code and source, but carry permission.edit: allow scoped to .memory/* so they can fulfill their recording duties. The lead and librarian have full edit access; the coder has full edit access for implementation.

Parallelization

  • Always parallelize independent work. Any tool calls that do not depend on each other's output must be issued in the same message as parallel calls — never sequentially. This applies to bash commands, file reads, and subagent delegations alike.
  • Before issuing a sequence of calls, ask: "Does call B require the result of call A?" If not, send them together.

Human Checkpoint Triggers

When implementing features, the Lead must stop and request explicit user approval before dispatching coder work in these situations:

  1. Security-sensitive design: Any feature involving encryption, auth flows, secret storage, token management, or permission model changes.
  2. Architectural ambiguity: Multiple valid approaches with materially different tradeoffs that aren't resolvable from codebase conventions alone.
  3. Vision-dependent features: Features where the user's intended UX or behavior model isn't fully specified by the request.
  4. New external dependencies: Adding a service, SDK, or infrastructure component not already in the project.
  5. Data model changes with migration impact: Schema changes affecting existing production data.

The checkpoint must present the specific decision, 2-3 concrete options with tradeoffs, a recommendation, and a safe default. Implementation-level decisions (naming, file organization, code patterns) are NOT checkpoints — only user-visible behavior and architectural choices qualify.

Functional Verification (Implement → Verify → Iterate)

Static analysis is not verification. Type checks (bun run check, tsc), linters (eslint, ruff), and framework system checks (python manage.py check) confirm code is syntactically and structurally valid. They do NOT confirm the feature works. A feature that type-checks perfectly can be completely non-functional.

Every implemented feature MUST be functionally verified before being marked complete. "Functionally verified" means demonstrating that the feature actually works end-to-end — not just that it compiles.

What Counts as Functional Verification

Functional verification must exercise the actual behavior path a user would trigger:

  • API endpoints: Make real HTTP requests (curl, httpie, or the app's test client) and verify response status, shape, and data correctness. Check both success and error paths.
  • Frontend components: Verify the component renders, interacts correctly, and communicates with the backend. Use the browser (Playwright) or run the app's frontend test suite.
  • Database/model changes: Verify migrations run, data can be created/read/updated/deleted through the ORM or API, and constraints are enforced.
  • Integration points: When a feature spans frontend ↔ backend, verify the full round-trip: UI action → API call → database → response → UI update.
  • Configuration/settings: Verify the setting is actually read and affects behavior — not just that the config key exists.
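
For example, a success-path API check means making a real request and asserting both status and response shape. The /api/health endpoint and its payload below are hypothetical, and a stdlib stub server stands in for the app under test, so this is a sketch of the verification pattern rather than a check against a real application:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

# Hypothetical stub standing in for the app under test -- a real
# verification would target the running application instead.
class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/api/health":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0 = pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# Functional verification: a real HTTP request, checking status AND shape.
with urlopen(f"http://127.0.0.1:{port}/api/health") as resp:
    assert resp.status == 200
    payload = json.loads(resp.read())
assert payload == {"status": "ok"}
server.shutdown()
```

An error-path check would follow the same pattern against a request that should fail (for example, asserting a 404 for an unknown route).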

What Does NOT Count as Functional Verification

These are useful but insufficient on their own:

  • bun run check / tsc --noEmit (type checking)
  • bun run lint / eslint / ruff (linting)
  • python manage.py check (Django system checks)
  • bun run build succeeding (build pipeline)
  • Reading the code and concluding "this looks correct"
  • Verifying file existence or import structure

The Iterate-Until-Working Cycle

When functional verification reveals a problem:

  1. Diagnose the root cause (not just the symptom).
  2. Fix via coder dispatch with the specific failure context.
  3. Re-verify the same functional test that failed.
  4. Repeat until the feature demonstrably works.

A feature is "done" when it passes functional verification, not when the coder returns without errors. The lead agent must never mark a task complete based solely on a clean coder return — the verification step is mandatory.

Verification Scope by Change Type

| Change type | Minimum verification |
|---|---|
| New API endpoint | HTTP request with expected response verified |
| New UI feature | Browser-based or test-suite verification of render + interaction |
| Full-stack feature | End-to-end: UI → API → DB → response → UI update |
| Data model change | Migration runs + CRUD operations verified through API or ORM |
| Bug fix | Reproduce the bug scenario, verify it no longer occurs |
| Config/settings | Verify the setting changes observable behavior |
| Refactor (no behavior change) | Existing tests pass + spot-check one behavior path |

Mandatory Quality Pipeline

The reviewer and tester agents exist to be used — not decoratively. Every non-trivial feature must go through the quality pipeline. Skipping reviewers or testers to "save time" creates broken features that cost far more time to debug later.

Minimum Quality Requirements

  • Every feature gets a reviewer pass. No exceptions for "simple" features — the session transcript showed that even apparently simple features (like provider selection) had critical bugs that a reviewer would have caught.
  • Every feature with user-facing behavior gets a tester pass. The tester agent must be dispatched for any feature that a user would interact with. The tester validates functional behavior, not just code structure.
  • Features cannot be batch-validated. Each feature gets its own review → test cycle. "I'll review all 6 workstreams at the end" is not acceptable — bugs compound and become harder to diagnose.

The Lead Must Not Skip the Pipeline Under Time Pressure

Even when there are many features to implement, the quality pipeline is non-negotiable. It is better to ship 3 working features than 6 broken ones. If scope must be reduced to maintain quality, reduce scope — do not reduce quality.

Requirement Understanding Verification

Before implementing a feature, the lead must verify its understanding of what the user actually wants — especially for features involving:

  • User-facing behavior models (e.g., "the app should learn from my data" vs. "the user manually inputs preferences")
  • Implicit expectations (e.g., "show available providers" implies showing which ones are configured, not just listing all possible providers)
  • Domain-specific concepts (e.g., in a travel app, "preferences" might mean auto-learned travel patterns, not a settings form)

When in doubt, ask. A 30-second clarification prevents hours of rework on a fundamentally misunderstood feature.

This complements the Clarification Rule above — that rule covers ambiguous requirements; this rule covers requirements that seem clear but may be misunderstood. The test: "If I'm wrong about what this means, would I build something completely different?" If yes, verify.