
---
title: AGENTS
type: note
permalink: opencode-config/agents
---

## MCP Code-Indexing Tooling

This repo configures two MCP servers for structural code analysis, registered in `opencode.jsonc`:

| Server | Purpose | Runtime |
| --- | --- | --- |
| ast-grep | Structural pattern search (AST-level grep) | uvx (Python/uv) |
| codebase-memory | Relationship mapping, dependency graphs, blast-radius analysis | codebase-memory-mcp binary |
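For orientation, a registration of these servers might look like the sketch below. The key names and invocation commands are assumptions based on common MCP configuration shapes, not the confirmed schema; `opencode.jsonc` in this repo is the authoritative form.

```jsonc
{
  "mcp": {
    "ast-grep": {
      "type": "local",
      // assumed invocation; the real entry may pin a package name or version
      "command": ["uvx", "ast-grep-mcp"]
    },
    "codebase-memory": {
      "type": "local",
      // binary is expected on PATH (see Prerequisites)
      "command": ["codebase-memory-mcp"]
    }
  }
}
```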

### Prerequisites

  • uvx available on PATH (for ast-grep)
  • codebase-memory-mcp binary installed and on PATH

### Intended usage layering

When analyzing code, prefer tools in this order to minimize overhead:

  1. ast-grep first — structural pattern matching; fastest and most targeted
  2. codebase-memory next — relationship/blast-radius queries when structure alone is insufficient
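As a concrete illustration of step 1, ast-grep matches code by AST shape rather than raw text. The `$NAME` metavariable syntax is standard ast-grep; the argument shape below is an assumption about this MCP server's request schema, not a confirmed API:

```jsonc
// Hypothetical ast-grep query: find every `await fetch(...)` call,
// regardless of whitespace or argument formatting
{
  "pattern": "await fetch($URL)",   // $URL matches any single expression node
  "language": "typescript"
}
```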

### Role allocation (summary)

Per-agent tooling scopes are defined in each agent's guidance file under agents/. The high-level allocation:

  • coder: ast-grep only for targeted implementation discovery; avoid codebase-memory unless explicitly needed
  • explorer / researcher / sme / reviewer / tester: ast-grep + codebase-memory
  • critic / designer / librarian: no code-indexing tooling guidance (use standard tools)

Detailed per-agent behavioral guidance lives in agents/*.md files and is not duplicated here.

## Memory System (Single: basic-memory)

Memory uses one persistent system: basic-memory.

  • All persistent knowledge is stored in basic-memory notes, split across a main project (global/shared) and per-repo projects (project-specific).
  • The managed per-repo basic-memory project directory is <repo>/.memory/.
  • Do not edit managed .memory/* files directly; use basic-memory MCP tools for all reads/writes.
  • Migration note: older repo-local memory artifacts (including .memory.legacy/ and other contents left by prior workflows) are non-authoritative. Do not edit them unless you are explicitly migrating historical content into basic-memory.

### basic-memory

basic-memory is an MCP server that provides persistent knowledge through structured markdown files indexed in SQLite with semantic search.

### main vs per-repo projects

basic-memory organizes notes into projects. Two kinds exist:

  1. main (global/shared knowledge only)

    • Reusable coding patterns (error handling, testing, logging)
    • Technology knowledge (how libraries/frameworks/tools work)
    • Convention preferences (coding style decisions that span projects)
    • Domain concepts that apply across projects
    • Cross-project lessons learned and retrospectives
    • SME guidance that isn't project-specific
    • User preferences and personal workflow notes
  2. Per-repo projects (project-specific knowledge only)

    • Plans, decisions, research, gates, and session continuity for ONE repository
    • Project architecture and module knowledge
    • Project-specific conventions and patterns

Hard rule: Never store project-specific plans, decisions, research, gates, or sessions in main. Never store cross-project reusable knowledge in a per-repo project.

### Per-repo project setup (required)

Every code repository must have its own dedicated basic-memory project. This is non-negotiable.

Creating a new per-repo project: Use basic-memory_create_memory_project (or the equivalent MCP tool) with:

  • project_name: a short, kebab-case identifier for the repo (e.g., opencode-config, my-web-app, data-pipeline)
  • project_path: the repo's .memory/ subdirectory on disk (i.e., <repo-root>/.memory)

Example for this repo:

```text
project_name: opencode-config
project_path: /home/alex/dotfiles/.config/opencode/.memory
```

Checking if a project exists: Use basic-memory_list_memory_projects to see all projects. If the repo doesn't have one yet, create it before reading/writing project-specific notes.
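The check-then-create sequence can be sketched as two tool calls. Argument shapes are illustrative and follow the tool signatures listed later in this file:

```jsonc
// 1. basic-memory_list_memory_projects — no arguments; scan the result
//    for this repo's project name
{}

// 2. If absent, basic-memory_create_memory_project with the repo's
//    .memory/ path (values taken from the example above)
{
  "project_name": "opencode-config",
  "project_path": "/home/alex/dotfiles/.config/opencode/.memory"
}
```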

This repo's basic-memory project: opencode-config

### MCP tools (available to all agents)

  • `write_note(title, content, folder, tags, project)` — create/update a knowledge note
  • `read_note(identifier, project)` — read a specific note by title or permalink
  • `search_notes(query, project)` — semantic + full-text search across all notes
  • `build_context(url, depth, project)` — follow knowledge graph relations for deep context
  • `recent_activity(type, project)` — find recently added/updated notes
  • `list_memory_projects()` — list all basic-memory projects
  • `create_memory_project(project_name, project_path)` — create a new per-repo project

The `project` parameter is critical. Always pass `project="main"` for global notes and `project="<repo-project-name>"` for project-specific notes. Omitting the `project` parameter defaults to main.
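For example, recording a reusable lesson versus a project-specific decision differs only in the project argument. Both payloads are illustrative sketches of the `write_note` signature above; titles and content are hypothetical:

```jsonc
// Global, reusable knowledge → project: "main"
{
  "title": "Retry Backoff Conventions",
  "folder": "knowledge",
  "tags": ["pattern", "resilience"],
  "content": "...",
  "project": "main"
}

// Project-specific decision → project: "<repo-project-name>"
{
  "title": "Adopt SQLite for local cache",
  "folder": "decisions",
  "content": "...",
  "project": "opencode-config"
}
```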

Note format:

```markdown
---
title: Go Error Handling Patterns
permalink: go-error-handling-patterns
tags:
- go
- patterns
- error-handling
---
# Go Error Handling Patterns

## Observations
- [pattern] Use sentinel errors for expected error conditions #go
- [convention] Wrap errors with fmt.Errorf("context: %w", err) #go

## Relations
- related_to [[Go Testing Patterns]]
```

Usage rules:

  • At session start, identify the repo's basic-memory project (see Session-Start Protocol below).
  • Use project parameter on every MCP call to target the correct project.
  • After completing work with reusable lessons, use write_note with project="main" to record them.
  • Use WikiLinks [[Topic]] to create relations between notes.
  • Use tags for categorization: #pattern, #convention, #sme, #lesson, etc.
  • Use observation categories: [pattern], [convention], [decision], [lesson], [risk], [tool].

### Session-start protocol (required)

At the start of every session, before reading or writing any project-specific notes:

  1. Identify the repo. Determine which repository you are working in (from the working directory or user context).
  2. Select the per-repo project. Use basic-memory_list_memory_projects to find the repo's basic-memory project. If it doesn't exist, create it with basic-memory_create_memory_project.
  3. Load project context. Query the per-repo project (search_notes/build_context with project="<repo-project-name>") for relevant prior work, pending decisions, and in-progress items.
  4. Load global context. Query main (search_notes with project="main") for relevant cross-project knowledge when the task domain may have reusable guidance.

All subsequent project-specific reads/writes in the session must target the per-repo project. All global/shared reads/writes must target main.

### Project-specific note organization

Project notes in the per-repo basic-memory project are grouped by purpose:

  • knowledge/ — project architecture, modules, conventions, patterns
  • plans/ — one note per feature/task with scope, tasks, acceptance criteria
  • decisions/ — ADRs, SME guidance, design choices
  • research/ — investigation findings
  • gates/ — quality gate records (reviewer/tester verdicts)
  • sessions/ — session continuity notes

Use stable identifiers so agents can pass note references between delegations.
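For instance, a reviewer verdict would land under gates/ via `write_note`; the `folder` value is what routes the note. Title and content here are hypothetical:

```jsonc
{
  "title": "Gate: provider-selection review",
  "folder": "gates",
  "tags": ["gate", "reviewer"],
  "content": "Verdict: pass with follow-ups ...",
  "project": "opencode-config"
}
```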

### Workflow: load context → work → update basic-memory

  1. Session start: Follow the session-start protocol above.
  2. Before each task: Read relevant notes from the per-repo project (plans/decisions/research/sessions) and from main for reusable guidance.
  3. After each task: Update project notes in the per-repo project (plans, decisions, research, gates, sessions). Record reusable lessons in main.
  4. Quality gates: Record reviewer/tester outcomes in the per-repo project's gates/ notes.

Recording discipline: Only record outcomes, decisions, and discoveries — never phase transitions, status changes, or ceremony checkpoints. If an entry would only say "we started phase X", don't add it. Memory notes preserve knowledge, not activity logs.

Read discipline:

  • Read only the basic-memory notes relevant to the current task
  • Skip repeat reads once you have confirmed this session that the per-repo project has no relevant content in that domain
  • Do not immediately re-read content you just wrote
  • Treat memory as a tool, not a ritual

Linking is required. When recording related knowledge across notes, add markdown cross-references and use memory:// links where relevant.

### When to Use Which

| Knowledge type | Where to store | Project | Why |
| --- | --- | --- | --- |
| Reusable pattern/convention | write_note | main | Benefits all projects |
| SME guidance (general) | write_note | main | Reusable across consultations |
| Tech knowledge (general) | write_note | main | Reusable reference |
| Lessons learned | write_note | main | Cross-project value |
| User preferences | write_note | main | Span all projects |
| Project architecture | knowledge/* notes | per-repo project | Specific to this project |
| Active plans & gates | plans/* and gates/* notes | per-repo project | Project lifecycle state |
| Session continuity | sessions/* notes | per-repo project | Project-scoped session tracking |
| Project decisions (ADRs) | decisions/* notes | per-repo project | Specific to this project |
| Project research | research/* notes | per-repo project | Tied to project context |

## Instruction File

AGENTS.md is the only instruction file that should be maintained in this repo.

Rules:

  • Put project instructions in AGENTS.md only
  • Do not create or maintain mirrored instruction files or symlinks for other tools
  • If another tool needs repo instructions, point it at AGENTS.md directly

Content of this file:

  • Project overview and purpose
  • Tech stack and architecture
  • Coding conventions and patterns
  • Build/test/lint commands
  • Project structure overview

Do NOT duplicate memory project contents. AGENTS.md describes how to work with the project, not active plans, research, or decisions.

When initializing or updating a project:

  1. Create or update AGENTS.md with project basics
  2. Keep instruction maintenance centralized in AGENTS.md

When joining an existing project:

  • Read AGENTS.md to understand the project
  • If the instruction file is missing, create AGENTS.md

## Session Continuity

  • Treat the per-repo basic-memory project as the persistent tracking system for work across sessions.
  • At session start, query basic-memory (search_notes/build_context) for relevant prior work, pending decisions, and in-progress items.
  • After implementation, update project notes in basic-memory with what changed, why it changed, and what remains next.
  • If the work produced reusable knowledge (patterns, conventions, lessons learned), also record it in reusable basic-memory notes for cross-project benefit.

This repo's basic-memory project: opencode-config

## Clarification Rule

  • If requirements are genuinely unclear, materially ambiguous, or have multiple valid interpretations that would lead to materially different implementations, use the question tool to clarify before committing to an implementation path.
  • Do not ask for clarification when the user's intent is obvious. If the user explicitly states what they want (e.g., "update X and also update Y"), do not ask "should I do both?" — proceed with the stated request.
  • Implementation-level decisions (naming, file organization, approach) are the agent's job, not the user's. Only escalate decisions that affect user-visible behavior or scope.

## Agent Roster

| Agent | Role | Model |
| --- | --- | --- |
| lead | Primary orchestrator that decomposes work, delegates, and synthesizes outcomes. | github-copilot/claude-opus-4 (global default) |
| coder | Implementation-focused coding agent for reliable code changes. | github-copilot/gpt-5.3-codex |
| reviewer | Read-only code/source review; records verdicts in basic-memory project notes. | github-copilot/claude-opus-4.6 |
| tester | Validation agent for standard + adversarial testing; records outcomes in basic-memory project notes. | github-copilot/claude-sonnet-4.6 |
| explorer | Fast read-only codebase mapper; records discoveries in basic-memory project notes. | github-copilot/claude-sonnet-4.6 |
| researcher | Deep technical investigator; records findings in basic-memory project notes. | github-copilot/claude-opus-4.6 |
| librarian | Documentation coverage and accuracy specialist. | github-copilot/claude-opus-4.6 |
| critic | Pre-implementation gate and blocker sounding board; records verdicts in basic-memory project notes. | github-copilot/claude-opus-4.6 |
| sme | Subject-matter expert for domain-specific consultation; records guidance in basic-memory notes. | github-copilot/claude-opus-4.6 |
| designer | UI/UX specialist for interaction and visual guidance; records design decisions in basic-memory project notes. | github-copilot/claude-sonnet-4.6 |

All agents except lead, coder, and librarian are code/source read-only. Agents with permission.edit: allow may update basic-memory notes for their recording duties; they must not edit implementation source files.

## Explorer Scope Boundary

  • Explorer is local-only. Use explorer only for mapping files, directories, symbols, dependencies, configuration, and edit points that already exist inside the current repository/worktree.
  • Do not use explorer for external research. Repository discovery on GitHub, upstream project behavior, package/library docs, web content, or competitor/tool comparisons belong to researcher or direct Lead research tools (gh, webfetch, docs lookup).
  • Do not mix local and external discovery in one explorer prompt. If a task needs both, split it explicitly:
    1. explorer → local file map only
    2. researcher or Lead tools → external behavior/references only
    3. Lead → synthesize the results
  • Explorer outputs should stay concrete: local file paths, likely edit points, dependency chains, and risks inside this repo only.

## Parallelization

  • Always parallelize independent work. Any tool calls that do not depend on each other's output must be issued in the same message as parallel calls — never sequentially. This applies to bash commands, file reads, and subagent delegations alike.
  • Before issuing a sequence of calls, ask: "Does call B require the result of call A?" If not, send them together.

## Skill Loading Policy

  • Relevant skills are not optional. When a task matches a skill's trigger conditions, the lead must load that skill proactively before proceeding with ad hoc execution.
  • Keep skill usage operational: use skills to drive planning, decomposition, debugging, verification, and workflow enforcement instead of relying on generic reminders.
  • AGENTS.md defines this as policy; concrete skill trigger rules and enforcement behavior belong in agents/lead.md.

## Human Checkpoint Triggers

When implementing features, the Lead must stop and request explicit user approval before dispatching coder work in these situations:

  1. Security-sensitive design: Any feature involving encryption, auth flows, secret storage, token management, or permission model changes.
  2. Architectural ambiguity: Multiple valid approaches with materially different tradeoffs that aren't resolvable from codebase conventions alone.
  3. Vision-dependent features: Features where the user's intended UX or behavior model isn't fully specified by the request.
  4. New external dependencies: Adding a service, SDK, or infrastructure component not already in the project.
  5. Data model changes with migration impact: Schema changes affecting existing production data.

The checkpoint must present the specific decision, 2-3 concrete options with tradeoffs, a recommendation, and a safe default. Implementation-level decisions (naming, file organization, code patterns) are NOT checkpoints — only user-visible behavior and architectural choices qualify.

## Functional Verification (Implement → Verify → Iterate)

Static analysis is not verification. Type checks (`bun run check`, `tsc`), linters (`eslint`, `ruff`), and framework system checks (`python manage.py check`) confirm code is syntactically and structurally valid. They do NOT confirm the feature works. A feature that type-checks perfectly can be completely non-functional.

Every implemented feature MUST be functionally verified before being marked complete. "Functionally verified" means demonstrating that the feature actually works end-to-end — not just that it compiles.

### What Counts as Functional Verification

Functional verification must exercise the actual behavior path a user would trigger:

  • API endpoints: Make real HTTP requests (curl, httpie, or the app's test client) and verify response status, shape, and data correctness. Check both success and error paths.
  • Frontend components: Verify the component renders, interacts correctly, and communicates with the backend. Use the browser (Playwright) or run the app's frontend test suite.
  • Database/model changes: Verify migrations run, data can be created/read/updated/deleted through the ORM or API, and constraints are enforced.
  • Integration points: When a feature spans frontend ↔ backend, verify the full round-trip: UI action → API call → database → response → UI update.
  • Configuration/settings: Verify the setting is actually read and affects behavior — not just that the config key exists.

### What Does NOT Count as Functional Verification

These are useful but insufficient on their own:

  • `bun run check` / `tsc --noEmit` (type checking)
  • `bun run lint` / `eslint` / `ruff` (linting)
  • `python manage.py check` (Django system checks)
  • `bun run build` succeeding (build pipeline)
  • Reading the code and concluding "this looks correct"
  • Verifying file existence or import structure

### The Iterate-Until-Working Cycle

When functional verification reveals a problem:

  1. Diagnose the root cause (not just the symptom).
  2. Fix via coder dispatch with the specific failure context.
  3. Re-verify the same functional test that failed.
  4. Repeat until the feature demonstrably works.

A feature is "done" when it passes functional verification, not when the coder returns without errors. The lead agent must never mark a task complete based solely on a clean coder return — the verification step is mandatory.

### Verification Scope by Change Type

| Change type | Minimum verification |
| --- | --- |
| New API endpoint | HTTP request with expected response verified |
| New UI feature | Browser-based or test-suite verification of render + interaction |
| Full-stack feature | End-to-end: UI → API → DB → response → UI update |
| Data model change | Migration runs + CRUD operations verified through API or ORM |
| Bug fix | Reproduce the bug scenario, verify it no longer occurs |
| Config/settings | Verify the setting changes observable behavior |
| Refactor (no behavior change) | Existing tests pass + spot-check one behavior path |

## Mandatory Quality Pipeline

The reviewer and tester agents exist to be used — not decoratively. Every non-trivial feature must go through the quality pipeline. Skipping reviewers or testers to "save time" creates broken features that cost far more time to debug later.

Minimum Quality Requirements

  • Every feature gets a reviewer pass. No exceptions for "simple" features — the session transcript showed that even apparently simple features (like provider selection) had critical bugs that a reviewer would have caught.
  • Every feature with user-facing behavior gets a tester pass. The tester agent must be dispatched for any feature that a user would interact with. The tester validates functional behavior, not just code structure.
  • Features cannot be batch-validated. Each feature gets its own review → test cycle. "I'll review all 6 workstreams at the end" is not acceptable — bugs compound and become harder to diagnose.

### The Lead Must Not Skip the Pipeline Under Time Pressure

Even when there are many features to implement, the quality pipeline is non-negotiable. It is better to ship 3 working features than 6 broken ones. If scope must be reduced to maintain quality, reduce scope — do not reduce quality.

## Requirement Understanding Verification

Before implementing a feature, the lead must verify its understanding of what the user actually wants — especially for features involving:

  • User-facing behavior models (e.g., "the app should learn from my data" vs. "the user manually inputs preferences")
  • Implicit expectations (e.g., "show available providers" implies showing which ones are configured, not just listing all possible providers)
  • Domain-specific concepts (e.g., in a travel app, "preferences" might mean auto-learned travel patterns, not a settings form)

When in doubt, ask. A 30-second clarification prevents hours of rework on a fundamentally misunderstood feature.

This complements the Clarification Rule above — that rule covers ambiguous requirements; this rule covers requirements that seem clear but may be misunderstood. The test: "If I'm wrong about what this means, would I build something completely different?" If yes, verify.

## Proactive Bug Hunting

Do not limit quality work to the requested diff. The Lead should actively search for likely related defects before and after implementation.

### Minimum proactive bug-hunt pass

For any non-trivial feature or bug fix, inspect nearby risk surfaces in addition to the primary edit point:

  • sibling components/handlers in the same feature area
  • duplicated or copy-pasted logic paths
  • recent churn hotspots and TODO/FIXME comments
  • adjacent validation, error handling, empty-state, and permission logic
  • parallel codepaths that should stay behaviorally consistent

This pass is not open-ended archaeology; it is a focused search for bugs that are likely to be coupled to the requested work.

### Discovery and review expectations

  • During DISCOVER, include a short "likely bug surfaces" list in the findings when the task is non-trivial.
  • During EXECUTE, require reviewer and tester prompts to check for related regressions and likely adjacent bugs, not just direct spec compliance.
  • If proactive bug hunting finds unrelated non-blocking issues, record them in project memory or a backlog note rather than silently folding them into the current task.
  • If a discovered bug is blocking correctness of the current task, treat it as in-scope and explicitly add it to the plan.

### Bug-fix workflow

  • Prefer reproduction-first debugging: capture the failing scenario, failing test, or concrete bug path before fixing when feasible.
  • After the fix, re-run the same scenario as the primary verification step.
  • For bug fixes without an automated regression test, document the exact manual reproduction and re-verification path.

## Planning Rigor

Planning should be detailed enough to reduce rework, not just to describe intent.

### Plan minimums

Every non-trivial plan must include, per task or feature:

  • the exact user-visible outcome
  • explicit acceptance criteria
  • edge cases and error cases
  • non-goals / what is intentionally out of scope
  • verification method
  • impacted files, systems, or integration surfaces
  • likely breakage or regression surfaces
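A plans/ note covering these minimums might look like the following sketch. The feature, headings, and values are illustrative, not a required template:

```markdown
---
title: Plan: CSV export for reports
permalink: plan-csv-export
tags:
- plan
---
# Plan: CSV export for reports

## Outcome
User can download the current report as a CSV file.

## Acceptance criteria
- Export button appears on the report page
- Downloaded CSV matches on-screen rows and columns

## Edge / error cases
- Empty report → header-only CSV
- Export failure → visible error, no partial file

## Non-goals
- XLSX export, scheduled exports

## Verification
- Browser test: trigger export, inspect downloaded file

## Impact / risk surfaces
- Reports view and download handler; pagination logic is the likeliest regression
```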

### Required pre-mortem

Before EXECUTE, add a short pre-mortem section to the plan for non-trivial work:

  • what is most likely to fail
  • which assumption is most fragile
  • what would force a redesign or user checkpoint
  • what regression is easiest to miss

The goal is to surface rework risks early, before coder dispatch.

### Retry learning loop

When review or testing fails and a retry is needed, update the plan with a brief note covering:

  • what was misunderstood or missed
  • what new constraint was discovered
  • what changed in the execution approach

Do not resend the same plan unchanged after a failed cycle unless the failure was purely mechanical.