dotfiles/.config/opencode/AGENTS.md
alex 33180d6e04 fix: flip symlink structure - AGENTS.md is the real file
AGENTS.md is now the canonical instruction file, with CLAUDE.md,
.cursorrules, and .github/copilot-instructions.md as symlinks to it.

This is simpler and more intuitive - the main file is at the root,
not buried in .github/.

Updated all references across agents, commands, skills, and .memory/.
2026-03-09 12:34:26 +00:00


Project Memory

Use markdown files in .memory/ as the persistent project memory across sessions. This is the source of truth for architecture, decisions, plans, research, and implementation state.

Directory structure:

.memory/
  knowledge.md      # Persistent project knowledge (architecture, patterns, key concepts)
  decisions.md      # Architecture decisions, SME guidance, design choices
  plans/            # One file per active plan/feature
    <feature>.md    # Plan with tasks, statuses, acceptance criteria
  research/         # Research findings
    <topic>.md      # Research on a specific topic
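A minimal scaffold for this layout, as a sketch (the plan and research filenames here — "onboarding", "caching" — are hypothetical placeholders, not prescribed names):

```shell
# Scaffold the .memory/ layout described above, in a scratch directory.
cd "$(mktemp -d)"
mkdir -p .memory/plans .memory/research
touch .memory/knowledge.md .memory/decisions.md
touch .memory/plans/onboarding.md .memory/research/caching.md
find .memory -type f | sort
```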

Workflow: read files → work → update files

  1. Session start: Read .memory/ directory contents and skim .memory/knowledge.md.
  2. Before each task: Read the relevant .memory/*.md files for project understanding before reading source files.
  3. After each task: Update the appropriate .memory/*.md files with what was built.

Be specific in summaries: include parameter names, defaults, file locations, and rationale. Keep concepts organized as markdown sections (## Heading) and keep hierarchy shallow.

Recording discipline: Only record outcomes, decisions, and discoveries — never phase transitions, status changes, or ceremony checkpoints. If an entry would only say "we started phase X", don't add it. Memory files preserve knowledge, not activity logs.

Read discipline:

  • Read only the .memory/ files relevant to the current task; avoid broad re-reads that add no new signal.
  • Skip redundant reads when you already established earlier in the session that .memory/ has no relevant content for that domain.
  • Do not immediately re-read content you just wrote. You already have that context from the update.
  • Treat .memory/ as a tool, not a ritual. Every read should have a specific information need.

Linking is required. When recording related knowledge across files, add markdown cross-references (for example: See [Decision: Auth](decisions.md#auth-approach)). A section with no references becomes a dead end.

Cross-Tool Instruction Files

Use symlinks to share this instruction file across all agentic coding tools:

project/
├── AGENTS.md                          # Real file (edit this one)
├── CLAUDE.md -> AGENTS.md
├── .cursorrules -> AGENTS.md
└── .github/
    └── copilot-instructions.md -> ../AGENTS.md

Rules:

  • Edit AGENTS.md — changes propagate automatically via symlinks
  • Never edit the symlinked files directly — some editors replace a symlink with a regular file on save, breaking the link and stranding those changes
  • Symlinks are committed to git (git tracks them natively)
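The git-tracking claim is easy to sanity-check: a tracked symlink is stored with index mode 120000, and its blob holds the link target rather than the file contents. A throwaway sketch, assuming git and a POSIX shell:

```shell
# Demonstrate that git tracks symlinks natively (index mode 120000).
cd "$(mktemp -d)"
git init -q demo && cd demo
echo '# canonical instruction file' > AGENTS.md
ln -s AGENTS.md CLAUDE.md
git add AGENTS.md CLAUDE.md
git ls-files -s CLAUDE.md   # mode 120000 marks a symlink in the index
```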

Content of this file:

  • Project overview and purpose
  • Tech stack and architecture
  • Coding conventions and patterns
  • Build/test/lint commands
  • Project structure overview

Do NOT duplicate .memory/ contents — instruction files describe how to work with the project, not active plans, research, or decisions.

When initializing a project:

  1. Create AGENTS.md with project basics
  2. Create symlinks: ln -s AGENTS.md CLAUDE.md, ln -s AGENTS.md .cursorrules, ln -s ../AGENTS.md .github/copilot-instructions.md
  3. Commit the real file and symlinks to git
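Spelled out, step 2 could look like the following — run from the project root, with a scratch directory standing in for it here. Note the ../ target for the .github link, so the link resolves correctly from inside that directory:

```shell
# Sketch of the initialization sequence (scratch dir stands in for the
# project root; the AGENTS.md content is a placeholder).
cd "$(mktemp -d)"
printf '# Project overview\n' > AGENTS.md
mkdir -p .github
ln -s AGENTS.md CLAUDE.md
ln -s AGENTS.md .cursorrules
ln -s ../AGENTS.md .github/copilot-instructions.md
cat .github/copilot-instructions.md   # resolves back to AGENTS.md
```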

When joining an existing project:

  • Read AGENTS.md (or any of the symlinked files) to understand the project
  • If the instruction file is missing, create it and the symlinks

Session Continuity

  • Treat .memory/ files as the persistent tracking system for work across sessions.
  • At session start, identify prior in-progress work items and pending decisions before doing new implementation.
  • After implementation, update .memory/ files with what changed, why it changed, and what remains next.

Clarification Rule

  • If requirements are genuinely unclear, materially ambiguous, or have multiple valid interpretations that would lead to materially different implementations, use the question tool to clarify before committing to an implementation path.
  • Do not ask for clarification when the user's intent is obvious. If the user explicitly states what they want (e.g., "update X and also update Y"), do not ask "should I do both?" — proceed with the stated request.
  • Implementation-level decisions (naming, file organization, approach) are the agent's job, not the user's. Only escalate decisions that affect user-visible behavior or scope.

Agent Roster

| Agent | Role | Model |
| --- | --- | --- |
| lead | Primary orchestrator that decomposes work, delegates, and synthesizes outcomes. | github-copilot/claude-opus-4 (global default) |
| coder | Implementation-focused coding agent for reliable code changes. | github-copilot/gpt-5.3-codex |
| reviewer | Read-only code/source review; writes .memory/* for verdict records. | github-copilot/claude-opus-4.6 |
| tester | Validation agent for standard + adversarial testing; writes .memory/* for test outcomes. | github-copilot/claude-sonnet-4.6 |
| explorer | Fast read-only codebase mapper; writes .memory/* for discovery records. | github-copilot/claude-sonnet-4.6 |
| researcher | Deep technical investigator; writes .memory/* for research findings. | github-copilot/claude-opus-4.6 |
| librarian | Documentation coverage and accuracy specialist. | github-copilot/claude-opus-4.6 |
| critic | Pre-implementation gate and blocker sounding board; writes .memory/* for verdicts. | github-copilot/claude-opus-4.6 |
| sme | Subject-matter expert for domain-specific consultation; writes .memory/* for guidance cache. | github-copilot/claude-opus-4.6 |
| designer | UI/UX specialist for interaction and visual guidance; writes .memory/* for design decisions. | github-copilot/claude-sonnet-4.6 |

All agents except lead, coder, and librarian are code/source read-only but have permission.edit: allow scoped to .memory/* writes for their recording duties. The lead and librarian have full edit access; coder has full edit access for implementation.

Parallelization

  • Always parallelize independent work. Any tool calls that do not depend on each other's output must be issued in the same message as parallel calls — never sequentially. This applies to bash commands, file reads, and subagent delegations alike.
  • Before issuing a sequence of calls, ask: "Does call B require the result of call A?" If not, send them together.

Human Checkpoint Triggers

When implementing features, the Lead must stop and request explicit user approval before dispatching coder work in these situations:

  1. Security-sensitive design: Any feature involving encryption, auth flows, secret storage, token management, or permission model changes.
  2. Architectural ambiguity: Multiple valid approaches with materially different tradeoffs that aren't resolvable from codebase conventions alone.
  3. Vision-dependent features: Features where the user's intended UX or behavior model isn't fully specified by the request.
  4. New external dependencies: Adding a service, SDK, or infrastructure component not already in the project.
  5. Data model changes with migration impact: Schema changes affecting existing production data.

The checkpoint must present the specific decision, 2-3 concrete options with tradeoffs, a recommendation, and a safe default. Implementation-level decisions (naming, file organization, code patterns) are NOT checkpoints — only user-visible behavior and architectural choices qualify.

Functional Verification (Implement → Verify → Iterate)

Static analysis is not verification. Type checks (bun run check, tsc), linters (eslint, ruff), and framework system checks (python manage.py check) confirm code is syntactically and structurally valid. They do NOT confirm the feature works. A feature that type-checks perfectly can be completely non-functional.

Every implemented feature MUST be functionally verified before being marked complete. "Functionally verified" means demonstrating that the feature actually works end-to-end — not just that it compiles.

What Counts as Functional Verification

Functional verification must exercise the actual behavior path a user would trigger:

  • API endpoints: Make real HTTP requests (curl, httpie, or the app's test client) and verify response status, shape, and data correctness. Check both success and error paths.
  • Frontend components: Verify the component renders, interacts correctly, and communicates with the backend. Use the browser (Playwright) or run the app's frontend test suite.
  • Database/model changes: Verify migrations run, data can be created/read/updated/deleted through the ORM or API, and constraints are enforced.
  • Integration points: When a feature spans frontend ↔ backend, verify the full round-trip: UI action → API call → database → response → UI update.
  • Configuration/settings: Verify the setting is actually read and affects behavior — not just that the config key exists.
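As a minimal illustration of the API-endpoint case — with python3's built-in http.server standing in for the real application, and 8123 a hypothetical free port — the check makes a real request and asserts on the status code of both the success and error paths, rather than trusting a clean build:

```shell
# Sketch: functional verification means a real request, not a clean compile.
# python3 -m http.server is a stand-in for the app under test; port 8123 is
# an arbitrary choice for the demo.
cd "$(mktemp -d)"
python3 -m http.server 8123 --bind 127.0.0.1 >/dev/null 2>&1 &
srv_pid=$!
sleep 1                                            # let the server come up
status=$(curl -s -o /dev/null -w '%{http_code}' http://127.0.0.1:8123/)
missing=$(curl -s -o /dev/null -w '%{http_code}' http://127.0.0.1:8123/nope)
kill "$srv_pid"
echo "success path: $status, error path: $missing"
```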

What Does NOT Count as Functional Verification

These are useful but insufficient on their own:

  • bun run check / tsc --noEmit (type checking)
  • bun run lint / eslint / ruff (linting)
  • python manage.py check (Django system checks)
  • bun run build succeeding (build pipeline)
  • Reading the code and concluding "this looks correct"
  • Verifying file existence or import structure
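The gap between these two lists is concrete: a static check can pass while the behavior is broken. A small shell analogue — `sh -n` (syntax check only, the rough equivalent of `tsc --noEmit`) passes a hypothetical script whose actual run fails:

```shell
# A script with a call-site typo: the syntax check is clean, the run fails.
cd "$(mktemp -d)"
cat > greet.sh <<'EOF'
#!/bin/sh
gret() { echo "hello $1"; }   # typo: function is named "gret"
greet world                   # calls the undefined "greet"
EOF
sh -n greet.sh && echo "syntax check: clean"
sh greet.sh 2>/dev/null || echo "functional run: FAILED"
```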

The Iterate-Until-Working Cycle

When functional verification reveals a problem:

  1. Diagnose the root cause (not just the symptom).
  2. Fix via coder dispatch with the specific failure context.
  3. Re-verify the same functional test that failed.
  4. Repeat until the feature demonstrably works.

A feature is "done" when it passes functional verification, not when the coder returns without errors. The lead agent must never mark a task complete based solely on a clean coder return — the verification step is mandatory.

Verification Scope by Change Type

| Change type | Minimum verification |
| --- | --- |
| New API endpoint | HTTP request with expected response verified |
| New UI feature | Browser-based or test-suite verification of render + interaction |
| Full-stack feature | End-to-end: UI → API → DB → response → UI update |
| Data model change | Migration runs + CRUD operations verified through API or ORM |
| Bug fix | Reproduce the bug scenario, verify it no longer occurs |
| Config/settings | Verify the setting changes observable behavior |
| Refactor (no behavior change) | Existing tests pass + spot-check one behavior path |

Mandatory Quality Pipeline

The reviewer and tester agents exist to be used — not decoratively. Every non-trivial feature must go through the quality pipeline. Skipping reviewers or testers to "save time" creates broken features that cost far more time to debug later.

Minimum Quality Requirements

  • Every feature gets a reviewer pass. No exceptions for "simple" features — the session transcript showed that even apparently simple features (like provider selection) had critical bugs that a reviewer would have caught.
  • Every feature with user-facing behavior gets a tester pass. The tester agent must be dispatched for any feature that a user would interact with. The tester validates functional behavior, not just code structure.
  • Features cannot be batch-validated. Each feature gets its own review → test cycle. "I'll review all 6 workstreams at the end" is not acceptable — bugs compound and become harder to diagnose.

The Lead Must Not Skip the Pipeline Under Time Pressure

Even when there are many features to implement, the quality pipeline is non-negotiable. It is better to ship 3 working features than 6 broken ones. If scope must be reduced to maintain quality, reduce scope — do not reduce quality.

Requirement Understanding Verification

Before implementing a feature, the lead must verify its understanding of what the user actually wants — especially for features involving:

  • User-facing behavior models (e.g., "the app should learn from my data" vs. "the user manually inputs preferences")
  • Implicit expectations (e.g., "show available providers" implies showing which ones are configured, not just listing all possible providers)
  • Domain-specific concepts (e.g., in a travel app, "preferences" might mean auto-learned travel patterns, not a settings form)

When in doubt, ask. A 30-second clarification prevents hours of rework on a fundamentally misunderstood feature.

This complements the Clarification Rule above — that rule covers ambiguous requirements; this rule covers requirements that seem clear but may be misunderstood. The test: "If I'm wrong about what this means, would I build something completely different?" If yes, verify.