## Project Memory

Use markdown files in `.memory/` as the persistent project memory across sessions. This is the source of truth for architecture, decisions, plans, research, and implementation state.

**Directory structure:**

```text
.memory/
  knowledge.md    # Persistent project knowledge (architecture, patterns, key concepts)
  decisions.md    # Architecture decisions, SME guidance, design choices
  plans/          # One file per active plan/feature
    .md           # Plan with tasks, statuses, acceptance criteria
  research/       # Research findings
    .md           # Research on a specific topic
```

**Workflow: read files → work → update files**

1. **Session start:** Read `.memory/` directory contents and skim `.memory/knowledge.md`.
2. **Before each task:** Read relevant `.memory/*.md` files before reading source files for project understanding.
3. **After each task:** Update the appropriate `.memory/*.md` files with what was built.

Be specific in summaries: include parameter names, defaults, file locations, and rationale. Keep concepts organized as markdown sections (`## Heading`) and keep hierarchy shallow.

**Recording discipline:** Only record outcomes, decisions, and discoveries — never phase transitions, status changes, or ceremony checkpoints. If an entry would only say "we started phase X", don't add it. Memory files preserve *knowledge*, not *activity logs*.

**Read discipline:**

- Read only the `.memory/` files relevant to the current task; avoid broad re-reads that add no new signal.
- **Skip redundant reads** when you have already established this session that `.memory/` has no relevant content in that domain.
- **Do not immediately re-read content you just wrote.** You already have that context from the update.
- Treat `.memory/` as a **tool**, not a ritual. Every read should have a specific information need.

**Linking is required.** When recording related knowledge across files, add markdown cross-references (for example: `See [Decision: Auth](decisions.md#auth-approach)`). A section with no references becomes a dead end.

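
The linking rule above can be checked mechanically. The following is a minimal sketch, assuming memory files use `## Heading` sections and inline markdown links; `find_unlinked_sections` and `scan_memory` are hypothetical helper names, not part of any tool mentioned in this file. It does not skip fenced code blocks, so a `## ` line inside a fence would be treated as a heading.

```python
import re
from pathlib import Path

# Matches inline markdown links such as [Decision: Auth](decisions.md#auth-approach).
LINK_RE = re.compile(r"\[[^\]]+\]\([^)]+\)")


def find_unlinked_sections(markdown: str) -> list[str]:
    """Return the `## ` headings whose section body contains no markdown link."""
    unlinked: list[str] = []
    heading = None
    has_link = False
    for line in markdown.splitlines():
        if line.startswith("## "):
            # Close out the previous section before starting a new one.
            if heading is not None and not has_link:
                unlinked.append(heading)
            heading = line[3:].strip()
            has_link = False
        elif heading is not None and LINK_RE.search(line):
            has_link = True
    if heading is not None and not has_link:
        unlinked.append(heading)
    return unlinked


def scan_memory(root: Path = Path(".memory")) -> dict[str, list[str]]:
    """Map each markdown file under `root` to its sections that have no links."""
    report: dict[str, list[str]] = {}
    for path in sorted(root.rglob("*.md")):
        missing = find_unlinked_sections(path.read_text(encoding="utf-8"))
        if missing:
            report[str(path.relative_to(root))] = missing
    return report
```

Calling `scan_memory()` from the project root would list candidate dead-end sections per file. Treat the result as a prompt for review, since some sections may legitimately stand alone.
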
## Cross-Tool Instruction Files

Use symlinks to share this instruction file across all agentic coding tools:

```text
project/
├── AGENTS.md                          # Real file (edit this one)
├── CLAUDE.md -> AGENTS.md
├── .cursorrules -> AGENTS.md
└── .github/
    └── copilot-instructions.md -> ../AGENTS.md
```

**Rules:**

- Edit `AGENTS.md` — changes propagate automatically via symlinks
- Never edit symlinked files directly (changes would be lost)
- Symlinks are committed to git (git tracks them natively)

**Content of this file:**

- Project overview and purpose
- Tech stack and architecture
- Coding conventions and patterns
- Build/test/lint commands
- Project structure overview

**Do NOT duplicate `.memory/` contents** — instruction files describe how to work with the project, not active plans, research, or decisions.

**When initializing a project:**

1. Create `AGENTS.md` with project basics
2. Create symlinks: `ln -s AGENTS.md CLAUDE.md`, `ln -s AGENTS.md .cursorrules`, `ln -s ../AGENTS.md .github/copilot-instructions.md`
3. Commit the real file and symlinks to git

**When joining an existing project:**

- Read `AGENTS.md` (or any of the symlinked files) to understand the project
- If instruction file is missing, create it and the symlinks

## Session Continuity

- Treat `.memory/` files as the persistent tracking system for work across sessions.
- At session start, identify prior in-progress work items and pending decisions before doing new implementation.
- After implementation, update `.memory/` files with what changed, why it changed, and what remains next.

## Clarification Rule

- If requirements are genuinely unclear, materially ambiguous, or have multiple valid interpretations that would lead to **materially different implementations**, use the `question` tool to clarify before committing to an implementation path.
- **Do not ask for clarification when the user's intent is obvious.** If the user explicitly states what they want (e.g., "update X and also update Y"), do not ask "should I do both?" — proceed with the stated request.
- Implementation-level decisions (naming, file organization, approach) are the agent's job, not the user's. Only escalate decisions that affect **user-visible behavior or scope**.

## Agent Roster

| Agent | Role | Model |
|---|---|---|
| `lead` | Primary orchestrator that decomposes work, delegates, and synthesizes outcomes. | `github-copilot/claude-opus-4` (global default) |
| `coder` | Implementation-focused coding agent for reliable code changes. | `github-copilot/gpt-5.3-codex` |
| `reviewer` | Read-only code/source review; writes `.memory/*` for verdict records. | `github-copilot/claude-opus-4.6` |
| `tester` | Validation agent for standard + adversarial testing; writes `.memory/*` for test outcomes. | `github-copilot/claude-sonnet-4.6` |
| `explorer` | Fast read-only codebase mapper; writes `.memory/*` for discovery records. | `github-copilot/claude-sonnet-4.6` |
| `researcher` | Deep technical investigator; writes `.memory/*` for research findings. | `github-copilot/claude-opus-4.6` |
| `librarian` | Documentation coverage and accuracy specialist. | `github-copilot/claude-opus-4.6` |
| `critic` | Pre-implementation gate and blocker sounding board; writes `.memory/*` for verdicts. | `github-copilot/claude-opus-4.6` |
| `sme` | Subject-matter expert for domain-specific consultation; writes `.memory/*` for guidance cache. | `github-copilot/claude-opus-4.6` |
| `designer` | UI/UX specialist for interaction and visual guidance; writes `.memory/*` for design decisions. | `github-copilot/claude-sonnet-4.6` |

All agents except `lead`, `coder`, and `librarian` are code/source read-only but have `permission.edit: allow` scoped to `.memory/*` writes for their recording duties.
The `lead` and `librarian` have full edit access; `coder` has full edit access for implementation.

## Parallelization

- **Always parallelize independent work.** Any tool calls that do not depend on each other's output must be issued in the same message as parallel calls — never sequentially. This applies to bash commands, file reads, and subagent delegations alike.
- Before issuing a sequence of calls, ask: *"Does call B require the result of call A?"* If not, send them together.

## Human Checkpoint Triggers

When implementing features, the Lead must stop and request explicit user approval before dispatching coder work in these situations:

1. **Security-sensitive design**: Any feature involving encryption, auth flows, secret storage, token management, or permission model changes.
2. **Architectural ambiguity**: Multiple valid approaches with materially different tradeoffs that aren't resolvable from codebase conventions alone.
3. **Vision-dependent features**: Features where the user's intended UX or behavior model isn't fully specified by the request.
4. **New external dependencies**: Adding a service, SDK, or infrastructure component not already in the project.
5. **Data model changes with migration impact**: Schema changes affecting existing production data.

The checkpoint must present the specific decision, 2-3 concrete options with tradeoffs, a recommendation, and a safe default. Implementation-level decisions (naming, file organization, code patterns) are NOT checkpoints — only user-visible behavior and architectural choices qualify.

## Functional Verification (Implement → Verify → Iterate)

**Static analysis is not verification.** Type checks (`bun run check`, `tsc`), linters (`eslint`, `ruff`), and framework system checks (`python manage.py check`) confirm code is syntactically and structurally valid. They do NOT confirm the feature works. A feature that type-checks perfectly can be completely non-functional.

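
As an illustration of that gap, here is a small sketch in which a fully typed function silently ignores a setting. The `Settings` class, its `page_size` field, and all function names are hypothetical and exist only for this example. A type checker has nothing to report about either implementation; only the behavioral check tells them apart.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Settings:
    # Hypothetical setting, used only for this illustration.
    page_size: int = 20


def list_items_broken(items: list[str], settings: Settings) -> list[str]:
    # Fully typed and lint-clean, but the setting is never read.
    return items[:20]


def list_items_fixed(items: list[str], settings: Settings) -> list[str]:
    # Reads the setting, so changing it changes the result.
    return items[: settings.page_size]


def setting_affects_behavior(
    list_items: Callable[[list[str], Settings], list[str]],
) -> bool:
    """Functional check: change the setting and observe the output."""
    items = [f"item-{i}" for i in range(50)]
    small = list_items(items, Settings(page_size=5))
    large = list_items(items, Settings(page_size=30))
    return len(small) == 5 and len(large) == 30
```

Here `setting_affects_behavior(list_items_broken)` returns `False` and `setting_affects_behavior(list_items_fixed)` returns `True`, even though both functions have identical signatures.
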
**Every implemented feature MUST be functionally verified before being marked complete.** "Functionally verified" means demonstrating that the feature actually works end-to-end — not just that it compiles.

### What Counts as Functional Verification

Functional verification must exercise the **actual behavior path** a user would trigger:

- **API endpoints**: Make real HTTP requests (`curl`, `httpie`, or the app's test client) and verify response status, shape, and data correctness. Check both success and error paths.
- **Frontend components**: Verify the component renders, interacts correctly, and communicates with the backend. Use the browser (Playwright) or run the app's frontend test suite.
- **Database/model changes**: Verify migrations run, data can be created/read/updated/deleted through the ORM or API, and constraints are enforced.
- **Integration points**: When a feature spans frontend ↔ backend, verify the full round-trip: UI action → API call → database → response → UI update.
- **Configuration/settings**: Verify the setting is actually read and affects behavior — not just that the config key exists.

### What Does NOT Count as Functional Verification

These are useful but insufficient on their own:

- ❌ `bun run check` / `tsc --noEmit` (type checking)
- ❌ `bun run lint` / `eslint` / `ruff` (linting)
- ❌ `python manage.py check` (Django system checks)
- ❌ `bun run build` succeeding (build pipeline)
- ❌ Reading the code and concluding "this looks correct"
- ❌ Verifying file existence or import structure

### The Iterate-Until-Working Cycle

When functional verification reveals a problem:

1. **Diagnose** the root cause (not just the symptom).
2. **Fix** via coder dispatch with the specific failure context.
3. **Re-verify** the same functional test that failed.
4. **Repeat** until the feature demonstrably works.

A feature is "done" when it passes functional verification, not when the coder returns without errors.
The lead agent must never mark a task complete based solely on a clean coder return — the verification step is mandatory.

### Verification Scope by Change Type

| Change type | Minimum verification |
|---|---|
| New API endpoint | HTTP request with expected response verified |
| New UI feature | Browser-based or test-suite verification of render + interaction |
| Full-stack feature | End-to-end: UI → API → DB → response → UI update |
| Data model change | Migration runs + CRUD operations verified through API or ORM |
| Bug fix | Reproduce the bug scenario, verify it no longer occurs |
| Config/settings | Verify the setting changes observable behavior |
| Refactor (no behavior change) | Existing tests pass + spot-check one behavior path |

## Mandatory Quality Pipeline

**The reviewer and tester agents exist to be used — not decoratively.** Every non-trivial feature must go through the quality pipeline. Skipping reviewers or testers to "save time" creates broken features that cost far more time to debug later.

### Minimum Quality Requirements

- **Every feature gets a reviewer pass.** No exceptions for "simple" features — the session transcript showed that even apparently simple features (like provider selection) had critical bugs that a reviewer would have caught.
- **Every feature with user-facing behavior gets a tester pass.** The tester agent must be dispatched for any feature that a user would interact with. The tester validates functional behavior, not just code structure.
- **Features cannot be batch-validated.** Each feature gets its own review → test cycle. "I'll review all 6 workstreams at the end" is not acceptable — bugs compound and become harder to diagnose.

### The Lead Must Not Skip the Pipeline Under Time Pressure

Even when there are many features to implement, the quality pipeline is non-negotiable. It is better to ship 3 working features than 6 broken ones. If scope must be reduced to maintain quality, reduce scope — do not reduce quality.

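
The completion rules from the verification and quality sections can be read as a simple per-feature gate. The sketch below is one possible encoding, assuming a hypothetical `FeatureStatus` record that the lead might keep for each feature; none of these names come from an existing tool.

```python
from dataclasses import dataclass


@dataclass
class FeatureStatus:
    # Hypothetical per-feature record; field names are illustrative only.
    name: str
    user_facing: bool
    functionally_verified: bool = False
    reviewed: bool = False
    tested: bool = False


def missing_gates(feature: FeatureStatus) -> list[str]:
    """List the quality gates this feature has not yet passed."""
    missing: list[str] = []
    if not feature.functionally_verified:
        missing.append("functional verification")
    if not feature.reviewed:
        missing.append("reviewer pass")
    # A tester pass is required only for features a user interacts with.
    if feature.user_facing and not feature.tested:
        missing.append("tester pass")
    return missing


def can_mark_complete(feature: FeatureStatus) -> bool:
    """A feature is complete only when no gate is missing."""
    return not missing_gates(feature)
```

Because the gate is evaluated per feature, it mirrors the rule that features cannot be batch-validated: each `FeatureStatus` has to clear its own review and test cycle.
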
## Requirement Understanding Verification

Before implementing a feature, the lead must verify its understanding of what the user actually wants — especially for features involving:

- **User-facing behavior models** (e.g., "the app should learn from my data" vs. "the user manually inputs preferences")
- **Implicit expectations** (e.g., "show available providers" implies showing which ones are *configured*, not just listing all possible providers)
- **Domain-specific concepts** (e.g., in a travel app, "preferences" might mean auto-learned travel patterns, not a settings form)

When in doubt, ask. A 30-second clarification prevents hours of rework on a fundamentally misunderstood feature.

This complements the Clarification Rule above — that rule covers *ambiguous requirements*; this rule covers *requirements that seem clear but may be misunderstood*. The test: "If I'm wrong about what this means, would I build something completely different?" If yes, verify.