feat: add verification and debugging workflow skills

.config/opencode/skills/systematic-debugging/SKILL.md (new file, 92 lines)

---
name: systematic-debugging
description: Use when encountering bugs, test failures, or unexpected behavior before proposing fixes
permalink: opencode-config/skills/systematic-debugging/skill
---

# Systematic Debugging

## Overview

Random fix attempts create churn and often introduce new issues.

**Core principle:** always identify the root cause before attempting fixes.

## The Iron Law

```
NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST
```

If Phase 1 is incomplete, do not propose or implement fixes.

## When to Use

Use for any technical issue:
- Test failures
- Unexpected runtime behavior
- Build or CI failures
- Integration breakages
- Performance regressions

Use this especially when:
- You are under time pressure
- A "quick patch" seems obvious
- Previous fix attempts did not work
- You do not yet understand why the issue occurs

## Four-Phase Process

Complete each phase in order.

### Phase 1: Root-Cause Investigation

1. Read error messages and stack traces fully.
2. Reproduce the issue reliably with exact steps.
3. Check recent changes (code, config, dependencies, environment).
4. Gather evidence at component boundaries (inputs, outputs, config propagation).
5. Trace data flow backward to the original trigger.

For deeper tracing techniques, see `root-cause-tracing.md`.

### Phase 2: Pattern Analysis

1. Find similar working code in the same repository.
2. Compare broken and working paths line by line.
3. List all differences, including small ones.
4. Identify required dependencies and assumptions.

### Phase 3: Hypothesis and Minimal Testing

1. State one concrete hypothesis: "X is failing because Y".
2. Make the smallest possible change to test only that hypothesis.
3. Verify the result before making any additional changes.
4. If the test fails, form a new hypothesis from the new evidence.
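
A minimal hypothesis test can often be a single isolated check rather than a speculative code change. A sketch with hypothetical names (`parseAmount` is not from this repository):

```ts
// Hypothesis: parseAmount fails because inputs with thousands separators
// ("1,200") are not stripped before Number(). (All names are hypothetical.)
function parseAmount(raw: string): number {
  return Number(raw); // current, suspect behavior
}

// Smallest possible test of only that hypothesis:
const observed = parseAmount('1,200');
const hypothesisConfirmed = Number.isNaN(observed); // NaN => the separator is the cause
```

If `hypothesisConfirmed` is false, the separator theory is ruled out and the new evidence drives the next hypothesis.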

### Phase 4: Fix and Verify

1. Create a minimal failing reproduction (an automated test when possible).
2. Implement one fix targeting the identified root cause.
3. Verify the issue is resolved and no regressions were introduced.
4. If fix attempts keep failing, stop and reassess design assumptions.

## Red Flags (Stop and Restart at Phase 1)

- "Let me try this quick fix first"
- "I'll batch several changes and see what works"
- "It's probably X"
- Proposing solutions before tracing the data flow
- Continuing repeated fix attempts without new evidence

## Supporting Techniques

Use these companion references while executing this process:

- `root-cause-tracing.md` — trace failures backward through the call chain
- `condition-based-waiting.md` — replace arbitrary sleeps with condition polling
- `defense-in-depth.md` — add layered validation so recurrence is harder

## Related Skills

- `test-driven-development` — build minimal failing tests and iterate safely
- `verification-before-completion` — confirm behavior end-to-end before claiming done

condition-based-waiting.md (new file, 68 lines)

---
title: condition-based-waiting
type: note
permalink: opencode-config/skills/systematic-debugging/condition-based-waiting
---

# Condition-Based Waiting

## Overview

Arbitrary sleep durations create flaky tests and race conditions.

**Core principle:** wait for the condition that proves readiness, not a guessed delay.

## When to Use

Use this when:
- Tests rely on `sleep` or fixed `setTimeout` delays
- Asynchronous operations complete at variable speeds
- Tests pass locally but fail in CI or under load

Avoid arbitrary waits except when explicitly validating timing behavior (for example, debounce intervals), and document why timing-based waiting is necessary.

## Core Pattern

```ts
// ❌ Timing guess
await new Promise((r) => setTimeout(r, 100));

// ✅ Condition wait
await waitFor(() => getState() === 'ready', 'state ready');
```

## Generic Helper

```ts
async function waitFor<T>(
  condition: () => T | false | undefined | null,
  description: string,
  timeoutMs = 5000,
  pollMs = 10
): Promise<T> {
  const started = Date.now();

  while (true) {
    const result = condition();
    if (result) return result;

    if (Date.now() - started > timeoutMs) {
      throw new Error(`Timeout waiting for ${description} after ${timeoutMs}ms`);
    }

    await new Promise((r) => setTimeout(r, pollMs));
  }
}
```

## Practical Guidance

- Keep polling intervals modest (for example, 10ms) to avoid hot loops.
- Always include a timeout and an actionable error message.
- Query fresh state inside the loop; do not cache stale values outside it.
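
A self-contained usage sketch of the pattern (the helper is re-declared here so the example runs standalone; `demo` and its state flag are hypothetical):

```ts
async function waitFor<T>(
  condition: () => T | false | undefined | null,
  description: string,
  timeoutMs = 5000,
  pollMs = 10
): Promise<T> {
  const started = Date.now();
  while (true) {
    const result = condition(); // query fresh state every iteration
    if (result) return result;
    if (Date.now() - started > timeoutMs) {
      throw new Error(`Timeout waiting for ${description} after ${timeoutMs}ms`);
    }
    await new Promise((r) => setTimeout(r, pollMs));
  }
}

async function demo(): Promise<string> {
  let state = 'starting';
  setTimeout(() => { state = 'ready'; }, 50); // readiness arrives at an unpredictable time
  // No guessed delay: the wait ends as soon as the condition holds.
  return waitFor(() => state === 'ready' && state, 'state ready');
}
```

The same shape works for server startup, DOM readiness, or file appearance; only the condition changes.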

## Common Mistakes

- Polling too aggressively (high CPU, little benefit)
- Waiting forever without a timeout
- Mixing arbitrary delays and condition checks without rationale

defense-in-depth.md (new file, 64 lines)

---
title: defense-in-depth
type: note
permalink: opencode-config/skills/systematic-debugging/defense-in-depth
---

# Defense in Depth

## Overview

A single validation check can be bypassed by alternate paths, refactors, or test setup differences.

**Core principle:** add validation at multiple layers so one missed check does not recreate the same failure.

## Layered Validation Model

### Layer 1: Entry Validation
Reject obviously invalid input at boundaries (CLI/API/public methods).

### Layer 2: Business-Logic Validation
Re-validate assumptions where operations are performed.

### Layer 3: Environment Guards
Block dangerous operations in sensitive contexts (for example, test/runtime safety guards).

### Layer 4: Diagnostic Context
Emit enough structured debug information to support future root-cause analysis.

## Applying the Pattern

1. Trace the real data flow from entry to failure.
2. Mark all checkpoints where invalid state could be detected.
3. Add targeted validation at each relevant layer.
4. Verify each layer can catch invalid input independently.

## Example Shape

```ts
function createWorkspace(path: string) {
  // Layer 1: entry
  if (!path || path.trim() === '') {
    throw new Error('path is required');
  }

  // Layer 2: operation-specific
  if (!isPathAllowed(path)) {
    throw new Error(`path not allowed: ${path}`);
  }
}

async function dangerousOperation(path: string) {
  // Layer 3: environment guard
  if (process.env.NODE_ENV === 'test' && !isSafeTestPath(path)) {
    throw new Error(`refusing unsafe path in test mode: ${path}`);
  }

  // Layer 4: diagnostic context
  console.error('operation context', { path, cwd: process.cwd(), stack: new Error().stack });
}
```

## Key Outcome

Root-cause fixes prevent recurrence at the origin. Layered validation reduces the chance that adjacent paths can reintroduce the same class of bug.

root-cause-tracing.md (new file, 66 lines)

---
title: root-cause-tracing
type: note
permalink: opencode-config/skills/systematic-debugging/root-cause-tracing
---

# Root-Cause Tracing

## Overview

Many bugs surface deep in a stack trace, but the origin is often earlier in the call chain.

**Core principle:** trace backward to the original trigger, then fix at the source.

## When to Use

Use this when:
- The symptom appears far from where bad input was introduced
- The call chain spans multiple layers or components
- You can see the failure but cannot yet explain its origin

## Tracing Process

1. **Capture the symptom clearly**
   - Exact error text, stack frame, and context.

2. **Find the immediate failure point**
   - Identify the exact operation that throws or misbehaves.

3. **Walk one frame up**
   - Determine who called it and with which values.

4. **Repeat until you reach the source**
   - Continue tracing callers and values backward until you find where the invalid state or data originated.

5. **Fix at the source**
   - Correct the earliest trigger rather than patching downstream symptoms.

## Instrumentation Tips

When manual tracing is hard, add targeted instrumentation before the risky operation:

```ts
const stack = new Error().stack;
console.error('debug context', {
  input,
  cwd: process.cwd(),
  envMode: process.env.NODE_ENV,
  stack,
});
```

Guidelines:
- Log before failure-prone operations, not after.
- Include values that influence behavior.
- Capture stack traces for call-path evidence.

## Common Mistake

**Mistake:** fixing where the error appears because it is visible.

**Better:** trace backward and fix where incorrect state is first introduced.

## Pair with Layered Defenses

After fixing the source, apply layered validation from `defense-in-depth.md` so similar failures are blocked earlier in the future.

.config/opencode/skills/test-driven-development/SKILL.md (new file, 77 lines)

---
name: test-driven-development
description: Enforce test-first development for features and bug fixes — no production code before a failing test
permalink: opencode-config/skills/test-driven-development/skill
---

# Test-Driven Development (TDD)

## When to Use

Use this skill when implementing behavior changes:
- New features
- Bug fixes
- Refactors that alter behavior

If the work introduces or changes production behavior, TDD applies.

## Core Rule

```
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
```

If production code was written first, delete or revert it and restart from a failing test.

## Red → Green → Refactor Loop

### 1) RED: Write one failing test
- Write one small test that expresses the next expected behavior.
- Prefer clear test names describing observable behavior.
- Use real behavior paths where practical; mock only when isolation is required.

### 2) Verify RED (mandatory)
Run the new test and confirm:
- It fails (not just errors)
- It fails for the expected reason
- It fails because the behavior is missing, not because the test is broken

If it passes immediately, the test is not proving the new behavior. Fix the test first.

### 3) GREEN: Add minimal production code
- Implement only enough code to make the failing test pass.
- Do not add extra features, abstractions, or speculative options.

### 4) Verify GREEN (mandatory)
Run the test suite at the scope needed for confidence:
- The new test passes
- Related tests still pass

If failures appear, fix the production code first unless requirements changed.

### 5) REFACTOR
- Improve names, remove duplication, and simplify structure.
- Keep behavior unchanged.
- Keep tests green throughout.

Repeat for the next behavior.
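
One full cycle of the loop, sketched framework-free with a hypothetical `slugify` helper (assume the function did not exist when the test was written, so the first run failed for the expected reason):

```ts
// GREEN: minimal implementation, added only after observing the RED failure.
function slugify(title: string): string {
  return title.trim().toLowerCase().replace(/\s+/g, '-');
}

// RED (written first): one small test naming the observable behavior.
function test_slugify_joins_words_with_hyphens(): void {
  const result = slugify('  Hello TDD World ');
  if (result !== 'hello-tdd-world') {
    throw new Error(`expected "hello-tdd-world", got "${result}"`);
  }
}

test_slugify_joins_words_with_hyphens(); // passes only after the minimal GREEN step
```

The REFACTOR step would then rename or simplify while this test stays green.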

## Quality Checks Before Completion

- [ ] Each behavior change has a test that failed before implementation
- [ ] New tests failed for the expected reason first
- [ ] Production code was added only after RED was observed
- [ ] Tests now pass cleanly
- [ ] Edge cases for the changed behavior are covered

## Practical Guardrails

- "I'll write tests after" is not TDD.
- Manual verification does not replace automated failing-then-passing tests.
- If a test is hard to write, treat it as design feedback and simplify the interfaces.
- Keep test intent focused on behavior, not internals.

## Related Reference

For common mistakes around mocks and test design, see [testing-anti-patterns](./testing-anti-patterns.md).

testing-anti-patterns.md (new file, 83 lines)

---
title: testing-anti-patterns
type: note
permalink: opencode-config/skills/test-driven-development/testing-anti-patterns
---

# Testing Anti-Patterns

Use this reference when writing or changing tests, introducing mocks, or considering test-only production APIs.

## Core Principle

Test real behavior, not mock behavior.

Mocks are isolation tools, not the subject under test.

## Anti-Pattern 1: Testing mock existence instead of behavior

**Problem:** Assertions only prove a mock rendered or was called, not that the business behavior is correct.

**Fix:** Assert the observable behavior of the unit or system under test. If possible, avoid mocking the component being validated.

Gate check before asserting on mocked elements:
- Am I validating system behavior or only that a mock exists?
- If only mock existence, rewrite the test.
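
A before/after sketch with hypothetical names (`formatTotal` is the unit under test, `currencyFmt` a collaborator):

```ts
const currencyFmt = { format: (n: number) => `$${n.toFixed(2)}` };

function formatTotal(prices: number[]): string {
  const total = prices.reduce((a, b) => a + b, 0);
  return currencyFmt.format(total);
}

// ❌ Mock-existence check: a spy on currencyFmt.format proves only that the
//    mock was called, not that the total is correct.

// ✅ Behavior check: assert the observable output of the unit under test.
const result = formatTotal([19.99, 5.01]);
if (result !== '$25.00') throw new Error(`expected "$25.00", got "${result}"`);
```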

## Anti-Pattern 2: Adding test-only methods to production code

**Problem:** Production classes gain methods used only by tests (cleanup hooks, debug helpers), polluting real APIs.

**Fix:** Move test-only setup and cleanup into test utilities or fixtures.

Gate check before adding a production method:
- Is this method needed for production behavior?
- Is this resource lifecycle actually owned by this class?
- If not, keep it out of production code.

## Anti-Pattern 3: Mocking without understanding dependencies

**Problem:** High-level mocks remove side effects the test depends on, causing false positives and negatives.

**Fix:** Understand the dependency flow first, then mock the lowest-cost external boundary while preserving the behavior the test needs.

Gate check before adding a mock:
1. What side effects does the real method perform?
2. Which side effects does this test rely on?
3. Can I mock a lower-level boundary instead?

If unsure, run against the real implementation first, then add minimal mocking.

## Anti-Pattern 4: Incomplete mock structures

**Problem:** Mocks include only the fields used immediately, omitting fields consumed downstream.

**Fix:** Mirror the complete response or object shapes used in real flows.

Gate check for mocked data:
- Does this mock match the real schema fully enough for downstream consumers?
- If uncertain, include the full documented structure.
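
A sketch of the difference, with a hypothetical `UserResponse` shape:

```ts
interface UserResponse {
  id: string;
  name: string;
  settings: { locale: string };
}

// ❌ Partial mock: only the field the first consumer reads.
//    const partial = { id: 'u1' } as UserResponse; // downstream .settings.locale crashes

// ✅ Complete mock mirroring the real shape end to end.
const mockUser: UserResponse = {
  id: 'u1',
  name: 'Test User',
  settings: { locale: 'en-US' },
};

// A downstream consumer that silently depends on the nested field:
function greeting(user: UserResponse): string {
  return `Hello ${user.name} (${user.settings.locale})`;
}
```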

## Anti-Pattern 5: Treating tests as a follow-up phase

**Problem:** "Implementation complete, tests later" breaks TDD and reduces confidence.

**Fix:** Keep tests inside the implementation loop:
1. Write a failing test
2. Implement the minimum code
3. Re-run the tests
4. Refactor safely

## Quick Red Flags

- Assertions target `*-mock` markers rather than behavior outcomes
- Production classes have methods that exist only for tests
- Mock setup dominates the test logic
- You cannot explain why each mock is necessary
- Tests are written only after the code "already works"

## Bottom Line

If a test does not fail first for the intended reason, it is not reliably validating the behavior change.

Keep TDD strict: failing test first, then minimal code.

.config/opencode/skills/verification-before-completion/SKILL.md (new file, 47 lines)

---
name: verification-before-completion
description: Require fresh verification evidence before any completion or success claim
permalink: opencode-config/skills/verification-before-completion/skill
---

# Verification Before Completion

## When to Load

Load this skill immediately before claiming work is complete, fixed, or passing.

## Core Rule

```
NO COMPLETION OR SUCCESS CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE
```

If you did not run the relevant verification command for this change, do not claim success.

## Verification Gate

Before any completion statement:

1. **Identify** the exact command that proves the claim.
2. **Run** the full command now (no cached or earlier output).
3. **Check** the exit code and output details (failure count, errors, warnings as relevant).
4. **Report** the result with concrete evidence.
   - If verification fails, report the failure status and the next fix step.
   - If verification passes, state success and include the proof.
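
The gate can be expressed as a small decision function (a sketch with assumed names; the real evidence would come from actually running the command):

```ts
interface VerificationResult {
  command: string;   // the exact command that was run
  exitCode: number;  // from the fresh run, not a cached one
  failures: number;  // parsed from the fresh output
}

// null means "no fresh run exists", which forbids any success claim.
function completionClaim(result: VerificationResult | null): string {
  if (result === null) {
    return 'NOT VERIFIED: run the verification command before claiming success';
  }
  if (result.exitCode !== 0 || result.failures > 0) {
    return `FAILED: ${result.command} exited ${result.exitCode} with ${result.failures} failure(s)`;
  }
  return `VERIFIED: ${result.command} exited 0 with 0 failures`;
}
```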

## Common Proof Examples

- **Tests pass** → a fresh test run shows the expected suite and zero failures.
- **Lint is clean** → a fresh lint run shows zero errors.
- **Build succeeds** → a fresh build run exits 0.
- **Bug is fixed** → the reproduction scenario now passes after the fix.
- **Requirements are met** → the checklist is re-verified against the implemented result.

## Anti-Patterns

- "Should pass" / "probably fixed" / "looks good"
- Claiming completion from partial checks
- Relying on old command output
- Trusting status reports without independent verification

## Bottom Line

Run the right command, inspect the output, then make the claim.