feat: add verification and debugging workflow skills

alex
2026-03-11 11:27:40 +00:00
parent 57f577980e
commit e03234a0df
7 changed files with 497 additions and 0 deletions

---
name: systematic-debugging
description: Use when encountering bugs, test failures, or unexpected behavior before proposing fixes
permalink: opencode-config/skills/systematic-debugging/skill
---
# Systematic Debugging
## Overview
Random fix attempts create churn and often introduce new issues.
**Core principle:** always identify root cause before attempting fixes.
## The Iron Law
```
NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST
```
If Phase 1 is incomplete, do not propose or implement fixes.
## When to Use
Use for any technical issue:
- Test failures
- Unexpected runtime behavior
- Build or CI failures
- Integration breakages
- Performance regressions
Use this especially when:
- You are under time pressure
- A "quick patch" seems obvious
- Previous fix attempts did not work
- You do not yet understand why the issue occurs
## Four-Phase Process
Complete each phase in order.
### Phase 1: Root-Cause Investigation
1. Read error messages and stack traces fully.
2. Reproduce the issue reliably with exact steps.
3. Check recent changes (code, config, dependency, environment).
4. Gather evidence at component boundaries (inputs, outputs, config propagation).
5. Trace data flow backward to the original trigger.
For deeper tracing techniques, see `root-cause-tracing.md`.
### Phase 2: Pattern Analysis
1. Find similar working code in the same repository.
2. Compare broken and working paths line by line.
3. List all differences, including small ones.
4. Identify required dependencies and assumptions.
### Phase 3: Hypothesis and Minimal Testing
1. State one concrete hypothesis: "X is failing because Y".
2. Make the smallest possible change to test only that hypothesis.
3. Verify result before making any additional changes.
4. If the test fails, form a new hypothesis from new evidence.
### Phase 4: Fix and Verify
1. Create a minimal failing reproduction (automated test when possible).
2. Implement one fix targeting the identified root cause.
3. Verify the issue is resolved and no regressions were introduced.
4. If fix attempts keep failing, stop and reassess design assumptions.
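The minimal failing reproduction in step 1 can be as small as a single assertion that captures the observed failure. A hedged sketch, using a hypothetical `lastLine` function with an assumed fencepost bug (all names here are illustrative, not from a real codebase):

```typescript
// Hypothetical minimal repro for a fencepost bug: `lastLine` used to return
// the second-to-last entry. The assertion below failed before the fix.
function lastLine(text: string): string {
  const lines = text.split('\n');
  return lines[lines.length - 1]; // root-cause fix: was `lines.length - 2`
}

if (lastLine('a\nb\nc') !== 'c') throw new Error('repro still failing');
console.log('repro passes after root-cause fix');
```

Keeping the repro this small makes it cheap to re-run in step 3 and easy to keep as a permanent regression test.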
## Red Flags (Stop and Restart at Phase 1)
- "Let me try this quick fix first"
- "I'll batch several changes and see what works"
- "It's probably X" (naming a cause without evidence)
- Proposing solutions before tracing the data flow
- Repeating fix attempts without new evidence
## Supporting Techniques
Use these companion references while executing this process:
- `root-cause-tracing.md` — trace failures backward through the call chain
- `condition-based-waiting.md` — replace arbitrary sleeps with condition polling
- `defense-in-depth.md` — add layered validation so recurrence is harder
## Related Skills
- `test-driven-development` — build minimal failing tests and iterate safely
- `verification-before-completion` — confirm behavior end-to-end before claiming done

---
title: condition-based-waiting
type: note
permalink: opencode-config/skills/systematic-debugging/condition-based-waiting
---
# Condition-Based Waiting
## Overview
Arbitrary sleep durations create flaky tests and race conditions.
**Core principle:** wait for the condition that proves readiness, not a guessed delay.
## When to Use
Use this when:
- Tests rely on `sleep` or fixed `setTimeout` delays
- Asynchronous operations complete at variable speeds
- Tests pass locally but fail in CI or under load
Avoid arbitrary waits except when explicitly validating timing behavior (for example, a debounce interval); when a timed wait is genuinely required, document why.
## Core Pattern
```ts
// ❌ Timing guess
await new Promise((r) => setTimeout(r, 100));
// ✅ Condition wait
await waitFor(() => getState() === 'ready', 'state ready');
```
## Generic Helper
```ts
// Polls `condition` until it returns a truthy value, then resolves with it.
// Note: falsy results (0, '', false) read as "not ready yet"; wrap such values.
async function waitFor<T>(
condition: () => T | false | undefined | null,
description: string,
timeoutMs = 5000,
pollMs = 10
): Promise<T> {
const started = Date.now();
while (true) {
const result = condition();
if (result) return result;
if (Date.now() - started > timeoutMs) {
throw new Error(`Timeout waiting for ${description} after ${timeoutMs}ms`);
}
await new Promise((r) => setTimeout(r, pollMs));
}
}
```
## Practical Guidance
- Keep polling intervals modest (for example, 10ms) to avoid hot loops.
- Always include a timeout and actionable error message.
- Query fresh state inside the loop; do not cache stale values outside it.
## Common Mistakes
- Polling too aggressively (high CPU, little benefit)
- Waiting forever without timeout
- Mixing arbitrary delays and condition checks without rationale

---
title: defense-in-depth
type: note
permalink: opencode-config/skills/systematic-debugging/defense-in-depth
---
# Defense in Depth
## Overview
A single validation check can be bypassed by alternate paths, refactors, or test setup differences.
**Core principle:** add validation at multiple layers so one missed check does not recreate the same failure.
## Layered Validation Model
### Layer 1: Entry Validation
Reject obviously invalid input at boundaries (CLI/API/public methods).
### Layer 2: Business-Logic Validation
Re-validate assumptions where operations are performed.
### Layer 3: Environment Guards
Block dangerous operations in sensitive contexts (for example, test/runtime safety guards).
### Layer 4: Diagnostic Context
Emit enough structured debug information to support future root-cause analysis.
## Applying the Pattern
1. Trace real data flow from entry to failure.
2. Mark all checkpoints where invalid state could be detected.
3. Add targeted validation at each relevant layer.
4. Verify each layer can catch invalid input independently.
## Example Shape
```ts
function createWorkspace(path: string) {
// Layer 1: entry
if (!path || path.trim() === '') {
throw new Error('path is required');
}
// Layer 2: operation-specific
if (!isPathAllowed(path)) {
throw new Error(`path not allowed: ${path}`);
}
}
async function dangerousOperation(path: string) {
// Layer 3: environment guard
if (process.env.NODE_ENV === 'test' && !isSafeTestPath(path)) {
throw new Error(`refusing unsafe path in test mode: ${path}`);
}
// Layer 4: diagnostic context
console.error('operation context', { path, cwd: process.cwd(), stack: new Error().stack });
}
```
## Key Outcome
Root-cause fixes prevent recurrence at the origin. Layered validation reduces the chance that adjacent paths can reintroduce the same class of bug.

---
title: root-cause-tracing
type: note
permalink: opencode-config/skills/systematic-debugging/root-cause-tracing
---
# Root-Cause Tracing
## Overview
Many bugs appear deep in a stack trace, but the origin is often earlier in the call chain.
**Core principle:** trace backward to the original trigger, then fix at the source.
## When to Use
Use this when:
- The symptom appears far from where bad input was introduced
- The call chain spans multiple layers or components
- You can see failure but cannot yet explain origin
## Tracing Process
1. **Capture the symptom clearly**
- Exact error text, stack frame, and context.
2. **Find immediate failure point**
- Identify the exact operation that throws or misbehaves.
3. **Walk one frame up**
- Determine who called it and with which values.
4. **Repeat until source**
- Continue tracing callers and values backward until you find where invalid state/data originated.
5. **Fix at source**
- Correct the earliest trigger rather than patching downstream symptoms.
## Instrumentation Tips
When manual tracing is hard, add targeted instrumentation before the risky operation:
```ts
// `input` stands for whatever value the risky operation is about to consume.
const stack = new Error().stack;
console.error('debug context', {
input,
cwd: process.cwd(),
envMode: process.env.NODE_ENV,
stack,
});
```
Guidelines:
- Log before failure-prone operations, not after.
- Include values that influence behavior.
- Capture stack traces for call-path evidence.
## Common Mistake
**Mistake:** fixing where the error appears because it is visible.
**Better:** trace backward and fix where incorrect state is first introduced.
## Pair with Layered Defenses
After fixing the source, apply layered validation from `defense-in-depth.md` so similar failures are blocked earlier in the future.

---
name: test-driven-development
description: Enforce test-first development for features and bug fixes — no production code before a failing test
permalink: opencode-config/skills/test-driven-development/skill
---
# Test-Driven Development (TDD)
## When to Use
Use this skill when implementing behavior changes:
- New features
- Bug fixes
- Refactors that alter behavior
If the work introduces or changes production behavior, TDD applies.
## Core Rule
```
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
```
If production code was written first, delete or revert it and restart from a failing test.
## Red → Green → Refactor Loop
### 1) RED: Write one failing test
- Write one small test that expresses the next expected behavior.
- Prefer clear test names describing observable behavior.
- Use real behavior paths where practical; mock only when isolation is required.
### 2) Verify RED (mandatory)
Run the new test and confirm:
- It fails (not just errors)
- It fails for the expected reason
- It fails because behavior is missing, not because the test is broken
If it passes immediately, the test is not proving the new behavior. Fix the test first.
### 3) GREEN: Add minimal production code
- Implement only enough code to make the failing test pass.
- Do not add extra features, abstractions, or speculative options.
### 4) Verify GREEN (mandatory)
Run the test suite scope needed for confidence:
- New test passes
- Related tests still pass
If failures appear, fix production code first unless requirements changed.
### 5) REFACTOR
- Improve names, remove duplication, and simplify structure.
- Keep behavior unchanged.
- Keep tests green throughout.
Repeat for the next behavior.
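One iteration of the loop above can be sketched in a few lines. This is a hedged illustration with a hypothetical `slugify` function and a bare-bones assert helper; the names are invented for the example, not taken from a real suite:

```typescript
// Minimal assert helper so the sketch runs standalone.
function assertEqual(actual: string, expected: string, label: string): void {
  if (actual !== expected) {
    throw new Error(`RED: ${label}: got "${actual}", want "${expected}"`);
  }
  console.log(`GREEN: ${label}`);
}

// RED: the assertion below fails until slugify lower-cases and hyphenates.
// GREEN: minimal implementation — just enough to pass, nothing speculative.
function slugify(title: string): string {
  return title.trim().toLowerCase().replace(/\s+/g, '-');
}

assertEqual(slugify('Hello World'), 'hello-world', 'slugify basic title');
```

The refactor step would then clean up naming or duplication while this assertion stays green.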
## Quality Checks Before Completion
- [ ] Each behavior change has a test that failed before implementation
- [ ] New tests failed for the expected reason first
- [ ] Production code was added only after RED was observed
- [ ] Tests now pass cleanly
- [ ] Edge cases for changed behavior are covered
## Practical Guardrails
- "I'll write tests after" is not TDD.
- Manual verification does not replace automated failing-then-passing tests.
- If a test is hard to write, treat it as design feedback and simplify interfaces.
- Keep test intent focused on behavior, not internals.
## Related Reference
For common mistakes around mocks and test design, see [testing-anti-patterns](./testing-anti-patterns.md).

---
title: testing-anti-patterns
type: note
permalink: opencode-config/skills/test-driven-development/testing-anti-patterns
---
# Testing Anti-Patterns
Use this reference when writing/changing tests, introducing mocks, or considering test-only production APIs.
## Core Principle
Test real behavior, not mock behavior.
Mocks are isolation tools, not the subject under test.
## Anti-Pattern 1: Testing mock existence instead of behavior
**Problem:** Assertions only prove a mock rendered or was called, not that business behavior is correct.
**Fix:** Assert observable behavior of the unit/system under test. If possible, avoid mocking the component being validated.
Gate check before assertions on mocked elements:
- Am I validating system behavior or only that a mock exists?
- If only mock existence, rewrite the test.
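The difference can be made concrete with a small sketch. All names here are hypothetical: a `notifyOnError` unit under test, and a fake `send` used only for isolation:

```typescript
// The unit under test: decides whether to alert and what to send.
type Send = (msg: string) => void;

function notifyOnError(send: Send, code: number): boolean {
  if (code >= 500) {
    send(`alert: ${code}`);
    return true;
  }
  return false;
}

const sent: string[] = [];
const fakeSend: Send = (m) => { sent.push(m); };

// ❌ Mock-existence check alone would only prove the stub was called.
// ✅ Behavior check: assert the decision AND the message content together.
const alerted = notifyOnError(fakeSend, 503);
console.log(alerted, sent[0]); // true "alert: 503"
```

The fake is still present, but the assertions target the unit's observable decision, not the mock itself.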
## Anti-Pattern 2: Adding test-only methods to production code
**Problem:** Production classes gain methods used only by tests (cleanup hooks, debug helpers), polluting real APIs.
**Fix:** Move test-only setup/cleanup into test utilities or fixtures.
Gate check before adding a production method:
- Is this method needed in production behavior?
- Is this resource lifecycle actually owned by this class?
- If not, keep it out of production code.
## Anti-Pattern 3: Mocking without understanding dependencies
**Problem:** High-level mocks remove side effects the test depends on, causing false positives/negatives.
**Fix:** Understand dependency flow first, then mock the lowest-cost external boundary while preserving needed behavior.
Gate check before adding a mock:
1. What side effects does the real method perform?
2. Which side effects does this test rely on?
3. Can I mock a lower-level boundary instead?
If unsure, run against real implementation first, then add minimal mocking.
## Anti-Pattern 4: Incomplete mock structures
**Problem:** Mocks include only fields used immediately, omitting fields consumed downstream.
**Fix:** Mirror complete response/object shapes used in real flows.
Gate check for mocked data:
- Does this mock match the real schema/shape fully enough for downstream consumers?
- If uncertain, include the full documented structure.
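A sketch of why incomplete shapes bite downstream, using an invented `Profile` type and consumer (hypothetical names, for illustration only):

```typescript
// Real shape: downstream code reads the nested `address.city` field.
interface Profile {
  id: string;
  address: { city: string };
}

function cityOf(p: Profile): string {
  return p.address.city; // downstream consumer relies on the nested field
}

// ❌ Incomplete mock: satisfies the compiler via a cast, crashes at runtime.
const incomplete = { id: 'u1' } as Profile;
// ✅ Complete mock: mirrors the real schema far enough for all consumers.
const complete: Profile = { id: 'u1', address: { city: 'Oslo' } };

console.log(cityOf(complete)); // "Oslo"
try {
  cityOf(incomplete);
} catch {
  console.log('incomplete mock broke the downstream consumer');
}
```

The cast is exactly the kind of shortcut this anti-pattern warns about: the type checker is silenced while the runtime shape stays wrong.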
## Anti-Pattern 5: Treating tests as a follow-up phase
**Problem:** "Implementation complete, tests later" breaks TDD and reduces confidence.
**Fix:** Keep tests inside the implementation loop:
1. Write failing test
2. Implement minimum code
3. Re-run tests
4. Refactor safely
## Quick Red Flags
- Assertions target `*-mock` markers rather than behavior outcomes
- Methods exist only for tests in production classes
- Mock setup dominates test logic
- You cannot explain why each mock is necessary
- Tests are written only after code "already works"
## Bottom Line
If a test does not fail first for the intended reason, it is not validating the behavior change reliably.
Keep TDD strict: failing test first, then minimal code.

---
name: verification-before-completion
description: Require fresh verification evidence before any completion or success claim
permalink: opencode-config/skills/verification-before-completion/skill
---
# Verification Before Completion
## When to Load
Load this skill immediately before claiming work is complete, fixed, or passing.
## Core Rule
```
NO COMPLETION OR SUCCESS CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE
```
If you did not run the relevant verification command for this change, do not claim success.
## Verification Gate
Before any completion statement:
1. **Identify** the exact command that proves the claim.
2. **Run** the full command now (no cached or earlier output).
3. **Check** exit code and output details (failure count, errors, warnings as relevant).
4. **Report** the result with concrete evidence.
- If verification fails, report failure status and next fix step.
- If verification passes, state success and include proof.
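The gate above can be wrapped in a small helper that always ties the claim to a fresh exit code. A hedged sketch: `verify` is an invented name, and `true`/`false` stand in for real proving commands such as a test or build invocation:

```shell
# Run the proving command fresh and report with its actual exit code.
verify() {
  "$@"
  local status=$?
  if [ "$status" -eq 0 ]; then
    echo "VERIFIED: '$*' exited 0"
  else
    echo "NOT DONE: '$*' exited $status (fix before claiming success)"
  fi
}

verify true   # stands in for a passing test/build command
verify false  # stands in for a failing one
```

The point is mechanical: the success message cannot be printed without the command having just run and exited 0.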
## Common Proof Examples
- **Tests pass** → fresh test run shows expected suite and zero failures.
- **Lint is clean** → fresh lint run shows zero errors.
- **Build succeeds** → fresh build run exits 0.
- **Bug is fixed** → reproduction scenario now passes after the fix.
- **Requirements are met** → checklist is re-verified against the implemented result.
## Anti-Patterns
- "Should pass" / "probably fixed" / "looks good"
- Claiming completion from partial checks
- Relying on old command output
- Trusting status reports without independent verification
## Bottom Line
Run the right command, inspect the output, then make the claim.