Merge PR #137: feat: Add Superpowers software development skills
Authored by kaos35. Adds 5 software development workflow skills adapted from obra/superpowers: test-driven-development, systematic-debugging, subagent-driven-development, writing-plans, requesting-code-review.
This commit is contained in:
commit
0e1723ef74
5 changed files with 2048 additions and 0 deletions
393 lines  skills/software-development/requesting-code-review/SKILL.md  (new file)

@@ -0,0 +1,393 @@
---
name: requesting-code-review
description: Use when completing tasks, implementing major features, or before merging. Validates work meets requirements through systematic review process.
version: 1.0.0
author: Hermes Agent (adapted from Superpowers)
license: MIT
metadata:
  hermes:
    tags: [code-review, quality, validation, workflow, review]
    related_skills: [subagent-driven-development, writing-plans, test-driven-development]
---

# Requesting Code Review

## Overview

Systematic code review catches issues before they cascade. Review early, review often.

**Core principle:** Fresh perspective finds issues you'll miss.

## When to Request Review

**Mandatory Reviews:**
- After each task in subagent-driven development
- After completing major features
- Before merge to main
- After bug fixes

**Optional but Valuable:**
- When stuck (fresh perspective)
- Before refactoring (baseline check)
- After complex logic implementation
- When touching critical code (auth, payments, data)

**Don't skip because:**
- "It's simple" (simple bugs compound)
- "I'm in a hurry" (reviews save time)
- "I tested it" (you have blind spots)

## Review Process

### Step 1: Prepare Context

Gather:
- What was implemented
- Original requirements/plan
- Files changed
- Test results

```bash
# Get changed files
git diff --name-only HEAD~1

# Get diff summary
git diff --stat HEAD~1

# Get the last 5 commit messages
git log --oneline -5
```

### Step 2: Self-Review First

Before requesting external review:

**Checklist:**
- [ ] Code follows project conventions
- [ ] All tests pass
- [ ] No debug print statements
- [ ] No hardcoded secrets
- [ ] Error handling in place
- [ ] Documentation updated
- [ ] Commit messages are clear

```bash
# Run full test suite
pytest

# Check for debug code
grep -r "print(" src/ --include="*.py"
grep -r "debugger" src/ --include="*.js"

# Check for TODOs
grep -r "TODO" src/ --include="*.py"
```

### Step 3: Request Review

Use `delegate_task` to dispatch a reviewer subagent:

```python
delegate_task(
    goal="Review implementation for quality and correctness",
    context="""
    WHAT WAS IMPLEMENTED: [Brief description]

    ORIGINAL REQUIREMENTS: [From plan or issue]

    FILES CHANGED:
    - src/feature.py (added X function)
    - tests/test_feature.py (added tests)

    COMMIT RANGE: [SHA range or branch]

    CHECK FOR:
    - Correctness (does it do what it should?)
    - Edge cases handled?
    - Error handling adequate?
    - Code quality and style
    - Test coverage
    - Security issues
    - Performance concerns

    OUTPUT FORMAT:
    - Summary: [brief assessment]
    - Critical Issues: [must fix]
    - Important Issues: [should fix]
    - Minor Issues: [nice to have]
    - Verdict: [APPROVE / REQUEST_CHANGES / NEEDS_WORK]
    """,
    toolsets=['file']
)
```

### Step 4: Act on Feedback

**Critical Issues (Block merge):**
- Security vulnerabilities
- Broken functionality
- Data loss risk
- Test failures

**Action:** Fix immediately before proceeding

**Important Issues (Should fix):**
- Missing edge case handling
- Poor error messages
- Unclear code
- Missing tests

**Action:** Fix before merge if possible

**Minor Issues (Nice to have):**
- Style preferences
- Refactoring suggestions
- Documentation improvements

**Action:** Note for later or quick fix
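The triage above can be sketched as a small helper. This is a hypothetical illustration: the severity labels mirror the categories in this section, but the data shape (plain `(severity, description)` pairs) is an assumption, not any real Hermes or reviewer API.

```python
# Hypothetical triage helper mirroring the severity buckets above.
# Issues are modeled as (severity, description) pairs for illustration.

def triage(issues):
    """issues: list of (severity, description). Returns what blocks the merge."""
    buckets = {"critical": [], "important": [], "minor": []}
    for severity, description in issues:
        buckets[severity].append(description)
    return {
        "merge_blocked": bool(buckets["critical"]),   # only criticals block
        "fix_now": buckets["critical"],
        "fix_before_merge": buckets["important"],
        "note_for_later": buckets["minor"],
    }

report = triage([
    ("critical", "SQL injection in search endpoint"),
    ("minor", "inconsistent helper naming"),
])
assert report["merge_blocked"]
assert report["note_for_later"] == ["inconsistent helper naming"]
```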

## Review Dimensions

### Correctness

**Questions:**
- Does it implement the requirements?
- Are there logic errors?
- Do edge cases work?
- Are there race conditions?

**Check:**
- Read implementation against requirements
- Trace through edge cases
- Check boundary conditions

### Code Quality

**Questions:**
- Is code readable?
- Are names clear?
- Is it too complex?
- Is there duplication?

**Check:**
- Function length (aim <20 lines)
- Cyclomatic complexity
- DRY violations
- Naming clarity

### Testing

**Questions:**
- Are there tests?
- Do they cover edge cases?
- Are they meaningful?
- Do they pass?

**Check:**
- Test coverage
- Edge case coverage
- Test clarity
- Assertion quality

### Security

**Questions:**
- Any injection vulnerabilities?
- Proper input validation?
- Secrets handled correctly?
- Access control in place?

**Check:**
- Input sanitization
- Authentication/authorization
- Secret management
- SQL/query safety

### Performance

**Questions:**
- Any N+1 queries?
- Unnecessary computation?
- Memory leaks?
- Scalability concerns?

**Check:**
- Database queries
- Algorithm complexity
- Resource usage
- Caching opportunities

## Review Output Format

Standard review format:

```markdown
## Review Summary

**Assessment:** [Brief overall assessment]

**Verdict:** [APPROVE / REQUEST_CHANGES / NEEDS_WORK]

---

## Critical Issues (Fix Required)

1. **[Issue title]**
   - Location: `file.py:45`
   - Problem: [Description]
   - Suggestion: [How to fix]

---

## Important Issues (Should Fix)

1. **[Issue title]**
   - Location: `file.py:67`
   - Problem: [Description]
   - Suggestion: [How to fix]

---

## Minor Issues (Optional)

1. **[Issue title]**
   - Suggestion: [Improvement idea]

---

## Strengths

- [What was done well]
```

## Integration with Other Skills

### With subagent-driven-development

Review after EACH task:
1. Subagent implements task
2. Request code review
3. Fix issues
4. Proceed to next task

### With test-driven-development

Review checks:
- Tests written first?
- Tests are meaningful?
- Edge cases covered?
- All tests pass?

### With writing-plans

Review validates:
- Implementation matches plan?
- All tasks completed?
- Quality standards met?

## Common Review Patterns

### Pre-Merge Review

Before merging a feature branch:

```bash
# Create review checkpoint
git log --oneline main..feature-branch

# Get summary of changes
git diff --stat main..feature-branch
```

Then request the review:

```python
delegate_task(
    goal="Pre-merge review of feature branch",
    context="[changes, requirements, test results]"
)
```

Address feedback, then merge when approved.

### Continuous Review

During subagent-driven development:

```python
# After each task (pseudocode)
if task_complete:
    review_result = request_review(task)
    if review_result.has_critical_issues():
        fix_issues(review_result.critical)
        re_review()
    proceed_to_next_task()
```

### Emergency Review

When fixing production bugs:

1. Fix with tests
2. Self-review
3. Quick peer review
4. Deploy
5. Full review post-deploy

## Review Best Practices

### As Review Requester

**Do:**
- Provide complete context
- Highlight areas of concern
- Ask specific questions
- Be responsive to feedback
- Fix issues promptly

**Don't:**
- Rush the reviewer
- Argue without evidence
- Ignore feedback
- Take criticism personally

### As Reviewer (via subagent)

**Do:**
- Be specific about issues
- Suggest improvements
- Acknowledge what works
- Prioritize issues

**Don't:**
- Nitpick style (unless project requires)
- Make vague comments
- Block without explanation
- Be overly critical

## Quality Gates

**Must pass before merge:**
- [ ] No critical issues
- [ ] All tests pass
- [ ] Review approved
- [ ] Requirements met

**Should pass before merge:**
- [ ] No important issues
- [ ] Documentation updated
- [ ] Performance acceptable

**Nice to have:**
- [ ] No minor issues
- [ ] Extra polish
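As a sketch, the must-pass gates above can be expressed as a single merge check. The gate names and the `can_merge` helper are hypothetical, chosen to mirror the checklist; a real pipeline would wire these to CI results.

```python
# Hypothetical merge gate: every "must pass" check from the list above
# has to hold before the branch can merge.

MUST_PASS = ["no_critical_issues", "all_tests_pass",
             "review_approved", "requirements_met"]

def can_merge(gates):
    """gates: dict of gate name -> bool. Returns (ok, failing gate names)."""
    failing = [g for g in MUST_PASS if not gates.get(g, False)]
    return (len(failing) == 0, failing)

ok, failing = can_merge({
    "no_critical_issues": True,
    "all_tests_pass": True,
    "review_approved": False,   # review still pending
    "requirements_met": True,
})
assert not ok and failing == ["review_approved"]
```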

## Remember

```
Review early
Review often
Be specific
Fix critical issues first
Quality over speed
```

**A good review catches what you missed.**

375 lines  skills/software-development/subagent-driven-development/SKILL.md  (new file)

@@ -0,0 +1,375 @@
---
name: subagent-driven-development
description: Use when executing implementation plans with independent tasks. Dispatches fresh delegate_task per task with two-stage review (spec compliance then code quality).
version: 1.0.0
author: Hermes Agent (adapted from Superpowers)
license: MIT
metadata:
  hermes:
    tags: [delegation, subagent, implementation, workflow, parallel]
    related_skills: [writing-plans, requesting-code-review, test-driven-development]
---

# Subagent-Driven Development

## Overview

Execute implementation plans by dispatching fresh subagents per task with systematic two-stage review.

**Core principle:** Fresh subagent per task + two-stage review (spec then quality) = high quality, fast iteration

## When to Use

Use this skill when:
- You have an implementation plan (from writing-plans skill)
- Tasks are mostly independent
- You want to stay in the current session
- Quality and spec compliance are important

**vs. manual execution:**
- Parallel task execution possible
- Automated review process
- Consistent quality checks
- Better for complex multi-step plans

## The Process

### 1. Read and Parse Plan

```markdown
[Read plan file once: docs/plans/feature-plan.md]
[Extract all tasks with full text and context]
[Create todo list with all tasks]
```

**Action:** Read plan, extract tasks, create todo list.

### 2. Per-Task Workflow

For EACH task in the plan:

#### Step 1: Dispatch Implementer Subagent

Use `delegate_task` with:
- **goal:** Implement [specific task from plan]
- **context:** Full task description from plan, project structure, relevant files
- **toolsets:** ['terminal', 'file', 'web'] (or as needed)

**Example:**
```python
# Task: Add user authentication middleware
delegate_task(
    goal="Implement JWT authentication middleware as specified in Task 3 of the plan",
    context="""
    Task from plan:
    - Create: src/middleware/auth.py
    - Validate JWT tokens from Authorization header
    - Return 401 for invalid tokens
    - Attach user info to request object

    Project structure:
    - Flask app in src/app.py
    - Uses PyJWT library
    - Existing middleware pattern in src/middleware/
    """,
    toolsets=['terminal', 'file']
)
```

#### Step 2: Implementer Subagent Works

The subagent will:
1. Ask questions if needed (you answer)
2. Implement the task following TDD
3. Write tests
4. Run tests to verify
5. Self-review
6. Report completion

**Your role:** Answer questions, provide context.

#### Step 3: Spec Compliance Review

Dispatch a reviewer subagent:

```python
delegate_task(
    goal="Review if implementation matches spec from plan",
    context="""
    Original task spec: [copy from plan]
    Implementation: [file paths and key code]

    Check:
    - All requirements from spec implemented?
    - File paths match spec?
    - Behavior matches spec?
    - Nothing extra added?
    """,
    toolsets=['file']
)
```

**If spec issues found:**
- Subagent fixes gaps
- Re-run spec review
- Continue only when spec-compliant

#### Step 4: Code Quality Review

Dispatch a quality reviewer:

```python
delegate_task(
    goal="Review code quality and best practices",
    context="""
    Code to review: [file paths]

    Check:
    - Follows project style?
    - Proper error handling?
    - Good naming?
    - Test coverage adequate?
    - No obvious bugs?
    """,
    toolsets=['file']
)
```

**If quality issues found:**
- Subagent fixes issues
- Re-run quality review
- Continue only when approved

#### Step 5: Mark Complete

Update todo list, mark task complete.

### 3. Final Review

After ALL tasks complete:

```python
delegate_task(
    goal="Review entire implementation for consistency",
    context="All tasks completed, review for integration issues",
    toolsets=['file']
)
```

### 4. Branch Cleanup

Use `finishing-a-development-branch` skill:
- Verify all tests pass
- Present merge options
- Clean up worktree

## Task Granularity

**Good task size:** 2-5 minutes of focused work

**Examples:**

**Too big:**
- "Implement user authentication system"

**Right size:**
- "Create User model with email and password fields"
- "Add password hashing function"
- "Create login endpoint"
- "Add JWT token generation"

## Communication Pattern

### You to Subagent

**Provide:**
- Clear task description
- Exact file paths
- Expected behavior
- Success criteria
- Relevant context

**Example:**
```
Task: Add email validation
Files: Create src/validators/email.py
Expected: Function returns True for valid emails, False for invalid
Success: Tests pass for 10 test cases including edge cases
Context: Used in user registration flow
```
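A subagent's answer to the example task above might look like this minimal sketch. The module path and function name follow the task description; the regex is a deliberate simplification for illustration, not a full RFC 5322 validator.

```python
import re

# Sketch of src/validators/email.py from the example task above.
# Intentionally simple: local part, one @, dot-separated domain with a TLD.
_EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$")

def is_valid_email(value):
    """Return True for syntactically valid emails, False otherwise."""
    if not isinstance(value, str):
        return False
    return _EMAIL_RE.fullmatch(value) is not None

assert is_valid_email("user@example.com")
assert not is_valid_email("no-at-sign.example.com")
assert not is_valid_email("user@domain")   # no TLD
assert not is_valid_email("")
```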

### Subagent to You

**Expect:**
- Questions for clarification
- Progress updates
- Completion report
- Self-review summary

**Respond:**
- Answer questions promptly
- Provide missing context
- Approve approach decisions

## Two-Stage Review Details

### Stage 1: Spec Compliance

**Checks:**
- [ ] All requirements from plan implemented
- [ ] File paths match specification
- [ ] Function signatures match spec
- [ ] Behavior matches expected
- [ ] No scope creep (nothing extra)

**Output:** PASS or list of spec gaps

### Stage 2: Code Quality

**Checks:**
- [ ] Follows language conventions
- [ ] Consistent with project style
- [ ] Clear variable/function names
- [ ] Proper error handling
- [ ] Adequate test coverage
- [ ] No obvious bugs/edge cases missed
- [ ] Documentation if needed

**Output:** APPROVED or list of issues (critical/important/minor)
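The two stages compose into a per-task loop. This sketch assumes hypothetical `implement`, `spec_review`, and `quality_review` callables rather than any real `delegate_task` wrapper; it only shows the control flow (spec compliance gates quality review).

```python
# Hypothetical per-task loop: spec compliance is checked first, quality second,
# and the task only completes when both stages pass.

def run_task(implement, spec_review, quality_review, max_rounds=3):
    """Each argument is a callable; reviews return (passed, issues)."""
    implement()
    for _ in range(max_rounds):
        spec_ok, gaps = spec_review()
        if not spec_ok:
            implement(fix=gaps)          # close spec gaps first
            continue
        quality_ok, issues = quality_review()
        if not quality_ok:
            implement(fix=issues)        # then address quality issues
            continue
        return "complete"
    return "needs attention"
```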

## Handling Issues

### Critical Issues

**Examples:** Security vulnerability, broken functionality, data loss risk

**Action:** Must fix before proceeding

### Important Issues

**Examples:** Missing tests, poor error handling, unclear code

**Action:** Should fix before proceeding

### Minor Issues

**Examples:** Style inconsistency, minor refactoring opportunity

**Action:** Note for later, optional fix

## Integration with Other Skills

### With test-driven-development

Subagent should:
1. Write failing test first
2. Implement minimal code
3. Verify test passes
4. Commit

### With systematic-debugging

If subagent encounters bugs:
1. Pause implementation
2. Debug systematically
3. Fix root cause
4. Resume

### With writing-plans

This skill EXECUTES plans created by writing-plans skill.

**Sequence:** brainstorming → writing-plans → subagent-driven-development

### With requesting-code-review

After subagent completes task, use requesting-code-review skill for final validation.

## Common Patterns

### Pattern: Fresh Subagent Per Task

**Why:** Prevents context pollution
**How:** New delegate_task for each task
**Result:** Each subagent has clean context

### Pattern: Two-Stage Review

**Why:** Catch issues early, ensure quality
**How:** Spec review → quality review
**Result:** High-quality, spec-compliant code

### Pattern: Frequent Checkpoints

**Why:** Catch issues before they compound
**How:** Review after each task
**Result:** Issues don't cascade

## Best Practices

1. **Clear Task Boundaries**
   - One task = one focused change
   - Independent where possible
   - Clear success criteria

2. **Complete Context**
   - Provide all needed files
   - Explain project conventions
   - Share relevant examples

3. **Review Discipline**
   - Don't skip spec review
   - Address critical issues immediately
   - Keep quality bar consistent

4. **Communication**
   - Answer subagent questions quickly
   - Clarify when needed
   - Provide feedback on reviews

## Example Workflow

```markdown
User: Implement user authentication

You: I'll use subagent-driven development. Let me create a plan first.
[Uses writing-plans skill]

Plan created with 5 tasks:
1. Create User model
2. Add password hashing
3. Implement login endpoint
4. Add JWT middleware
5. Create registration endpoint

--- Task 1 ---
[Dispatch implementer subagent for User model]
[Subagent asks: "Should email be unique?"]
You: Yes, email must be unique
[Subagent implements]
[Dispatch spec reviewer - PASS]
[Dispatch quality reviewer - APPROVED]
Task 1 complete

--- Task 2 ---
[Dispatch implementer for password hashing]
...

[After all tasks]
[Final review]
[Merge branch]
```

## Remember

```
Fresh subagent per task
Two-stage review every time
Spec compliance first
Code quality second
Never skip reviews
Catch issues early
```

**Quality is not an accident. It's the result of systematic process.**

443 lines  skills/software-development/systematic-debugging/SKILL.md  (new file)

@@ -0,0 +1,443 @@
---
name: systematic-debugging
description: Use when encountering any bug, test failure, or unexpected behavior. 4-phase root cause investigation process - NO fixes without understanding the problem first.
version: 1.0.0
author: Hermes Agent (adapted from Superpowers)
license: MIT
metadata:
  hermes:
    tags: [debugging, troubleshooting, problem-solving, root-cause, investigation]
    related_skills: [test-driven-development, writing-plans, subagent-driven-development]
---

# Systematic Debugging

## Overview

Random fixes waste time and create new bugs. Quick patches mask underlying issues.

**Core principle:** ALWAYS find root cause before attempting fixes. Symptom fixes are failure.

**Violating the letter of this process is violating the spirit of debugging.**

## The Iron Law

```
NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST
```

If you haven't completed Phase 1, you cannot propose fixes.

## When to Use

Use for ANY technical issue:
- Test failures
- Bugs in production
- Unexpected behavior
- Performance problems
- Build failures
- Integration issues

**Use this ESPECIALLY when:**
- Under time pressure (emergencies make guessing tempting)
- "Just one quick fix" seems obvious
- You've already tried multiple fixes
- Previous fix didn't work
- You don't fully understand the issue

**Don't skip when:**
- Issue seems simple (simple bugs have root causes too)
- You're in a hurry (rushing guarantees rework)
- Manager wants it fixed NOW (systematic is faster than thrashing)
## The Four Phases

You MUST complete each phase before proceeding to the next.

---

## Phase 1: Root Cause Investigation

**BEFORE attempting ANY fix:**

### 1. Read Error Messages Carefully

- Don't skip past errors or warnings
- They often contain the exact solution
- Read stack traces completely
- Note line numbers, file paths, error codes

**Action:** Copy full error message to your notes.

### 2. Reproduce Consistently

- Can you trigger it reliably?
- What are the exact steps?
- Does it happen every time?
- If not reproducible → gather more data, don't guess

**Action:** Write down exact reproduction steps.

### 3. Check Recent Changes

- What changed that could cause this?
- Git diff, recent commits
- New dependencies, config changes
- Environmental differences

**Commands:**
```bash
# Recent commits
git log --oneline -10

# Uncommitted changes
git diff

# Changes in specific file
git log -p --follow src/problematic_file.py
```

### 4. Gather Evidence in Multi-Component Systems

**WHEN the system has multiple components (CI pipeline, API service, database layer):**

**BEFORE proposing fixes, add diagnostic instrumentation:**

For EACH component boundary:
- Log what data enters the component
- Log what data exits the component
- Verify environment/config propagation
- Check state at each layer

Run once to gather evidence showing WHERE it breaks,
THEN analyze the evidence to identify the failing component,
THEN investigate that specific component.

**Example (multi-layer system):**
```python
# Layer 1: Entry point
def entry_point(input_data):
    print(f"DEBUG: Input received: {input_data}")
    result = process_layer2(input_data)
    print(f"DEBUG: Layer 2 output: {result}")
    return result

# Layer 2: Processing
def process_layer2(data):
    print(f"DEBUG: Layer 2 received: {data}")
    result = transform(data)  # ... actual processing here ...
    print(f"DEBUG: Layer 2 returning: {result}")
    return result
```

**Action:** Add logging, run once, analyze output.

### 5. Isolate the Problem

- Comment out code until problem disappears
- Binary search through recent changes
- Create minimal reproduction case
- Test with fresh environment

**Action:** Create minimal reproduction case.
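The "binary search through recent changes" step above is a first-bad-version search; a sketch over an ordered commit list (in practice `git bisect` does this for you, and the commit names and `is_bad` predicate here are hypothetical):

```python
# First-bad-commit search: commits are ordered oldest -> newest, and once a
# commit is bad, every later commit is bad too. O(log n) checks instead of n.

def first_bad(commits, is_bad):
    """Return the first commit for which is_bad(commit) is True, else None."""
    lo, hi = 0, len(commits)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid          # first bad commit is at mid or earlier
        else:
            lo = mid + 1      # first bad commit is after mid
    return commits[lo] if lo < len(commits) else None

commits = ["c1", "c2", "c3", "c4", "c5"]
bad = {"c4", "c5"}                      # regression introduced in c4
assert first_bad(commits, lambda c: c in bad) == "c4"
```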

### Phase 1 Completion Checklist

- [ ] Error messages fully read and understood
- [ ] Issue reproduced consistently
- [ ] Recent changes identified and reviewed
- [ ] Evidence gathered (logs, state)
- [ ] Problem isolated to specific component/code
- [ ] Root cause hypothesis formed

**STOP:** Do not proceed to Phase 2 until you understand WHY it's happening.

---

## Phase 2: Solution Design

**Given the root cause, design the fix:**

### 1. Understand the Fix Area

- Read relevant code thoroughly
- Understand data flow
- Identify affected components
- Check for similar issues elsewhere

**Action:** Read all relevant code files.

### 2. Design Minimal Fix

- Smallest change that fixes root cause
- Avoid scope creep
- Don't refactor while fixing
- Fix one issue at a time

**Action:** Write down the exact fix before coding.

### 3. Consider Side Effects

- What else could this change affect?
- Are there dependencies?
- Will this break other functionality?

**Action:** Identify potential side effects.

### Phase 2 Completion Checklist

- [ ] Fix area code fully understood
- [ ] Minimal fix designed
- [ ] Side effects identified
- [ ] Fix approach documented

---

## Phase 3: Implementation

**Make the fix:**

### 1. Write Test First (if possible)

```python
def test_should_handle_empty_input():
    """Regression test for bug #123"""
    result = process_data("")
    assert result == ""
```

### 2. Implement Fix

```python
# Before (buggy)
def process_data(data):
    return data.split(",")[0]

# After (fixed): guard against empty/None input
def process_data(data):
    if not data:
        return ""
    return data.split(",")[0]
```
|
||||||
|
|
||||||
|
### 3. Verify Fix
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Run the specific test
|
||||||
|
pytest tests/test_data.py::test_should_handle_empty_input -v
|
||||||
|
|
||||||
|
# Run all tests to check for regressions
|
||||||
|
pytest
|
||||||
|
```
|
||||||
|
|
||||||
|
### Phase 3 Completion Checklist
|
||||||
|
|
||||||
|
- [ ] Test written that reproduces the bug
|
||||||
|
- [ ] Minimal fix implemented
|
||||||
|
- [ ] Test passes
|
||||||
|
- [ ] No regressions introduced
|
||||||
|
|
||||||
|
---

## Phase 4: Verification

**Confirm it's actually fixed:**

### 1. Reproduce Original Issue

- Follow original reproduction steps
- Verify issue is resolved
- Test edge cases

### 2. Regression Testing

```bash
# Full test suite
pytest

# Integration tests
pytest tests/integration/

# Check related areas
pytest -k "related_feature"
```

### 3. Monitor After Deploy

- Watch logs for related errors
- Check metrics
- Verify fix in production

### Phase 4 Completion Checklist

- [ ] Original issue cannot be reproduced
- [ ] All tests pass
- [ ] No new warnings/errors
- [ ] Fix documented (commit message, comments)

---

## Debugging Techniques

### Root Cause Tracing

Ask "why" 5 times:
1. Why did it fail? → Null pointer
2. Why was it null? → Function returned null
3. Why did the function return null? → Missing validation
4. Why was validation missing? → Assumed input always valid
5. Why was that assumption wrong? → API changed

**Root cause:** API change not accounted for

### Defense in Depth

Don't fix just the symptom:

**Bad:** Add null check at crash site
**Good:**
1. Add validation at API boundary
2. Add null check at crash site
3. Add test for both
4. Document API behavior
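
The layered approach can be sketched in Python. This is a minimal illustration, not code from this skill; `parse_user` and `greet` are hypothetical names standing in for the API boundary and the crash site:

```python
from typing import Optional

def parse_user(payload: Optional[dict]) -> dict:
    """Layer 1 - API boundary: reject bad data before it spreads."""
    if payload is None or "name" not in payload:
        raise ValueError(f"Invalid user payload: {payload!r}")
    return payload

def greet(user: Optional[dict]) -> str:
    """Layer 2 - crash site: keep a local guard as a second line of defense."""
    if not user:
        return "Hello, guest"
    return f"Hello, {user['name']}"

# Layer 3 - tests for both layers
def test_boundary_rejects_bad_payload():
    try:
        parse_user(None)
        assert False, "expected ValueError"
    except ValueError:
        pass

def test_crash_site_handles_missing_user():
    assert greet(None) == "Hello, guest"
```

Each layer catches what the one before it misses, so a regression in any single layer doesn't reintroduce the crash.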

### Condition-Based Waiting

For timing/race conditions:

```python
# Bad - arbitrary sleep
import time
time.sleep(5)  # "Should be enough"

# Good - wait for condition
import requests
from tenacity import retry, wait_exponential, stop_after_attempt

@retry(wait=wait_exponential(multiplier=1, min=4, max=10),
       stop=stop_after_attempt(5))
def wait_for_service():
    response = requests.get(health_url)  # health_url: your service's health-check endpoint
    assert response.status_code == 200
```

---

## Common Debugging Pitfalls

### Fix Without Understanding

**Symptom:** "Just add a try/catch"
**Problem:** Masks the real issue
**Solution:** Complete Phase 1 before any fix

### Shotgun Debugging

**Symptom:** Change 5 things at once
**Problem:** Don't know what fixed it
**Solution:** One change at a time, verify each

### Premature Optimization

**Symptom:** Rewrite while debugging
**Problem:** Introduces new bugs
**Solution:** Fix first, refactor later

### Assuming Environment

**Symptom:** "Works on my machine"
**Problem:** Environment differences
**Solution:** Check environment variables, versions, configs

---

## Language-Specific Tools

### Python

```python
# Add debugger
import pdb; pdb.set_trace()

# Or use ipdb for a better experience
import ipdb; ipdb.set_trace()

# Log state
import logging
logging.debug(f"Variable state: {variable}")

# Stack trace
import traceback
traceback.print_exc()
```

### JavaScript/TypeScript

```javascript
// Debugger
debugger;

// Console with context
console.log("State:", { var1, var2, var3 });

// Stack trace
console.trace("Here");

// Error with context
throw new Error(`Failed with input: ${JSON.stringify(input)}`);
```

### Go

```go
// Print state
fmt.Printf("Debug: variable=%+v\n", variable)

// Stack trace
import "runtime/debug"
debug.PrintStack()

// Panic with context
if err != nil {
    panic(fmt.Sprintf("unexpected error: %v", err))
}
```

---

## Integration with Other Skills

### With test-driven-development

When debugging:
1. Write test that reproduces bug
2. Debug systematically
3. Fix root cause
4. Test passes

### With writing-plans

Include debugging tasks in plans:
- "Add diagnostic logging"
- "Create reproduction test"
- "Verify fix resolves issue"

### With subagent-driven-development

If subagent gets stuck:
1. Switch to systematic debugging
2. Analyze root cause
3. Provide findings to subagent
4. Resume implementation

---

## Remember

```
PHASE 1: Investigate → Understand WHY
PHASE 2: Design → Plan the fix
PHASE 3: Implement → Make the fix
PHASE 4: Verify → Confirm it's fixed
```

**No shortcuts. No guessing. Systematic always wins.**
384
skills/software-development/test-driven-development/SKILL.md
Normal file

@ -0,0 +1,384 @@
---
name: test-driven-development
description: Use when implementing any feature or bugfix, before writing implementation code. Enforces RED-GREEN-REFACTOR cycle with test-first approach.
version: 1.0.0
author: Hermes Agent (adapted from Superpowers)
license: MIT
metadata:
  hermes:
    tags: [testing, tdd, development, quality, red-green-refactor]
    related_skills: [systematic-debugging, writing-plans, subagent-driven-development]
---

# Test-Driven Development (TDD)

## Overview

Write the test first. Watch it fail. Write minimal code to pass.

**Core principle:** If you didn't watch the test fail, you don't know if it tests the right thing.

**Violating the letter of the rules is violating the spirit of the rules.**

## When to Use

**Always:**
- New features
- Bug fixes
- Refactoring
- Behavior changes

**Exceptions (ask your human partner):**
- Throwaway prototypes
- Generated code
- Configuration files

Thinking "skip TDD just this once"? Stop. That's rationalization.

## The Iron Law

```
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
```

Write code before the test? Delete it. Start over.

**No exceptions:**
- Don't keep it as "reference"
- Don't "adapt" it while writing tests
- Don't look at it
- Delete means delete

Implement fresh from tests. Period.
## Red-Green-Refactor Cycle

### RED - Write Failing Test

Write one minimal test showing what should happen.

**Good Example:**
```python
def test_retries_failed_operations_3_times():
    attempts = 0
    def operation():
        nonlocal attempts
        attempts += 1
        if attempts < 3:
            raise Exception('fail')
        return 'success'

    result = retry_operation(operation)

    assert result == 'success'
    assert attempts == 3
```
- Clear name, tests real behavior, one thing

**Bad Example:**
```python
def test_retry_works():
    mock = MagicMock()
    mock.side_effect = [Exception(), Exception(), 'success']

    result = retry_operation(mock)

    assert result == 'success'  # What about retry count? Timing?
```
- Vague name, mocks behavior not reality, unclear what it tests

### Verify RED

Run the test. It MUST fail.

**If it passes:**
- Test is wrong (not testing what you think)
- Code already exists (delete it, start over)
- Wrong test file running

**What to check:**
- Error message makes sense
- Fails for expected reason
- Stack trace points to right place

### GREEN - Minimal Code

Write just enough code to pass. Nothing more.

**Good:**
```python
def add(a, b):
    return a + b  # Nothing extra
```

**Bad:**
```python
def add(a, b):
    result = a + b
    logging.info(f"Adding {a} + {b} = {result}")  # Extra!
    return result
```

Cheating is OK in GREEN:
- Hardcode return values
- Copy-paste
- Duplicate code
- Skip edge cases

**We'll fix it in refactor.**

### Verify GREEN

Run tests. All must pass.

**If a test fails:**
- Fix the minimal code
- Don't expand scope
- Stay in GREEN

### REFACTOR - Clean Up

Now improve the code while keeping tests green.

**Safe refactorings:**
- Rename variables/functions
- Extract helper functions
- Remove duplication
- Simplify expressions
- Improve readability

**Golden rule:** Tests stay green throughout.

**If tests fail during refactor:**
- Undo immediately
- Take smaller refactoring steps
- Check you didn't change behavior
## Implementation Workflow

### 1. Create Test File

```bash
# Python
touch tests/test_feature.py

# JavaScript
touch tests/feature.test.js

# Rust
touch tests/feature_tests.rs
```

### 2. Write First Failing Test

```python
# tests/test_calculator.py
def test_adds_two_numbers():
    calc = Calculator()
    result = calc.add(2, 3)
    assert result == 5
```

### 3. Run and Verify Failure

```bash
pytest tests/test_calculator.py -v
# Expected: FAIL - Calculator not defined
```

### 4. Write Minimal Implementation

```python
# src/calculator.py
class Calculator:
    def add(self, a, b):
        return a + b  # Minimal!
```

### 5. Run and Verify Pass

```bash
pytest tests/test_calculator.py -v
# Expected: PASS
```

### 6. Commit

```bash
git add tests/test_calculator.py src/calculator.py
git commit -m "feat: add calculator with add method"
```

### 7. Next Test

```python
def test_adds_negative_numbers():
    calc = Calculator()
    result = calc.add(-2, -3)
    assert result == -5
```

Repeat cycle.
## Testing Anti-Patterns

### Mocking What You Don't Own

**Bad:** Mock database, HTTP client, file system
**Good:** Abstract behind interface, test interface
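
A hedged sketch of the alternative: define a small interface you own and test against an in-memory fake instead of mocking a database driver. All names here are illustrative:

```python
from typing import Protocol

class UserStore(Protocol):
    """Interface you own; production code depends only on this."""
    def get(self, user_id: int) -> dict: ...

class FakeUserStore:
    """In-memory fake used in tests instead of mocking a real database."""
    def __init__(self, users):
        self.users = users
    def get(self, user_id: int) -> dict:
        return self.users[user_id]

def display_name(store: UserStore, user_id: int) -> str:
    user = store.get(user_id)
    return user.get("name", "unknown")

def test_display_name_uses_store():
    store = FakeUserStore({1: {"name": "Ada"}})
    assert display_name(store, 1) == "Ada"
```

The production implementation of `UserStore` wraps the real database; tests never touch third-party internals, so they don't break when that dependency changes.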

### Testing Implementation Details

**Bad:** Test that function was called
**Good:** Test the result/behavior

### Happy Path Only

**Bad:** Only test expected inputs
**Good:** Test edge cases, errors, boundaries
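
For example, edge cases can be enumerated in a table-driven test (with pytest you would typically use `@pytest.mark.parametrize` for the same effect); `normalize` is a hypothetical function for illustration:

```python
def normalize(text):
    """Trim whitespace and lowercase; tolerate None."""
    if text is None:
        return ""
    return text.strip().lower()

# Happy path AND boundaries, covered in one table
CASES = [
    ("Hello", "hello"),      # happy path
    ("  Hello  ", "hello"),  # boundary: surrounding whitespace
    ("", ""),                # edge: empty string
    (None, ""),              # degenerate input: None
]

def test_normalize_edge_cases():
    for raw, expected in CASES:
        assert normalize(raw) == expected
```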

### Brittle Tests

**Bad:** Test breaks when refactoring
**Good:** Tests verify behavior, not structure

## Common Pitfalls

### "I'll Write Tests After"

No, you won't. Write them first.

### "This is Too Simple to Test"

Simple bugs cause complex problems. Test everything.

### "I Need to See It Work First"

Temporary code becomes permanent. Test first.

### "Tests Take Too Long"

Untested code takes longer. Invest in tests.
## Language-Specific Commands

### Python (pytest)

```bash
# Run all tests
pytest

# Run specific test
pytest tests/test_feature.py::test_name -v

# Run with coverage
pytest --cov=src --cov-report=term-missing

# Watch mode
pytest-watch
```

### JavaScript (Jest)

```bash
# Run all tests
npm test

# Run specific test
npm test -- test_name

# Watch mode
npm test -- --watch

# Coverage
npm test -- --coverage
```

### TypeScript (Jest with ts-jest)

```bash
# Run tests
npx jest

# Run specific file
npx jest tests/feature.test.ts
```

### Go

```bash
# Run all tests
go test ./...

# Run specific test
go test -run TestName

# Verbose
go test -v

# Coverage
go test -cover
```

### Rust

```bash
# Run tests
cargo test

# Run specific test
cargo test test_name

# Show output
cargo test -- --nocapture
```
## Integration with Other Skills

### With writing-plans

Every plan task should specify:
- What test to write
- Expected test failure
- Minimal implementation
- Refactoring opportunities

### With systematic-debugging

When fixing bugs:
1. Write test that reproduces bug
2. Verify test fails (RED)
3. Fix bug (GREEN)
4. Refactor if needed

### With subagent-driven-development

Subagent implements one test at a time:
1. Write failing test
2. Minimal code to pass
3. Commit
4. Next test

## Success Indicators

**You're doing TDD right when:**
- Tests fail before code exists
- You write <10 lines between test runs
- Refactoring feels safe
- Bugs are caught immediately
- Code is simpler than expected

**Red flags:**
- Writing 50+ lines without running tests
- Tests always pass
- Fear of refactoring
- "I'll test later"

## Remember

1. **RED** - Write failing test
2. **GREEN** - Minimal code to pass
3. **REFACTOR** - Clean up safely
4. **REPEAT** - Next behavior

**If you didn't see it fail, it doesn't test what you think.**
453
skills/software-development/writing-plans/SKILL.md
Normal file

@ -0,0 +1,453 @@
---
name: writing-plans
description: Use when you have a spec or requirements for a multi-step task. Creates comprehensive implementation plans with bite-sized tasks, exact file paths, and complete code examples.
version: 1.0.0
author: Hermes Agent (adapted from Superpowers)
license: MIT
metadata:
  hermes:
    tags: [planning, design, implementation, workflow, documentation]
    related_skills: [subagent-driven-development, test-driven-development, requesting-code-review]
---

# Writing Implementation Plans

## Overview

Transform specifications into actionable implementation plans. Write comprehensive plans that any developer can follow - even with zero context about your codebase.

**Core principle:** Document everything: exact file paths, complete code, test commands, verification steps.

**Assume the implementer:**
- Is a skilled developer
- Knows almost nothing about your codebase
- Has questionable taste in code style
- Needs explicit guidance

## When to Use

**Always use this skill:**
- Before implementing multi-step features
- After design approval (from brainstorming)
- When breaking down complex requirements
- Before delegating to subagents

**Don't skip when:**
- Feature seems simple (assumptions cause bugs)
- You plan to implement yourself (future you needs guidance)
- Working alone (documentation matters)
## Plan Document Structure

### Header (Required)

Every plan MUST start with:

````markdown
# [Feature Name] Implementation Plan

> **For Hermes:** Use subagent-driven-development or executing-plans skill to implement this plan.

**Goal:** One sentence describing what this builds

**Architecture:** 2-3 sentences about approach

**Tech Stack:** Key technologies/libraries

---
````

### Task Structure

Each task follows this format:

````markdown
### Task N: [Descriptive Name]

**Objective:** What this task accomplishes (one sentence)

**Files:**
- Create: `exact/path/to/new_file.py`
- Modify: `exact/path/to/existing.py:45-67` (line numbers if known)
- Test: `tests/path/to/test_file.py`

**Implementation Steps:**

**Step 1: [Action description]**
```python
# Complete code to write
class NewClass:
    def __init__(self):
        self.value = None

    def process(self, input):
        return input.upper()
```

**Step 2: [Action description]**
```bash
# Command to run
pytest tests/test_new.py -v
```
Expected: Tests pass with 3 green dots

**Step 3: [Action description]**
```python
# More code if needed
```

**Verification:**
- [ ] Test passes
- [ ] Function returns expected output
- [ ] No syntax errors

**Commit:**
```bash
git add src/new_file.py tests/test_new.py
git commit -m "feat: add new feature component"
```
````
## Task Granularity

**Each task = 2-5 minutes of work**

**Break down:**
- "Write failing test" - one task
- "Run test to verify it fails" - one task
- "Implement minimal code" - one task
- "Run test to verify pass" - one task
- "Commit" - one task

**Examples:**

**Too big:**
```markdown
### Task 1: Build authentication system
[50 lines of code across 5 files]
```

**Right size:**
```markdown
### Task 1: Create User model with email field
[10 lines, 1 file]

### Task 2: Add password hash field to User
[8 lines, 1 file]

### Task 3: Create password hashing utility
[15 lines, 1 file]
```
## Principles

### DRY (Don't Repeat Yourself)

**Bad:** Copy-paste validation in 3 places
**Good:** Extract validation function, use everywhere

```python
# Good - DRY
import re

def validate_email(email):
    if not re.match(r'^[^@]+@[^@]+$', email):
        raise ValueError("Invalid email")
    return email

# Use everywhere
validate_email(user_input)
validate_email(config_email)
validate_email(imported_data)
```

### YAGNI (You Aren't Gonna Need It)

**Bad:** Add "flexibility" for future requirements
**Good:** Implement only what's needed now

```python
# Bad - YAGNI violation
class User:
    def __init__(self, name, email):
        self.name = name
        self.email = email
        self.preferences = {}  # Not needed yet!
        self.metadata = {}     # Not needed yet!
        self.settings = {}     # Not needed yet!

# Good - YAGNI
class User:
    def __init__(self, name, email):
        self.name = name
        self.email = email
```

### TDD (Test-Driven Development)

Every task that produces code should include:
1. Write failing test
2. Run to verify failure
3. Write minimal code
4. Run to verify pass

See the `test-driven-development` skill for details.

### Frequent Commits

Commit after every task:
```bash
git add [files]
git commit -m "type: description"
```
## Writing Process

### Step 1: Understand Requirements

Read and understand:
- Feature requirements
- Design documents
- Acceptance criteria
- Constraints

### Step 2: Explore Codebase

```bash
# Understand project structure
find src -type f -name "*.py" | head -20

# Look at similar features
grep -r "similar_pattern" src/

# Check existing tests
ls tests/
```

### Step 3: Design Approach

Decide:
- Architecture pattern
- File organization
- Dependencies needed
- Testing strategy

### Step 4: Write Tasks

Create tasks in order:
1. Setup/infrastructure
2. Core functionality
3. Edge cases
4. Integration
5. Cleanup

### Step 5: Add Details

For each task, add:
- Exact file paths
- Complete code examples
- Exact commands
- Expected outputs
- Verification steps

### Step 6: Review Plan

Check:
- [ ] Tasks are sequential and logical
- [ ] Each task is bite-sized (2-5 min)
- [ ] File paths are exact
- [ ] Code examples are complete
- [ ] Commands are exact with expected output
- [ ] No missing context

### Step 7: Save Plan

```bash
# Create plans directory
mkdir -p docs/plans

# Save plan
cat > docs/plans/YYYY-MM-DD-feature-name.md << 'EOF'
[plan content]
EOF

# Commit plan
git add docs/plans/YYYY-MM-DD-feature-name.md
git commit -m "docs: add implementation plan for feature"
```
## Example Plan

````markdown
# User Authentication Implementation Plan

> **For Hermes:** Use subagent-driven-development to implement this plan.

**Goal:** Add JWT-based user authentication to the Flask API

**Architecture:** Use PyJWT for tokens, bcrypt for hashing. Middleware validates tokens on protected routes.

**Tech Stack:** Python, Flask, PyJWT, bcrypt

---

### Task 1: Create User model

**Objective:** Define User model with email and hashed password

**Files:**
- Create: `src/models/user.py`
- Test: `tests/models/test_user.py`

**Step 1: Write failing test**
```python
def test_user_creation():
    user = User(email="test@example.com", password="secret123")
    assert user.email == "test@example.com"
    assert user.password_hash is not None
    assert user.password_hash != "secret123"  # Should be hashed
```

**Step 2: Run to verify failure**
```bash
pytest tests/models/test_user.py -v
```
Expected: FAIL - User class not defined

**Step 3: Implement User model**
```python
import bcrypt

class User:
    def __init__(self, email, password):
        self.email = email
        self.password_hash = bcrypt.hashpw(
            password.encode(),
            bcrypt.gensalt()
        )
```

**Step 4: Run to verify pass**
```bash
pytest tests/models/test_user.py -v
```
Expected: PASS

**Commit:**
```bash
git add src/models/user.py tests/models/test_user.py
git commit -m "feat: add User model with password hashing"
```

### Task 2: Create login endpoint

**Objective:** Add POST /login endpoint that returns JWT

**Files:**
- Modify: `src/app.py`
- Test: `tests/test_login.py`

[Continue...]
````
## Common Mistakes

### Vague Tasks

**Bad:**
```markdown
### Task 1: Add authentication
```

**Good:**
```markdown
### Task 1: Create User model with email and password_hash fields
```

### Incomplete Code

**Bad:**
```markdown
Step 1: Add validation function
```

**Good:**
````markdown
Step 1: Add validation function
```python
def validate_email(email):
    """Validate email format."""
    import re
    pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
    if not re.match(pattern, email):
        raise ValueError(f"Invalid email: {email}")
    return email
```
````

### Missing Verification

**Bad:**
```markdown
Step 3: Test it works
```

**Good:**
````markdown
Step 3: Verify authentication
```bash
curl -X POST http://localhost:5000/login \
  -H "Content-Type: application/json" \
  -d '{"email":"test@example.com","password":"secret123"}'
```
Expected: Returns 200 with JWT token in response
````
## Integration with Other Skills

### With brainstorming

**Sequence:**
1. brainstorming → Explore and refine design
2. writing-plans → Create implementation plan
3. subagent-driven-development → Execute plan

### With subagent-driven-development

Plans feed into subagent-driven-development:
- Subagents implement each task
- Two-stage review ensures quality
- Plan provides context and requirements

### With test-driven-development

Every code-producing task should follow TDD:
1. Write failing test
2. Verify failure
3. Write minimal code
4. Verify pass
## Success Checklist

Before considering a plan complete:

- [ ] Header with goal, architecture, tech stack
- [ ] All tasks are bite-sized (2-5 min each)
- [ ] Exact file paths for every file
- [ ] Complete code examples (not partial)
- [ ] Exact commands with expected output
- [ ] Verification steps for each task
- [ ] Commit commands included
- [ ] DRY, YAGNI, TDD principles applied
- [ ] Tasks are sequential and logical
- [ ] Plan saved to docs/plans/

## Remember

```
Bite-sized tasks
Exact file paths
Complete code
Exact commands
Verification steps
DRY, YAGNI, TDD
```

**A good plan makes implementation obvious.**