mirror of https://github.com/anthropics/claude-code.git synced 2025-11-28 08:40:27 +08:00

Daisy S. Hollman 387dc35db7

feat: Add plugin-dev toolkit for comprehensive plugin development

Adds the plugin-dev plugin to public marketplace. A comprehensive toolkit for
developing Claude Code plugins with 7 expert skills, 3 AI-assisted agents, and
extensive documentation covering the complete plugin development lifecycle.

Key features:
- 7 skills: hook-development, mcp-integration, plugin-structure, plugin-settings,
  command-development, agent-development, skill-development
- 3 agents: agent-creator (AI-assisted generation), plugin-validator (structure
  validation), skill-reviewer (quality review)
- 1 command: /plugin-dev:create-plugin (guided 8-phase workflow)
- 10 utility scripts for validation and testing
- 21 reference docs with deep-dive guidance (~11k words)
- 9 working examples demonstrating best practices

Changes for public release:
- Replaced all references to internal repositories with "Claude Code"
- Updated MCP examples: internal.company.com → api.example.com
- Updated token variables: ${INTERNAL_TOKEN} → ${API_TOKEN}
- Reframed agent-creation-system-prompt as "proven in production"
- Preserved all ${CLAUDE_PLUGIN_ROOT} references (186 total)
- Preserved valuable test blocks in core modules

Validation:
- All 3 agents validated successfully with validate-agent.sh
- All JSON files validated with jq
- Zero internal references remaining
- 59 files migrated, 21,971 lines added

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-11-17 04:09:00 -08:00

14 KiB

Raw Permalink Blame History

Command Testing Strategies

Comprehensive strategies for testing slash commands before deployment and distribution.

Overview

Testing commands ensures they work correctly, handle edge cases, and provide good user experience. A systematic testing approach catches issues early and builds confidence in command reliability.

Testing Levels

Level 1: Syntax and Structure Validation

What to test:

YAML frontmatter syntax
Markdown format
File location and naming

How to test:

# Validate YAML frontmatter
head -n 20 .claude/commands/my-command.md | grep -A 10 "^---"

# Check for closing frontmatter marker
head -n 20 .claude/commands/my-command.md | grep -c "^---" # Should be 2

# Verify file has .md extension
ls .claude/commands/*.md

# Check file is in correct location
test -f .claude/commands/my-command.md && echo "Found" || echo "Missing"

Automated validation script:

#!/bin/bash
# validate-command.sh

COMMAND_FILE="$1"

if [ ! -f "$COMMAND_FILE" ]; then
  echo "ERROR: File not found: $COMMAND_FILE"
  exit 1
fi

# Check .md extension
if [[ ! "$COMMAND_FILE" =~ \.md$ ]]; then
  echo "ERROR: File must have .md extension"
  exit 1
fi

# Validate YAML frontmatter if present
if head -n 1 "$COMMAND_FILE" | grep -q "^---"; then
  # Count frontmatter markers
  MARKERS=$(head -n 50 "$COMMAND_FILE" | grep -c "^---")
  if [ "$MARKERS" -ne 2 ]; then
    echo "ERROR: Invalid YAML frontmatter (need exactly 2 '---' markers)"
    exit 1
  fi
  echo "✓ YAML frontmatter syntax valid"
fi

# Check for empty file
if [ ! -s "$COMMAND_FILE" ]; then
  echo "ERROR: File is empty"
  exit 1
fi

echo "✓ Command file structure valid"

Level 2: Frontmatter Field Validation

What to test:

Field types correct
Values in valid ranges
Required fields present (if any)

Validation script:

#!/bin/bash
# validate-frontmatter.sh

COMMAND_FILE="$1"

# Extract YAML frontmatter
FRONTMATTER=$(sed -n '/^---$/,/^---$/p' "$COMMAND_FILE" | sed '1d;$d')

if [ -z "$FRONTMATTER" ]; then
  echo "No frontmatter to validate"
  exit 0
fi

# Check 'model' field if present
if echo "$FRONTMATTER" | grep -q "^model:"; then
  MODEL=$(echo "$FRONTMATTER" | grep "^model:" | cut -d: -f2 | tr -d ' ')
  if ! echo "sonnet opus haiku" | grep -qw "$MODEL"; then
    echo "ERROR: Invalid model '$MODEL' (must be sonnet, opus, or haiku)"
    exit 1
  fi
  echo "✓ Model field valid: $MODEL"
fi

# Check 'allowed-tools' field format
if echo "$FRONTMATTER" | grep -q "^allowed-tools:"; then
  echo "✓ allowed-tools field present"
  # Could add more sophisticated validation here
fi

# Check 'description' length
if echo "$FRONTMATTER" | grep -q "^description:"; then
  DESC=$(echo "$FRONTMATTER" | grep "^description:" | cut -d: -f2-)
  LENGTH=${#DESC}
  if [ "$LENGTH" -gt 80 ]; then
    echo "WARNING: Description length $LENGTH (recommend < 60 chars)"
  else
    echo "✓ Description length acceptable: $LENGTH chars"
  fi
fi

echo "✓ Frontmatter fields valid"

Level 3: Manual Command Invocation

What to test:

Command appears in /help
Command executes without errors
Output is as expected

Test procedure:

# 1. Start Claude Code
claude --debug

# 2. Check command appears in help
> /help
# Look for your command in the list

# 3. Invoke command without arguments
> /my-command
# Check for reasonable error or behavior

# 4. Invoke with valid arguments
> /my-command arg1 arg2
# Verify expected behavior

# 5. Check debug logs
tail -f ~/.claude/debug-logs/latest
# Look for errors or warnings

Level 4: Argument Testing

What to test:

Positional arguments work ($1, $2, etc.)
$ARGUMENTS captures all arguments
Missing arguments handled gracefully
Invalid arguments detected

Test matrix:

Test Case	Command	Expected Result
No args	`/cmd`	Graceful handling or useful message
One arg	`/cmd arg1`	$1 substituted correctly
Two args	`/cmd arg1 arg2`	$1 and $2 substituted
Extra args	`/cmd a b c d`	All captured or extras ignored appropriately
Special chars	`/cmd "arg with spaces"`	Quotes handled correctly
Empty arg	`/cmd ""`	Empty string handled

Test script:

#!/bin/bash
# test-command-arguments.sh

COMMAND="$1"

echo "Testing argument handling for /$COMMAND"
echo

echo "Test 1: No arguments"
echo "  Command: /$COMMAND"
echo "  Expected: [describe expected behavior]"
echo "  Manual test required"
echo

echo "Test 2: Single argument"
echo "  Command: /$COMMAND test-value"
echo "  Expected: 'test-value' appears in output"
echo "  Manual test required"
echo

echo "Test 3: Multiple arguments"
echo "  Command: /$COMMAND arg1 arg2 arg3"
echo "  Expected: All arguments used appropriately"
echo "  Manual test required"
echo

echo "Test 4: Special characters"
echo "  Command: /$COMMAND \"value with spaces\""
echo "  Expected: Entire phrase captured"
echo "  Manual test required"

Level 5: File Reference Testing

What to test:

@ syntax loads file contents
Non-existent files handled
Large files handled appropriately
Multiple file references work

Test procedure:

# Create test files
echo "Test content" > /tmp/test-file.txt
echo "Second file" > /tmp/test-file-2.txt

# Test single file reference
> /my-command /tmp/test-file.txt
# Verify file content is read

# Test non-existent file
> /my-command /tmp/nonexistent.txt
# Verify graceful error handling

# Test multiple files
> /my-command /tmp/test-file.txt /tmp/test-file-2.txt
# Verify both files processed

# Test large file
dd if=/dev/zero of=/tmp/large-file.bin bs=1M count=100
> /my-command /tmp/large-file.bin
# Verify reasonable behavior (may truncate or warn)

# Cleanup
rm /tmp/test-file*.txt /tmp/large-file.bin

Level 6: Bash Execution Testing

What to test:

!` commands execute correctly
Command output included in prompt
Command failures handled
Security: only allowed commands run

Test procedure:

# Create test command with bash execution
cat > .claude/commands/test-bash.md << 'EOF'
---
description: Test bash execution
allowed-tools: Bash(echo:*), Bash(date:*)
---

Current date: !`date`
Test output: !`echo "Hello from bash"`

Analysis of output above...
EOF

# Test in Claude Code
> /test-bash
# Verify:
# 1. Date appears correctly
# 2. Echo output appears
# 3. No errors in debug logs

# Test with disallowed command (should fail or be blocked)
cat > .claude/commands/test-forbidden.md << 'EOF'
---
description: Test forbidden command
allowed-tools: Bash(echo:*)
---

Trying forbidden: !`ls -la /`
EOF

> /test-forbidden
# Verify: Permission denied or appropriate error

Level 7: Integration Testing

What to test:

Commands work with other plugin components
Commands interact correctly with each other
State management works across invocations
Workflow commands execute in sequence

Test scenarios:

Scenario 1: Command + Hook Integration

# Setup: Command that triggers a hook
# Test: Invoke command, verify hook executes

# Command: .claude/commands/risky-operation.md
# Hook: PreToolUse that validates the operation

> /risky-operation
# Verify: Hook executes and validates before command completes

Scenario 2: Command Sequence

# Setup: Multi-command workflow
> /workflow-init
# Verify: State file created

> /workflow-step2
# Verify: State file read, step 2 executes

> /workflow-complete
# Verify: State file cleaned up

Scenario 3: Command + MCP Integration

# Setup: Command uses MCP tools
# Test: Verify MCP server accessible

> /mcp-command
# Verify:
# 1. MCP server starts (if stdio)
# 2. Tool calls succeed
# 3. Results included in output

Automated Testing Approaches

Command Test Suite

Create a test suite script:

#!/bin/bash
# test-commands.sh - Command test suite

TEST_DIR=".claude/commands"
FAILED_TESTS=0

echo "Command Test Suite"
echo "=================="
echo

for cmd_file in "$TEST_DIR"/*.md; do
  cmd_name=$(basename "$cmd_file" .md)
  echo "Testing: $cmd_name"

  # Validate structure
  if ./validate-command.sh "$cmd_file"; then
    echo "  ✓ Structure valid"
  else
    echo "  ✗ Structure invalid"
    ((FAILED_TESTS++))
  fi

  # Validate frontmatter
  if ./validate-frontmatter.sh "$cmd_file"; then
    echo "  ✓ Frontmatter valid"
  else
    echo "  ✗ Frontmatter invalid"
    ((FAILED_TESTS++))
  fi

  echo
done

echo "=================="
echo "Tests complete"
echo "Failed: $FAILED_TESTS"

exit $FAILED_TESTS

Pre-Commit Hook

Validate commands before committing:

#!/bin/bash
# .git/hooks/pre-commit

echo "Validating commands..."

COMMANDS_CHANGED=$(git diff --cached --name-only | grep "\.claude/commands/.*\.md")

if [ -z "$COMMANDS_CHANGED" ]; then
  echo "No commands changed"
  exit 0
fi

for cmd in $COMMANDS_CHANGED; do
  echo "Checking: $cmd"

  if ! ./scripts/validate-command.sh "$cmd"; then
    echo "ERROR: Command validation failed: $cmd"
    exit 1
  fi
done

echo "✓ All commands valid"

Continuous Testing

Test commands in CI/CD:

# .github/workflows/test-commands.yml
name: Test Commands

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      - name: Validate command structure
        run: |
          for cmd in .claude/commands/*.md; do
            echo "Testing: $cmd"
            ./scripts/validate-command.sh "$cmd"
          done

      - name: Validate frontmatter
        run: |
          for cmd in .claude/commands/*.md; do
            ./scripts/validate-frontmatter.sh "$cmd"
          done

      - name: Check for TODOs
        run: |
          if grep -r "TODO" .claude/commands/; then
            echo "ERROR: TODOs found in commands"
            exit 1
          fi

Edge Case Testing

Test Edge Cases

Empty arguments:

> /cmd ""
> /cmd '' ''

Special characters:

> /cmd "arg with spaces"
> /cmd arg-with-dashes
> /cmd arg_with_underscores
> /cmd arg/with/slashes
> /cmd 'arg with "quotes"'

Long arguments:

> /cmd $(python -c "print('a' * 10000)")

Unusual file paths:

> /cmd ./file
> /cmd ../file
> /cmd ~/file
> /cmd "/path with spaces/file"

Bash command edge cases:

# Commands that might fail
!`exit 1`
!`false`
!`command-that-does-not-exist`

# Commands with special output
!`echo ""`
!`cat /dev/null`
!`yes | head -n 1000000`

Performance Testing

Response Time Testing

#!/bin/bash
# test-command-performance.sh

COMMAND="$1"

echo "Testing performance of /$COMMAND"
echo

for i in {1..5}; do
  echo "Run $i:"
  START=$(date +%s%N)

  # Invoke command (manual step - record time)
  echo "  Invoke: /$COMMAND"
  echo "  Start time: $START"
  echo "  (Record end time manually)"
  echo
done

echo "Analyze results:"
echo "  - Average response time"
echo "  - Variance"
echo "  - Acceptable threshold: < 3 seconds for fast commands"

Resource Usage Testing

# Monitor Claude Code during command execution
# In terminal 1:
claude --debug

# In terminal 2:
watch -n 1 'ps aux | grep claude'

# Execute command and observe:
# - Memory usage
# - CPU usage
# - Process count

User Experience Testing

Usability Checklist

Command name is intuitive
Description is clear in /help
Arguments are well-documented
Error messages are helpful
Output is formatted readably
Long-running commands show progress
Results are actionable
Edge cases have good UX

User Acceptance Testing

Recruit testers:

# Testing Guide for Beta Testers

## Command: /my-new-command

### Test Scenarios

1. **Basic usage:**
   - Run: `/my-new-command`
   - Expected: [describe]
   - Rate clarity: 1-5

2. **With arguments:**
   - Run: `/my-new-command arg1 arg2`
   - Expected: [describe]
   - Rate usefulness: 1-5

3. **Error case:**
   - Run: `/my-new-command invalid-input`
   - Expected: Helpful error message
   - Rate error message: 1-5

### Feedback Questions

1. Was the command easy to understand?
2. Did the output meet your expectations?
3. What would you change?
4. Would you use this command regularly?

Testing Checklist

Before releasing a command:

Structure

File in correct location
Correct .md extension
Valid YAML frontmatter (if present)
Markdown syntax correct

Functionality

Command appears in /help
Description is clear
Command executes without errors
Arguments work as expected
File references work
Bash execution works (if used)

Edge Cases

Missing arguments handled
Invalid arguments detected
Non-existent files handled
Special characters work
Long inputs handled

Integration

Works with other commands
Works with hooks (if applicable)
Works with MCP (if applicable)
State management works

Quality

Performance acceptable
No security issues
Error messages helpful
Output formatted well
Documentation complete

Distribution

Tested by others
Feedback incorporated
README updated
Examples provided

Debugging Failed Tests

Common Issues and Solutions

Issue: Command not appearing in /help

# Check file location
ls -la .claude/commands/my-command.md

# Check permissions
chmod 644 .claude/commands/my-command.md

# Check syntax
head -n 20 .claude/commands/my-command.md

# Restart Claude Code
claude --debug

Issue: Arguments not substituting

# Verify syntax
grep '\$1' .claude/commands/my-command.md
grep '\$ARGUMENTS' .claude/commands/my-command.md

# Test with simple command first
echo "Test: \$1 and \$2" > .claude/commands/test-args.md

Issue: Bash commands not executing

# Check allowed-tools
grep "allowed-tools" .claude/commands/my-command.md

# Verify command syntax
grep '!\`' .claude/commands/my-command.md

# Test command manually
date
echo "test"

Issue: File references not working

# Check @ syntax
grep '@' .claude/commands/my-command.md

# Verify file exists
ls -la /path/to/referenced/file

# Check permissions
chmod 644 /path/to/referenced/file

Best Practices

Test early, test often: Validate as you develop
Automate validation: Use scripts for repeatable checks
Test edge cases: Don't just test the happy path
Get feedback: Have others test before wide release
Document tests: Keep test scenarios for regression testing
Monitor in production: Watch for issues after release
Iterate: Improve based on real usage data

14 KiB Raw Permalink Blame History

Command Testing Strategies

Overview

Testing Levels

Level 1: Syntax and Structure Validation

Level 2: Frontmatter Field Validation

Level 3: Manual Command Invocation

Level 4: Argument Testing

Level 5: File Reference Testing

Level 6: Bash Execution Testing

Level 7: Integration Testing

Automated Testing Approaches

Command Test Suite

Pre-Commit Hook

Continuous Testing

Edge Case Testing

Test Edge Cases

Performance Testing

Response Time Testing

Resource Usage Testing

User Experience Testing

Usability Checklist

User Acceptance Testing

Testing Checklist

Structure

Functionality

Edge Cases

Integration

Quality

Distribution

Debugging Failed Tests

Common Issues and Solutions

Best Practices

14 KiB

Raw Permalink Blame History