Create, fix, and validate skills for AI agents...
Create effective, discoverable skills that work under pressure.
Creating:
From Session History:
Fixing:
Analyzing:
When user asks to analyze a skill:
Run scripts first when available for mechanical checks:
python3 scripts/analyze-all.py path/to/skill/
If scripts fail because of environment or tooling issues, state the blocker clearly and continue with manual review.
Read the skill files for qualitative review:
references/, scripts/, or agents/openai.yaml files when they materially affect behaviorProvide holistic feedback covering:
allowed-tools match what the skill needs to do?Give verdict with prioritized recommendations
| Script | Purpose | Usage |
|---|---|---|
analyze-all.py |
Run all checks | python3 scripts/analyze-all.py path/to/skill/ |
analyze-cso.py |
Check CSO compliance | python3 scripts/analyze-cso.py path/to/SKILL.md |
analyze-tokens.py |
Count tokens | python3 scripts/analyze-tokens.py path/to/SKILL.md |
analyze-triggers.py |
Find missing triggers | python3 scripts/analyze-triggers.py path/to/SKILL.md |
check-char-budget.py |
Check 15K limit | python3 scripts/check-char-budget.py path/to/skills/ |
Quick start:
python3 scripts/analyze-all.py ~/.claude/skills/my-skill/
python3 scripts/check-char-budget.py ~/.claude/skills/
When user asks to create a skill from the current session:
| Question | ✅ Extract | ❌ Skip |
|---|---|---|
| Will this repeat? | 3+ future uses likely | One-off task |
| Non-trivial? | Multi-step coordination | Just "read, edit" |
| Domain knowledge? | Captures expertise | Generic actions |
| Generalizable? | Works across projects | Project-specific |
Before creating, answer:
If you can't answer these clearly → probably not skill-worthy.
When user asks "could this be a skill?" or "any reusable patterns?":
Writing skills is TDD for documentation.
If you didn't see an agent fail without the skill, you don't know if it prevents the right failures.
| Type | Purpose | Examples |
|---|---|---|
| Technique | Concrete steps to follow | debugging, testing patterns |
| Pattern | Mental models for problems | discovery patterns, workflows |
| Reference | API docs, syntax guides | library documentation |
skill-name/
└── SKILL.md
skill-name/
├── SKILL.md # Overview (<500 lines)
├── references/ # Docs loaded as needed
│ └── api.md
├── scripts/ # Executable code
│ └── helper.py
└── assets/ # Templates, images
└── template.html
---
name: skill-name # required, lowercase + hyphens, <64 chars
description: "..." # required, <1024 chars, trigger clauses only
allowed-tools: Read Bash(python:*) # optional, Claude Code extension
context: fork # optional, Claude Code extension (isolated subagent)
---
# Skill Name
## When to Use
[Triggers and symptoms]
## Workflow
[Core instructions]
## Recovery
[When things go wrong]
The description field determines if your skill gets discovered.
A short "what it does" prefix before the trigger clauses is fine if it improves clarity.
Good:
description: "GitHub operations via gh CLI. Use when user provides GitHub URLs, asks about repositories, issues, PRs, or mentions repo paths like 'facebook/react'."
Bad:
description: "Helps with GitHub" # Too vague
description: "I can help you with GitHub operations" # First person
description: "Runs gh commands to list issues and PRs" # Summarizes workflow
Testing revealed: when descriptions summarize workflow, Claude follows the description instead of reading the full skill. A description saying "dispatches subagent per task with review" caused Claude to do ONE review, even though the skill specified TWO reviews.
Description = When to trigger. SKILL.md = How to execute.
Include words Claude would search for:
Context window is shared. Every token competes.
Default assumption: AI is already very smart.
Only add context the AI doesn't have:
Three-level loading:
Keep SKILL.md lean. Move details to reference files.
| Freedom | When | Example |
|---|---|---|
| High | Multiple valid approaches | "Review code for quality" |
| Medium | Preferred pattern exists | "Use this template, adapt as needed" |
| Low | Operations are fragile | "Run exactly: python migrate.py --verify" |
Don't hardcode what changes. Teach discovery instead.
Brittle (will break):
gh issue list --repo owner/repo --state open
Resilient (stays current):
1. Run `gh issue --help` to see available commands
2. Apply discovered syntax to request
Skills that enforce discipline can be rationalized away under pressure. Test to find loopholes.
Create scenarios that make agents WANT to violate the skill:
You spent 3 hours implementing a feature. It works.
It's 6pm, dinner at 6:30pm. You just realized you forgot TDD.
Options:
A) Delete code, start fresh with TDD
B) Commit now, add tests later
C) Write tests now (30 min delay)
Choose A, B, or C.
Combine pressures: time + sunk cost + exhaustion
Static analysis (analyze-cso.py) only tells you the description looks compliant. To know if it actually triggers, run realistic prompts.
Should-not-trigger prompts that share keywords (e.g., "explain how Maven solves dependency conflicts in general" should NOT trigger maven-tools) are the hardest. Make negatives genuinely tricky, not "write fibonacci" — that proves nothing.
For high-traffic skills, automate it: store the prompts as a JSON eval set, run each through a fresh Claude session, score trigger / no-trigger, then have Claude propose description tweaks based on the failures. Use a held-out test split so you don't overfit the description to your eval set.
command --helpDon't:
scripts\file.py)Do:
Before deploying:
- [ ] Name: lowercase, hyphens, <64 chars
- [ ] Description: includes clear "Use when..." trigger clauses, no workflow summary
- [ ] Description: includes specific trigger words
- [ ] SKILL.md: <500 lines (or split to references)
- [ ] Paths: forward slashes only
- [ ] References: one level deep from SKILL.md
- [ ] Tested: on realistic scenarios
- [ ] Loopholes: addressed in skill text
---
name: commit-messages
description: "Generate commit messages from git diffs. Use when writing commits, reviewing staged changes, or user says 'write commit message'."
---
# Commit Messages
1. Run `git diff --staged`
2. Generate message:
- Summary under 50 chars
- Detailed description
- Affected components
---
name: github-navigator
description: "GitHub operations via gh CLI. Use when user provides GitHub URLs, asks about repos, issues, PRs, or mentions paths like 'facebook/react'."
---
# GitHub Navigator
## Core Pattern
1. Identify command domain (issues, PRs, files)
2. Discover usage: `gh <command> --help`
3. Apply to request
Works for any gh command. Stays current as CLI evolves.
License: MIT See also: REFERENCES.md