This skill should be used when the user asks to "write agent spec", "define agent role", "specify an agent", "agent specification", "what should my agent do", "agent inputs outputs", "agent design...
Before writing a formal specification, work through the discovery questions to understand:
See references/discovery-questions.md for the complete discovery framework.
Specification issues cause 41.77% of multi-agent system (MAS) failures. Most failures stem from:
The 12 Factor Agents framework provides engineering principles that complement specification best practices.
LLMs transform natural language into structured tool calls (JSON). When specifying agents, define:
Tool Call Output Specification:
```json
{
  "tool_name": "create_task",
  "parameters": {
    "title": "string (required)",
    "priority": "high|medium|low",
    "assignee": "string (optional)"
  }
}
```
Key insight: The "magic" is in reliable structured output generation. Specify schemas explicitly.
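A minimal sketch of what "specify schemas explicitly" can look like in code: a validator that rejects any tool call not matching the `create_task` schema above. The function name and error-handling style are illustrative assumptions, not a prescribed API.

```python
# Validate an LLM's tool-call JSON against the create_task schema above.
# The validator function itself is an illustrative assumption.
import json

REQUIRED = {"title"}
ALLOWED_PRIORITIES = {"high", "medium", "low"}

def validate_create_task(raw: str) -> dict:
    """Parse and check a create_task tool call; raise ValueError on violation."""
    call = json.loads(raw)
    if call.get("tool_name") != "create_task":
        raise ValueError(f"unexpected tool: {call.get('tool_name')}")
    params = call.get("parameters", {})
    missing = REQUIRED - params.keys()
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    if params.get("priority", "medium") not in ALLOWED_PRIORITIES:
        raise ValueError(f"invalid priority: {params['priority']}")
    return params

params = validate_create_task(
    '{"tool_name": "create_task", "parameters": {"title": "Ship v2", "priority": "high"}}'
)
print(params["title"])  # Ship v2
```

Rejecting malformed calls at this boundary keeps schema drift from silently propagating into downstream agents.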
Principle: LLMs are pure functions (tokens in → tokens out). Don't let frameworks abstract away prompts.
When writing agent specifications:
Prompt Ownership Checklist:
Anti-pattern: "Let the framework handle prompts"
Pattern: "The prompt IS the agent specification"
Principle: "Tool-Use" is just JSON output + deterministic code execution. Demystify it.
A tool call is:
Tool Specification Template:
## Tool: [ToolName]
### Call Schema (Agent Output)
```json
{
  "action": "tool_name",
  "params": { ... }
}
```
### Execution (Deterministic Code)
[What happens when this JSON is received]
### Result Schema (Context Addition)
```json
{
  "success": boolean,
  "result": { ... },
  "error": { ... } | null
}
```
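The template above can be demystified in a few lines: the agent emits JSON, a dispatcher runs deterministic code, and the result schema is appended back to context. A hedged sketch follows; the tool name and handler are illustrative assumptions.

```python
# Sketch of the "tool-use" loop: JSON in, deterministic code, result out.
# The lookup_weather tool and its handler are invented for illustration.
import json

def lookup_weather(params):
    # Deterministic stand-in; a real handler would call an API here.
    return {"city": params["city"], "temp_c": 21}

HANDLERS = {"lookup_weather": lookup_weather}

def execute(call_json: str) -> dict:
    """Run one tool call and return a result matching the Result Schema."""
    call = json.loads(call_json)
    handler = HANDLERS.get(call["action"])
    if handler is None:
        return {"success": False, "result": None,
                "error": {"message": f"unknown tool: {call['action']}"}}
    try:
        return {"success": True, "result": handler(call["params"]), "error": None}
    except Exception as exc:
        return {"success": False, "result": None, "error": {"message": str(exc)}}

result = execute('{"action": "lookup_weather", "params": {"city": "Oslo"}}')
print(result["success"])  # True
```

Note that nothing here is "agentic": the only LLM contribution is the JSON string, which is exactly the point of the principle.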
Principle: Human-in-the-loop is a first-class pattern. Agents should request human input as a tool call.
Human Contact Tool Specification:
```json
{
  "action": "request_human_input",
  "params": {
    "question": "What should I do about X?",
    "context": "Relevant context for decision",
    "options": ["Option A", "Option B", "Escalate"],
    "urgency": "blocking|async",
    "timeout_action": "wait|default|escalate"
  }
}
```
Execution behavior:
When to include Human Contact Tool:
See references/spec-templates.md for the complete Human-in-the-Loop Agent Template.
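One way the `request_human_input` call above might be executed is as a blocking wait with a timeout fallback. This is a sketch under assumptions: the queue-based transport and the default-to-first-option behavior are illustrative, not part of the specification.

```python
# Hedged sketch: execute a request_human_input tool call as a blocking
# wait, applying timeout_action if no answer arrives. Transport is assumed.
import queue

def request_human_input(params, inbox: "queue.Queue", timeout_s: float = 30.0):
    """Return the human's answer, or apply timeout_action if none arrives."""
    try:
        return inbox.get(timeout=timeout_s)
    except queue.Empty:
        action = params.get("timeout_action", "wait")
        if action == "default":
            return params["options"][0]   # assumed: first option is the default
        if action == "escalate":
            return "Escalate"
        raise TimeoutError("still waiting for human input")

inbox = queue.Queue()
inbox.put("Option B")
answer = request_human_input({"options": ["Option A", "Option B"]}, inbox)
print(answer)  # Option B
```

Treating the human as just another tool keeps the agent loop uniform: every pause for approval is an ordinary tool call with a schema and a timeout policy.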
Every agent must have these five elements:
Define the agent's identity, not its tasks:
Bad:
"You are a planner helping the system."
Good:
"You are a Planner. You produce task graphs with explicit dependencies. You do not execute tasks. Output only JSON."
Enumerate all allowed information sources:
## Inputs
- User requirements (natural language)
- System state from blackboard (read-only)
- Configuration parameters
- Previous agent outputs (specifically: Validator output)
Define the exact output format:
## Outputs
- Task graph in JSON format
- Each task has: id, description, dependencies[], status
- No prose explanations in output
- Must include confidence score (0-1)
Explicitly state what the agent cannot do:
## Forbidden Actions
- DO NOT execute tasks (only plan)
- DO NOT modify system state directly
- DO NOT communicate with external services
- DO NOT make assumptions about missing inputs
Define measurable completion conditions:
## Success Criteria
- All user requirements mapped to tasks
- No circular dependencies in task graph
- Each task is atomic (single responsibility)
- Confidence score ≥ 0.8 for all tasks
# Agent: TaskPlanner
## Role
TaskPlanner decomposes user requirements into executable task graphs.
It analyzes requirements, identifies dependencies, and outputs structured plans.
It does not execute tasks or modify system state.
## Inputs
- User requirements (natural language, from orchestrator)
- Domain constraints (from configuration)
- Previous execution feedback (optional, from Verifier)
## Outputs
- JSON task graph with schema:
```json
{
  "tasks": [{
    "id": "string",
    "description": "string",
    "dependencies": ["task_id"],
    "estimated_complexity": "low|medium|high",
    "assigned_executor": "string|null"
  }],
  "metadata": {
    "total_tasks": number,
    "parallel_groups": number,
    "confidence": number
  }
}
```
## Forbidden Actions
- Executing any task
- Modifying external state
- Making API calls
- Assuming missing requirements
- Outputting non-JSON content
## Success Criteria
- All requirements have corresponding tasks
- No orphan tasks (all connected to graph)
- No circular dependencies
- Confidence ≥ 0.8
- Valid JSON output
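Two of the success criteria above ("no circular dependencies", "no orphan tasks") are mechanically checkable. A sketch of those checks, assuming the task-graph shape from the schema above:

```python
# Check TaskPlanner success criteria: no cycles, no orphan tasks.
# Task dicts follow the schema above (id, dependencies).

def has_cycle(tasks):
    """Depth-first search for a back-edge in the dependency graph."""
    deps = {t["id"]: set(t["dependencies"]) for t in tasks}
    visiting, done = set(), set()
    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True          # back-edge => circular dependency
        visiting.add(node)
        cyclic = any(visit(d) for d in deps.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return cyclic
    return any(visit(t) for t in deps)

def orphans(tasks):
    """Tasks with no dependencies that nothing else depends on."""
    if len(tasks) <= 1:
        return []
    referenced = {d for t in tasks for d in t["dependencies"]}
    return [t["id"] for t in tasks
            if not t["dependencies"] and t["id"] not in referenced]

plan = [{"id": "a", "dependencies": []}, {"id": "b", "dependencies": ["a"]}]
print(has_cycle(plan), orphans(plan))  # False []
```

Running such checks in a Verifier (rather than trusting the Planner's own output) is what makes the criteria enforceable instead of aspirational.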
A common failure is mixing WHAT agents do with HOW they interact.
Defines individual agent behavior:
Defines agent interactions:
Example coordination spec:
# Coordination Protocol
## Sequence
1. Orchestrator receives user request
2. Orchestrator dispatches to Planner
3. Planner outputs task graph to blackboard
4. Executors claim tasks (no conflicts - first-come)
5. Executors write results to blackboard
6. Verifier reads all results, produces verdict
7. If verdict = FAIL, Planner re-plans with feedback
8. If verdict = PASS, Orchestrator returns result
## State Ownership
- Blackboard: Orchestrator (read/write)
- Task claims: Executors (write own claims only)
- Results: Executors (write own results only)
- Verdicts: Verifier (write only)
## Conflict Resolution
- Task claim conflicts: First timestamp wins
- Result conflicts: Verifier decides
- Deadlock: Orchestrator timeout (30s) triggers re-plan
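The "first timestamp wins" rule above can be sketched as a small claim registry. The storage shape and function names are illustrative assumptions; comparing timestamps (rather than arrival order) handles out-of-order delivery of claims.

```python
# Sketch of the "first timestamp wins" task-claim rule. The registry
# and field names are illustrative, not a prescribed API.
claims: dict = {}   # task_id -> (timestamp, executor)

def claim_task(task_id: str, executor: str, timestamp: float) -> bool:
    """Record a claim; return True if this executor currently owns the task."""
    current = claims.get(task_id)
    if current is None or timestamp < current[0]:
        claims[task_id] = (timestamp, executor)   # earliest timestamp wins
    return claims[task_id][1] == executor

print(claim_task("t1", "exec-A", 10.0))  # True
print(claim_task("t1", "exec-B", 12.5))  # False: exec-A claimed first
```

An executor should re-check ownership before writing results, since a claim can be superseded by an earlier-timestamped one that arrives late.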
Bad (personality-based):
Good (function-based):
Each agent should have a distinct, non-overlapping function:
| Agent | Function | Cannot Do |
|---|---|---|
| Planner | Decompose tasks | Execute tasks |
| Executor | Execute tasks | Plan or verify |
| Verifier | Check results | Execute or plan |
| Critic | Find problems | Fix problems |
Require explicit acknowledgment of inputs:
## Input Acknowledgment (Required)
Before processing, explicitly state:
"Received: [input type] from [source]"
"Using: [specific artifact] version [X]"
Example:
"Received: Task graph from Planner"
"Using: Plan v3, 12 tasks, confidence 0.92"
Never rely on implicit shared context.
```json
{
  "blackboard": {
    "version": 7,
    "last_updated": "2026-01-14T10:30:00Z",
    "sections": {
      "plan": {
        "owner": "Planner",
        "readers": ["Executor", "Verifier"],
        "data": { }
      },
      "results": {
        "owner": "Executor",
        "readers": ["Verifier", "Orchestrator"],
        "data": { }
      }
    }
  }
}
```
Rule: If state is not inspectable, it is not real.
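The ownership rules in the blackboard schema above can be enforced mechanically: only a section's owner may write, only listed readers may read, and every write bumps the version so state stays inspectable. A minimal sketch, with the class shape assumed:

```python
# Sketch enforcing blackboard ownership: owner-only writes, reader-checked
# reads, versioned state. Section metadata follows the example schema.
class Blackboard:
    def __init__(self, sections):
        self.version = 0
        self.sections = sections   # name -> {"owner", "readers", "data"}

    def write(self, agent, section, data):
        meta = self.sections[section]
        if meta["owner"] != agent:
            raise PermissionError(f"{agent} may not write to {section}")
        meta["data"] = data
        self.version += 1          # every write is visible in the version

    def read(self, agent, section):
        meta = self.sections[section]
        if agent != meta["owner"] and agent not in meta["readers"]:
            raise PermissionError(f"{agent} may not read {section}")
        return meta["data"]

bb = Blackboard({"plan": {"owner": "Planner", "readers": ["Executor"], "data": {}}})
bb.write("Planner", "plan", {"tasks": []})
print(bb.read("Executor", "plan"), bb.version)  # {'tasks': []} 1
```

Raising on unauthorized access, rather than silently ignoring it, surfaces coordination bugs at the point where an agent violates its spec.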
For detailed specification patterns:
- references/discovery-questions.md - Complete discovery framework (memory, reliability, economics, tools, testing)
- references/spec-templates.md - Complete templates for common agent types (including Human-in-the-Loop)
- references/common-mistakes.md - Specification anti-patterns to avoid
- references/twelve-factor-agents.md - Quick reference for all 12 Factor Agents principles

Before specifying agents:
After specifying agents: