Execute approved sprint plans with test-driven development, continuous linting, progress tracking, and pause points.
Execute an approved sprint plan with continuous progress tracking, testing, and documentation updates. Supports parallel execution of independent milestones using Task sub-agents for faster sprints.
Sequential execution (default):
# User says: "Execute the sprint plan in design_docs/20251019/M-S1.md"
# This skill will:
# 1. Validate prerequisites (tests pass, linting clean)
# 2. Create TodoWrite tasks for all milestones
# 3. Execute each milestone with test-driven development
# 4. Run checkpoint after each milestone (tests + lint)
# 5. Update CHANGELOG and sprint plan progressively
# 6. Pause after each milestone for user review
Parallel execution (for independent milestones):
# User says: "Execute sprint plan at docs/sprint-plans/M-FOO.md in parallel"
# This skill will:
# 1. Read sprint plan and identify all milestones
# 2. Analyze dependencies — group independent milestones for parallel execution
# 3. Spawn Task sub-agents per milestone (branch, TDD, implement, commit)
# 4. Act as integration agent — merge branches, run full test suite
# 5. Write sprint retrospective with timing and friction analysis
Invoke this skill when:
Choose parallel mode when:
Choose sequential mode when:
When invoked by the AILANG Coordinator (detected by GitHub issue reference in the prompt), you MUST output these markers at the end of your response:
IMPLEMENTATION_COMPLETE: true
BRANCH_NAME: coordinator/task-XXXX
FILES_CREATED: file1.go, file2.go
FILES_MODIFIED: file3.go, file4.go
Why? The coordinator uses these markers to:
Example completion:
## Implementation Complete
All milestones have been completed and tests pass.
**IMPLEMENTATION_COMPLETE**: true
**BRANCH_NAME**: `coordinator/task-abc123`
**FILES_CREATED**: `internal/new_file.go`, `internal/new_test.go`
**FILES_MODIFIED**: `internal/existing.go`
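The coordinator-side extraction of these markers can be sketched as follows. This is illustrative only — `parse_markers` and the regex are hypothetical helpers, not part of the skill or the coordinator's actual API:

```python
import re

# Matches the bold "**KEY**: value" marker lines the executor emits.
MARKER_RE = re.compile(
    r"\*\*(IMPLEMENTATION_COMPLETE|BRANCH_NAME|FILES_CREATED|FILES_MODIFIED)\*\*:\s*(.+)"
)

def parse_markers(response: str) -> dict:
    """Extract completion markers from an executor response into a dict."""
    out = {}
    for key, value in MARKER_RE.findall(response):
        value = value.strip().strip("`")
        if key == "IMPLEMENTATION_COMPLETE":
            out[key] = value.lower() == "true"          # boolean flag
        elif key in ("FILES_CREATED", "FILES_MODIFIED"):
            # Comma-separated, possibly backtick-quoted file lists
            out[key] = [v.strip().strip("`") for v in value.split(",")]
        else:
            out[key] = value                             # e.g. branch name
    return out
```

A missing or malformed marker would simply be absent from the result, which the coordinator can treat as "not complete".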
Pure functions (`IsPure: true`) MUST be tested with `-count=20` — single-pass tests hide Go map iteration nondeterminism. Use realistic inputs, not toy examples. See milestone_checklist.md for details.

Sprint execution can now span multiple Claude Code sessions!
Based on Anthropic's long-running agent patterns, sprint-executor implements the "Coding Agent" pattern:
- **Session Startup Routine**: Every session starts with `session_start.sh` (reads `.ailang/state/sprints/sprint_<id>.json`)
- **Structured Progress Tracking**: JSON file tracks state — `passes: true/false/null` (follows the "constrained modification" pattern)
- **Pause and Resume**: Work can be interrupted at any time — status is one of `not_started`, `in_progress`, `paused`, `completed`

For JSON schema details, see `resources/json_progress_schema.md`.
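Under the constrained-modification pattern, a session flips only the fields it owns and leaves everything else untouched. A sketch, assuming the field names described above (`passes`, `completed`, `notes`, `status`) — `mark_milestone` is a hypothetical helper, not a real script:

```python
from datetime import datetime, timezone

VALID_STATUSES = {"not_started", "in_progress", "paused", "completed"}

def mark_milestone(state: dict, milestone_id: str, passed: bool, notes: str) -> dict:
    """Record one checkpoint result without touching unrelated fields."""
    for m in state["milestones"]:
        if m["id"] == milestone_id:
            m["passes"] = passed          # constrained: null -> true/false only
            m["completed"] = datetime.now(timezone.utc).isoformat()
            m["notes"] = notes
            break
    # Promote sprint status only once every milestone has passed.
    if all(m["passes"] for m in state["milestones"]):
        state["status"] = "completed"
    assert state["status"] in VALID_STATUSES
    return state
```

The caller would load and re-save `.ailang/state/sprints/sprint_<id>.json` around this update.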
### `scripts/session_start.sh <sprint_id>` (NEW)

Resume sprint execution across multiple sessions.
When to use: ALWAYS at the start of EVERY session continuing a sprint.
What it does:
- Runs `ailang messages import-github`

### `scripts/validate_prerequisites.sh`

Validate prerequisites before starting sprint execution.
What it checks:
- Runs `ailang messages import-github`

### `scripts/validate_sprint_json.sh <sprint_id>` (NEW)

REQUIRED before starting any sprint. Validates that sprint JSON has real milestones (not placeholders).
What it checks:
- Milestone IDs are real (not the placeholder `MILESTONE_ID`)
- Acceptance criteria are real (not the placeholder `Criterion 1/2`)

Exit codes:

- `0` - Valid JSON, ready for execution
- `1` - Invalid JSON or placeholders detected (sprint-planner must fix)

### `scripts/milestone_checkpoint.sh <milestone_name> [sprint_id]`

Run checkpoint after completing a milestone.
CRITICAL: Tests passing ≠ Feature working! This script verifies with REAL DATA.
What it does:
- Runs `make test` and `make lint`
- Sprint-specific verification (e.g., M-TASK-HIERARCHY):
Example:
# Verify M1 (should pass if entity sync works)
.claude/skills/sprint-executor/scripts/milestone_checkpoint.sh M1 M-TASK-HIERARCHY
# Verify M2 (will fail if OTEL attributes not propagated)
.claude/skills/sprint-executor/scripts/milestone_checkpoint.sh M2 M-TASK-HIERARCHY
Exit codes:
- `0` - Checkpoint passed (all verifications succeeded)
- `1` - Checkpoint FAILED (DO NOT mark milestone complete)

### `scripts/acceptance_test.sh <milestone_id> <test_type>` (NEW)

Run end-to-end acceptance tests (parser, builtin, examples, REPL, e2e).
### `scripts/finalize_sprint.sh <sprint_id> [version]` (NEW)

Finalize a completed sprint by moving design docs and updating status.
What it does:
- Moves design docs from `planned/` to `implemented/<version>/`
- Updates sprint status to reflect `implemented/<version>/`

**When to use**: After all milestones pass and sprint is complete.
Example:
.claude/skills/sprint-executor/scripts/finalize_sprint.sh M-BUG-RECORD-UPDATE-INFERENCE v0_4_9
If this is NOT the first session for this sprint:
# ALWAYS run session_start.sh first!
.claude/skills/sprint-executor/scripts/session_start.sh <sprint-id>
This prints a "Here's where we left off" summary. Then skip to Phase 2 and continue with the next milestone.
- Run `validate_sprint_json.sh <sprint-id>` (REQUIRED FIRST)
- Load sprint state (`.ailang/state/sprints/sprint_<id>.json`)
- Run `validate_prerequisites.sh` (tests, linting, git status)

After Phase 1 initialization, choose between sequential or parallel execution based on milestone dependencies.
Decision criteria:
For each milestone:
- Mark the milestone `in_progress` in TodoWrite
- Run `milestone_checkpoint.sh <milestone-name>` (tests + lint must pass)
- Update `changelogs/` — find the active file with `ls changelogs/ | grep current`
- Add `examples/runnable/<feature>.ail` and update `examples/manifest.json` if adding examples
- Set `passes: true/false` in `.ailang/state/sprints/sprint_<id>.json`
- Record `completed: "<ISO timestamp>"` and `notes: "<summary of what was done>"`

After all milestones complete, proceed to Phase 4: Finalize Sprint.
Quick tips:
- Use `internal/parser/test_helpers.go` for parser tests
- Set `DEBUG_PARSER=1` for token flow tracing
- Run `make doc PKG=<package>` for API discovery

Use this mode when independent milestones can be developed concurrently using Task sub-agents.
Read the sprint plan and build a dependency graph from the milestones:
For each milestone M:
- Parse M.dependencies[] from sprint JSON
- Identify files M will touch (use Grep to scope relevant files)
- Check for file-level conflicts between milestones
Group milestones into parallelizable waves:
Wave 1: [M1, M3, M5] ← no dependencies, no shared files
Wave 2: [M2, M4] ← depend on Wave 1 results
Wave 3: [M6] ← depends on Wave 2 results
Report the dependency graph to the user before proceeding:
Dependency Analysis:
Wave 1 (parallel): M1 (parser), M3 (stdlib), M5 (docs)
Wave 2 (parallel): M2 (depends on M1), M4 (depends on M3)
Wave 3 (sequential): M6 (depends on M2 + M4)
Estimated speedup: ~2.5x over sequential execution
Proceed with parallel execution? [y/n]
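The wave grouping above amounts to a topological leveling of the dependency graph: a milestone runs in the first wave after all of its dependencies have run. A simplified sketch (it ignores cycle detection and the file-conflict check from the Grep analysis, both of which a real implementation would need):

```python
def plan_waves(deps: dict[str, list[str]]) -> list[list[str]]:
    """Group milestones into parallelizable waves by dependency depth."""
    level: dict[str, int] = {}

    def depth(m: str) -> int:
        # A milestone's wave index is 1 + the deepest of its dependencies.
        if m not in level:
            level[m] = 0 if not deps[m] else 1 + max(depth(d) for d in deps[m])
        return level[m]

    for m in deps:
        depth(m)
    waves: list[list[str]] = [[] for _ in range(max(level.values()) + 1)]
    for m, lvl in sorted(level.items()):
        waves[lvl].append(m)
    return waves
```

On the example graph above (`M2` depends on `M1`, `M4` on `M3`, `M6` on both `M2` and `M4`), this yields exactly the three waves shown in the dependency report.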
For each parallelizable wave, spawn one Task sub-agent per milestone in a single message (so they execute concurrently). Each sub-agent receives a detailed prompt:
# Example: spawning Wave 1 in parallel (all in ONE message with multiple Task calls)
Task(
description=f"Sprint milestone {milestone.id}",
subagent_type="general-purpose",
prompt=f"""
You are executing milestone {milestone.id}: {milestone.description}
Sprint plan: {plan_path}
## Your Scope
Branch: sprint/{milestone_slug}
Relevant files (from dependency analysis): {milestone.relevant_files}
## MANDATORY: Test-Driven Development
**Step 1: Create branch**
```bash
git checkout -b sprint/{milestone_slug}
```
**Step 2: Write FAILING tests FIRST**
For each acceptance criterion in the milestone:
- Write a test that captures the expected behavior
- Run `make test` — confirm the new tests FAIL
- Do NOT proceed to implementation until you have failing tests
Acceptance criteria:
{milestone.acceptance_criteria}
**Step 3: Implement until all tests pass**
- Write the minimum code to make each failing test pass
- Run `make test` after each change
- Run `make lint` — fix any issues immediately
- Keep functions small and focused (<50 lines)
**Step 4: Commit with milestone reference**
```bash
git add <specific-files>
git commit -m "Complete {milestone.id}: {milestone.description}
Acceptance criteria met:
{acceptance_criteria_checklist}
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>"
```
**Step 5: Report back**
At the END of your response, output this EXACT format:
```
MILESTONE_REPORT:
milestone_id: {milestone.id}
status: success|failure
tests_passed: <count>
tests_failed: <count>
files_changed: <comma-separated list>
branch: sprint/{milestone_slug}
notes: <brief summary of what was done>
```
## RULES
- Do NOT skip the failing-test-first step
- Do NOT commit with failing tests
- Do NOT modify files outside your scope: {milestone.relevant_files}
- If blocked, report status: failure with clear explanation
"""
)
Critical rules for sub-agent spawning:
- Spawn all sub-agents for a wave in ONE message with multiple `Task` calls
- Give each sub-agent its own branch (`sprint/<milestone-slug>`)
- Have sub-agents use `Grep` to verify they're only reading relevant files before editing

After all sub-agents in a wave complete, parse their MILESTONE_REPORT blocks:
{
"wave": 1,
"results": [
{"milestone_id": "M1", "status": "success", "tests_passed": 12, "tests_failed": 0, "branch": "sprint/m1-parser-fix"},
{"milestone_id": "M3", "status": "success", "tests_passed": 8, "tests_failed": 0, "branch": "sprint/m3-stdlib-list"},
{"milestone_id": "M5", "status": "failure", "tests_passed": 3, "tests_failed": 2, "branch": "sprint/m5-docs"}
]
}
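Turning a sub-agent's `MILESTONE_REPORT` block into one of these result entries can be sketched as follows — `parse_milestone_report` is a hypothetical helper, not part of the skill's scripts:

```python
def parse_milestone_report(text: str) -> dict:
    """Parse the key: value lines following MILESTONE_REPORT: into a dict."""
    lines = text.splitlines()
    start = lines.index("MILESTONE_REPORT:") + 1   # raises if marker is missing
    report: dict = {}
    for line in lines[start:]:
        if ":" not in line:
            break                                  # end of the report block
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key in ("tests_passed", "tests_failed") and value.isdigit():
            value = int(value)                     # counts become integers
        elif key == "files_changed":
            value = [f.strip() for f in value.split(",")]
        report[key] = value
    return report
```

A report with a missing `MILESTONE_REPORT:` marker raises, which the integrator can treat as `status: failure` for that sub-agent.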
If any milestone in a wave failed:
If all milestones in a wave succeeded:
Execute waves sequentially (Wave 1 → integrate → Wave 2 → integrate → ...). Within each wave, milestones run in parallel.
After parallel milestones complete, act as the integration agent.
This phase merges all milestone branches and verifies the combined result. The executor (you) performs this directly — do NOT delegate integration to sub-agents.
# Start from the base branch (dev or main)
git checkout dev
git checkout -b sprint/integration
Merge each successful milestone branch into sprint/integration, one at a time:
# Merge in dependency order (Wave 1 first, then Wave 2, etc.)
git merge sprint/m1-parser-fix --no-ff -m "Integrate M1: parser fix"
git merge sprint/m3-stdlib-list --no-ff -m "Integrate M3: stdlib list"
git merge sprint/m5-docs --no-ff -m "Integrate M5: docs"
If a merge conflict occurs:
- Run `make test` after resolution to verify nothing broke

After all merges, run the full verification:

```bash
make test       # ALL tests must pass
make lint       # ALL linting must pass
make fmt-check  # Formatting must be clean
```
If integration tests fail:
- Use `git bisect` or selective reverts:

```bash
# Revert the last merge to isolate the failure
git revert -m 1 HEAD
make test
# If tests pass now, the reverted milestone caused the failure
```
After successful integration, update .ailang/state/sprints/sprint_<id>.json:
- Set `passes: true` for each integrated milestone
- Record `actual_loc` from `git diff` stats
- Add `completed` timestamps
- Set `status: "in_progress"` (or `"completed"` if all waves are done)

After ALL waves are integrated and tests pass, write a retrospective.
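The `actual_loc` figure can be derived from `git diff --shortstat` output. A parsing sketch — the branch names and the subprocess invocation in the usage comment are illustrative:

```python
import re

def loc_from_shortstat(shortstat: str) -> int:
    """Sum insertions and deletions from `git diff --shortstat` output,
    e.g. ' 4 files changed, 185 insertions(+), 12 deletions(-)'."""
    ins = re.search(r"(\d+) insertion", shortstat)
    dels = re.search(r"(\d+) deletion", shortstat)
    return (int(ins.group(1)) if ins else 0) + (int(dels.group(1)) if dels else 0)

# Usage (illustrative):
#   stat = subprocess.run(
#       ["git", "diff", "--shortstat", "dev...sprint/m1-parser-fix"],
#       capture_output=True, text=True,
#   ).stdout
#   actual_loc = loc_from_shortstat(stat)
```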
Create docs/sprint-retros/<sprint-id>-retro.md with:
# Sprint Retrospective: <sprint-id>
## Summary
- **Sprint**: <sprint-id>
- **Duration**: <actual days> (estimated: <planned days>)
- **Execution mode**: Parallel (Wave count: N)
- **Total milestones**: N (passed: X, failed: Y)
## Milestone Timing
| Milestone | Estimated LOC | Actual LOC | Time Taken | Status |
|-----------|--------------|------------|------------|--------|
| M1 | 200 | 185 | 45 min | ✅ |
| M2 | 150 | 210 | 1h 10min | ✅ |
| ... | | | | |
## Parallelization Results
- **Waves executed**: 3
- **Max parallelism**: 3 agents (Wave 1)
- **Speedup vs sequential**: ~2.1x
- **Integration conflicts**: 1 (M1 + M3 shared `types.go`)
## Friction Encountered
- <Description of any blockers, unexpected complexity, or tooling gaps>
- <Merge conflicts and how they were resolved>
- <Sub-agent failures and root causes>
## Recommendations for Next Sprint
- <Suggestions for improving parallelization>
- <Files that should NOT be parallelized (high conflict risk)>
- <DX improvements identified during execution>
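One way to estimate the "Speedup vs sequential" figure: sequential time is the sum of all milestone timings, while parallel time is the critical path — the sum of each wave's longest milestone. A sketch, with timings in minutes:

```python
def speedup(wave_timings: list[list[float]]) -> float:
    """Estimated parallel speedup: total work / critical-path time."""
    sequential = sum(sum(wave) for wave in wave_timings)   # everything in series
    parallel = sum(max(wave) for wave in wave_timings)     # longest task per wave
    return sequential / parallel

# Example: Wave 1 ran M1/M3/M5 (45, 30, 40 min), Wave 2 ran M2/M4 (70, 50),
# Wave 3 ran M6 (35). Sequential = 270 min, parallel = 45 + 70 + 35 = 150 min.
```

This ignores integration and merge time, so it is an upper bound on the real speedup.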
Note: Documentation verification, design doc moves, and quality checks are handled by the sprint-evaluator in Phase 5. The executor focuses on implementation — the evaluator judges it.
CRITICAL: After finalizing a sprint, ALWAYS hand off to sprint-evaluator for independent quality assessment.
Based on Anthropic's generator-evaluator architecture — separating the agent doing the work from the agent judging it is "a strong lever" for quality.
This is the standard workflow:
Send handoff message:
ailang messages send sprint-evaluator '{
"type": "implementation_complete",
"correlation_id": "eval_<sprint-id>_<date>",
"sprint_id": "<sprint-id>",
"branch_name": "<branch>",
"sprint_json_path": ".ailang/state/sprints/sprint_<id>.json",
"design_doc_path": "<path to design doc>",
"files_created": [...],
"files_modified": [...],
"evaluation_round": 1
}' --title "Sprint <sprint-id> ready for evaluation" --from "sprint-executor"
Why separate evaluation?
If session starts with an evaluation_feedback message:
- Read it with `ailang messages read MSG_ID`
- Re-run `milestone_checkpoint.sh` for affected milestones, then `finalize_sprint.sh`
- As always: `make test` after every file change, `make lint` after implementation, `make fmt` for formatting

Uses `ailang messages` for GitHub sync and issue tracking!
Automatic sync:
- `session_start.sh` and `validate_prerequisites.sh` run `ailang messages import-github` first

If `github_issues` is set in sprint JSON:
- `validate_sprint_json.sh` shows linked issues
- `session_start.sh` displays issue titles from messages
- `milestone_checkpoint.sh` reminds you to include `Refs #...` in commits
- `finalize_sprint.sh` suggests a commit message with issue references

Commit message format:
# During development - use "refs" to LINK without closing
git commit -m "Complete M1: Parser foundation, refs #17"
# Final sprint commit - use "Fixes" to AUTO-CLOSE issues on merge
git commit -m "Finalize sprint M-BUG-FIX
Fixes #17
Fixes #42"
Important: "refs" vs "Fixes"
- `refs #17` - Links commit to issue (NO auto-close)
- `Fixes #17`, `Closes #17`, `Resolves #17` - AUTO-CLOSES the issue when merged

Workflow:
- Sprint JSON carries `github_issues: [17, 42]` (set by sprint-planner, deduplicated)
- Use `refs #17` to link commits without closing
- Use `Fixes #17` to auto-close issues on merge
- `ailang messages import-github` checks existing issues before importing

Resources:

- `resources/json_progress_schema.md` - Sprint progress format
- Run `session_start.sh` ALWAYS for continuing sprints
- `resources/parser_patterns.md` - Parser development + pattern matching pipeline
- `resources/codegen_patterns.md` - Go code generation for new features/builtins
- `resources/api_patterns.md` - Common constructor signatures and API gotchas
- `resources/developer_tools.md` - Make targets, ailang commands, workflows
- `resources/dx_improvement_patterns.md` - Identifying and implementing DX wins
- `resources/dx_quick_reference.md` - ROI calculator, decision matrix
- `resources/milestone_checklist.md` - Step-by-step per milestone

This skill loads information progressively:
- `scripts/` directory (validation, checkpoints, testing)
- `resources/` directory (detailed guides, patterns, references)

Defaults: the base branch is `dev` (or specified in the sprint plan).

Quick fixes:

- Placeholder milestones → run `scripts/validate_sprint_json.sh <sprint-id>`
- Formatting failures → run `make fmt`
- Sub-agent failure → inspect the `MILESTONE_REPORT` from the failed sub-agent
- Merge conflicts → resolve on `sprint/integration`, then `make test` after each conflict resolution

The sprint-executor skill integrates with the AILANG Coordinator for automated workflows.
When configured in ~/.ailang/config.yaml, the sprint-executor agent:
coordinator:
agents:
- id: sprint-executor
inbox: sprint-executor
workspace: /path/to/ailang
capabilities: [code, test, docs]
trigger_on_complete: [sprint-evaluator] # Evaluator judges implementation
auto_approve_handoffs: false
auto_merge: false
session_continuity: true
max_concurrent_tasks: 1
The sprint-executor receives:
{
"type": "plan_ready",
"correlation_id": "sprint_M-CACHE_20251231",
"sprint_id": "M-CACHE",
"plan_path": "design_docs/planned/v0_6_3/m-cache-sprint-plan.md",
"progress_path": ".ailang/state/sprints/sprint_M-CACHE.json",
"session_id": "claude-session-xyz",
"estimated_duration": "3 days",
"total_loc_estimate": 650
}
With session_continuity: true:
- Reuses the `session_id` from the sprint-planner handoff
- Resumes via `--resume SESSION_ID` for the Claude Code CLI

With `auto_merge: false`:
As the last agent in the chain:
- `trigger_on_complete: []` means no automatic handoff
- Sets `status: completed` in `.ailang/state/sprints/sprint_<id>.json`
- Moves design docs to the `implemented/` directory
- Merges the `sprint/integration` branch
- Writes a retrospective to `docs/sprint-retros/` after parallel sprints complete