Generate AI agent personas with authentic backstories, demographics, and personality traits for TraitorSim...
Generate complete AI agent personas with realistic backstories, demographics, and OCEAN personality traits for the TraitorSim game. This skill orchestrates the 5-stage pipeline that uses Gemini Deep Research for demographic grounding and Claude Opus for narrative synthesis.
# Generate 15 personas (estimated cost: $6-7)
./scripts/generate_persona_library.sh --count 15
# Or run stages individually:
python scripts/generate_skeleton_personas.py --count 15
python scripts/batch_deep_research.py --input data/personas/skeletons/test_batch_001.json
python scripts/poll_research_jobs.py --jobs data/personas/jobs/test_batch_001_jobs.json
python scripts/synthesize_backstories.py --reports data/personas/reports/test_batch_001_reports.json
python scripts/validate_personas.py --library data/personas/library/test_batch_001_personas.json
The persona generation pipeline consists of 5 stages:
Stage 1: Skeleton Generation
Output: data/personas/skeletons/*.json
Stage 2: Deep Research Submission
Output: data/personas/jobs/*_jobs.json (job IDs)
Stage 3: Research Job Polling
Output: data/personas/reports/*_reports.json
Stage 4: Backstory Synthesis
Output: data/personas/library/*_personas.json
Stage 5: Validation
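The stage hand-offs above can be sketched as a command builder. This is a minimal illustration, not the contents of generate_persona_library.sh: the helper names `stage_commands` and `run_pipeline` are hypothetical, and the sketch assumes each script accepts the flags shown in the test-batch commands later in this document.

```python
import subprocess
from pathlib import PurePosixPath


def stage_commands(batch: str, count: int, root: str = "data/personas") -> list[list[str]]:
    """Build the five stage commands for one batch, in execution order."""
    base = PurePosixPath(root)
    skeletons = str(base / "skeletons" / f"{batch}.json")
    jobs = str(base / "jobs" / f"{batch}_jobs.json")
    reports = str(base / "reports" / f"{batch}_reports.json")
    library = str(base / "library" / f"{batch}_personas.json")
    return [
        ["python", "scripts/generate_skeleton_personas.py", "--count", str(count), "--output", skeletons],
        ["python", "scripts/batch_deep_research.py", "--input", skeletons, "--output", jobs],
        ["python", "scripts/poll_research_jobs.py", "--jobs", jobs, "--output", reports],
        ["python", "scripts/synthesize_backstories.py", "--reports", reports, "--output", library],
        ["python", "scripts/validate_personas.py", "--library", library],
    ]


def run_pipeline(batch: str, count: int) -> None:
    """Run the stages in order; check=True stops at the first failing stage."""
    for command in stage_commands(batch, count):
        subprocess.run(command, check=True)
```

Each stage reads the file the previous stage wrote, so a failure in any stage leaves the earlier outputs on disk for a later resume.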
When asked to generate a complete persona library:
Determine batch size (recommend 15-20 for testing, 100+ for production)
Run the master orchestration script:
./scripts/generate_persona_library.sh --count 15
Monitor quota limits:
Verify results:
When asked to add more personas to an existing library:
Load existing personas:
import json

# Load the existing library and collect the IDs that are already taken
with open('data/personas/library/test_batch_001_personas.json') as f:
    existing_personas = json.load(f)
existing_ids = {p['skeleton_id'] for p in existing_personas}
Generate new skeletons (avoiding duplicate IDs)
Submit Deep Research jobs for new skeletons only
Synthesize only new personas:
# Extract only new reports
python -c "import json; ..." # Filter to new reports
# Synthesize
python scripts/synthesize_backstories.py --reports /tmp/new_reports_only.json --output /tmp/new_personas
# Merge
python -c "import json; ..." # Combine existing + new
Validate merged library
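The filter and merge steps above can be sketched in Python. This is an illustrative sketch, not the elided one-liners: `select_new_reports` and `merge_personas` are hypothetical helper names, and it assumes both the reports file and the library file hold a list of objects that each carry a `skeleton_id` key.

```python
def select_new_reports(reports, existing_ids):
    """Keep only reports whose skeleton has not been synthesized yet."""
    return [r for r in reports if r["skeleton_id"] not in existing_ids]


def merge_personas(existing, new):
    """Combine two persona lists, refusing to overwrite an existing ID."""
    merged = {p["skeleton_id"]: p for p in existing}
    for persona in new:
        sid = persona["skeleton_id"]
        if sid in merged:
            raise ValueError(f"duplicate skeleton_id: {sid}")
        merged[sid] = persona
    return list(merged.values())
```

Raising on a duplicate ID, rather than silently replacing the entry, keeps an accidental re-synthesis from overwriting a persona that already passed validation.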
Deep Research API has rate limits (specific limits not publicly documented). Use these strategies:
Strategy 1: Wave Submission
# Submit in waves with delays
import time

wave_sizes = [6, 4, 2, 2, 1]  # Observed pattern (15 jobs in total)
for wave_size in wave_sizes:
    jobs = submit_batch(wave_size)
    time.sleep(300)  # 5 min between waves
Strategy 2: Client-Side Tracking
import time

from scripts.quota_tracker import QuotaTracker

tracker = QuotaTracker(rpm_limit=10, rpd_limit=100)
if tracker.can_make_request():
    submit_job()
    tracker.record_request()
else:
    # Wait until the tracker reports free quota
    wait_time = tracker.time_until_available()
    time.sleep(wait_time)
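For readers without access to scripts/quota_tracker.py, a client-side tracker with the same three methods could look roughly like the following. This is a sketch under assumptions, not the project's implementation: the class name `SlidingWindowQuota`, the sliding-window approach, and the injectable `clock` parameter are all choices made here for illustration.

```python
import time
from collections import deque


class SlidingWindowQuota:
    """Track requests against per-minute and per-day limits (hypothetical sketch)."""

    def __init__(self, rpm_limit, rpd_limit, clock=time.monotonic):
        self.rpm_limit = rpm_limit
        self.rpd_limit = rpd_limit
        self._clock = clock
        self._stamps = deque()  # timestamps of requests from the last 24 hours

    def _prune(self, now):
        # Drop requests that have aged out of the 24-hour window
        while self._stamps and now - self._stamps[0] >= 86400:
            self._stamps.popleft()

    def _last_minute(self, now):
        return [t for t in self._stamps if now - t < 60]

    def can_make_request(self):
        now = self._clock()
        self._prune(now)
        under_day = len(self._stamps) < self.rpd_limit
        under_minute = len(self._last_minute(now)) < self.rpm_limit
        return under_day and under_minute

    def record_request(self):
        self._stamps.append(self._clock())

    def time_until_available(self):
        """Seconds until both windows have room for one more request."""
        now = self._clock()
        self._prune(now)
        waits = [0.0]
        recent = self._last_minute(now)
        if len(recent) >= self.rpm_limit:
            # This request must leave the minute window before a new one fits
            waits.append(recent[-self.rpm_limit] + 60 - now)
        if len(self._stamps) >= self.rpd_limit:
            waits.append(self._stamps[-self.rpd_limit] + 86400 - now)
        return max(waits)
```

Passing a fake `clock` makes the window logic testable without sleeping. Because the real limits are undocumented, the values given for `rpm_limit` and `rpd_limit` are guesses to tune against observed quota errors.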
Strategy 3: Exponential Backoff
import time

for attempt in range(max_retries):
    try:
        submit_job()
        break
    except QuotaError:
        # Double the wait on each attempt, starting at 60 s and capped at 1 hour
        wait_time = min(2 ** attempt * 60, 3600)
        time.sleep(wait_time)
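To see how long a run might stall under Strategy 3, the wait schedule can be computed ahead of time. The helper name `backoff_delays` is hypothetical; the formula is the one used in the loop above.

```python
def backoff_delays(max_retries, base=60, cap=3600):
    """Return the wait (in seconds) applied after each failed attempt."""
    return [min(2 ** attempt * base, cap) for attempt in range(max_retries)]


delays = backoff_delays(8)
print(delays)       # [60, 120, 240, 480, 960, 1920, 3600, 3600]
print(sum(delays))  # 10980 seconds, about 3 hours in the worst case
```

The cap takes effect from the seventh attempt onward, since 2^6 × 60 = 3840 seconds already exceeds one hour.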
Ensure personas use in-universe brands (not real-world brands):
Approved In-Universe Brands:
Forbidden Real-World Brands:
Validation automatically detects forbidden brands using word boundary regex.
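Word-boundary matching can be sketched as follows. This is not the code in validate_personas.py: the function names are hypothetical, the brand names in the usage example are made-up placeholders rather than entries from the real forbidden list, and case-insensitive matching is an assumption.

```python
import re


def build_brand_pattern(brands):
    """Compile one regex that matches any brand as a whole word or phrase."""
    if not brands:
        raise ValueError("brand list must not be empty")
    # Longest names first so multi-word brands win over their prefixes
    ordered = sorted(brands, key=len, reverse=True)
    alternatives = "|".join(re.escape(b) for b in ordered)
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


def find_brand_leaks(text, pattern):
    """Return the distinct forbidden brands found in text, lowercased and sorted."""
    return sorted({m.group(0).lower() for m in pattern.finditer(text)})


# Placeholder names for illustration only
pattern = build_brand_pattern(["Acme", "Globex Cola"])
print(find_brand_leaks("I drank a Globex Cola at the Acme store.", pattern))
print(find_brand_leaks("Her acmeist poetry phase was brief.", pattern))
```

The `\b` anchors are what keep "acmeist" from counting as a leak of "Acme"; a plain substring search would flag it.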
Per Persona:
Batch Estimates:
Cost Optimization:
13 predefined archetypes with distinct gameplay tendencies:
Each archetype has:
Personas must pass ALL validation checks:
Fail Fast: If validation fails, the pipeline stops immediately. No random fallback.
Each persona card contains:
{
"skeleton_id": "skeleton_010",
"archetype": "charismatic_leader",
"archetype_name": "The Charismatic Leader",
"name": "Gemma Ashworth-Clarke",
"demographics": {
"age": 31,
"occupation": "motivational speaker",
"location": "London",
"socioeconomic": "upper-middle",
"gender": "female"
},
"personality": {
"openness": 0.72,
"conscientiousness": 0.78,
"extraversion": 0.91,
"agreeableness": 0.65,
"neuroticism": 0.54
},
"stats": {
"intellect": 0.75,
"dexterity": 0.60,
"social_influence": 0.88
},
"backstory": "First-person narrative (200-300 words)...",
"key_relationships": ["..."],
"formative_challenge": "...",
"political_beliefs": "...",
"hobbies": ["...", "...", "..."],
"strategic_approach": "..."
}
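A structural check on a persona card like the one above might look like this. It is a sketch only: `check_persona` is a hypothetical helper, the [0, 1] range for traits and stats is inferred from the sample values rather than stated by the source, and the 200-300 word bound comes from the `backstory` placeholder text.

```python
OCEAN_TRAITS = ("openness", "conscientiousness", "extraversion", "agreeableness", "neuroticism")
STATS = ("intellect", "dexterity", "social_influence")


def check_persona(card):
    """Return a list of problems found in one persona card (empty if none)."""
    problems = []
    for section, keys in (("personality", OCEAN_TRAITS), ("stats", STATS)):
        values = card.get(section, {})
        for key in keys:
            value = values.get(key)
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if not is_number or not 0.0 <= value <= 1.0:
                problems.append(f"{section}.{key} must be a number in [0, 1], got {value!r}")
    words = len(card.get("backstory", "").split())
    if not 200 <= words <= 300:
        problems.append(f"backstory has {words} words, expected 200-300")
    return problems
```

Returning every problem at once, instead of stopping at the first, makes a failed card easier to repair in one pass; the caller can still fail fast by raising when the list is non-empty.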
Issue: Quota errors during Deep Research submission
Issue: Jobs stuck in "processing" status
Issue: Brand leakage detected
src/traitorsim/utils/world_flavor.py with new forbidden brands
Issue: Backstory too short/long
Issue: Re-synthesizing existing personas (wasting cost)
See docs/PERSONA_GENERATION_LESSONS.md for:
Test the pipeline with a small batch first:
# Generate 2 test personas
python scripts/generate_skeleton_personas.py --count 2 --output data/personas/skeletons/test_mini.json
python scripts/batch_deep_research.py --input data/personas/skeletons/test_mini.json --output data/personas/jobs/test_mini_jobs.json
# Monitor completion
python scripts/poll_research_jobs.py --jobs data/personas/jobs/test_mini_jobs.json --output data/personas/reports/test_mini_reports.json
# Synthesize
python scripts/synthesize_backstories.py --reports data/personas/reports/test_mini_reports.json --output data/personas/library/test_mini_personas.json --model claude-opus-4-5-20251101
# Validate
python scripts/validate_personas.py --library data/personas/library/test_mini_personas.json
Expected output: 100% validation pass, 0 brand leaks, ~$0.90 cost for 2 personas.
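The test-batch figure gives a rough per-persona rate for budgeting larger runs: $0.90 / 2 = $0.45 per persona, so 15 personas come to 15 × $0.45 = $6.75, in line with the $6-7 estimate for a 15-persona batch. A small sketch of that arithmetic, assuming cost scales linearly with batch size (`estimate_cost` is a hypothetical helper):

```python
def estimate_cost(count, per_persona=0.90 / 2):
    """Estimate batch cost in dollars, assuming a flat per-persona rate."""
    return round(count * per_persona, 2)


for size in (2, 15, 100):
    print(f"{size} personas: ~${estimate_cost(size):.2f}")
```

Actual spend will vary with report length and retries, so treat the result as an order-of-magnitude guide.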
Environment Variables:
export GEMINI_API_KEY="..." # For Deep Research
export CLAUDE_CODE_OAUTH_TOKEN="..." # For Claude synthesis
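A pre-flight check for these two variables can save a failed run partway through the pipeline. The sketch below is illustrative; `missing_env` is a hypothetical helper, not part of the project's scripts.

```python
import os

REQUIRED_ENV = ("GEMINI_API_KEY", "CLAUDE_CODE_OAUTH_TOKEN")


def missing_env(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV if not env.get(name)]


missing = missing_env()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```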
Python Dependencies:
pip install google-genai claude-agent-sdk
File Structure:
data/personas/
├── skeletons/ # Stage 1 output
├── jobs/ # Stage 2 output (job IDs)
├── reports/ # Stage 3 output (research reports)
└── library/ # Stage 4 output (final personas)
scripts/
├── generate_skeleton_personas.py
├── batch_deep_research.py
├── poll_research_jobs.py
├── synthesize_backstories.py
└── validate_personas.py
A successful persona generation run should achieve:
Use this skill when:
Don't use this skill for: