LLM Cost Optimization
Skill Profile
(Select at least one profile to enable specific modules)
Overview
Artificial Intelligence, specifically Large Language Models (LLMs), introduces a new variable-cost component to software engineering: inference tokens. Unlike traditional API costs, which are often fixed or volume-tiered, LLM costs scale linearly with usage, context length, and model complexity.
Core Principle: "Spend on intelligence where it matters, optimize where it doesn't."
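To make the linear scaling concrete, per-request cost can be sketched as token counts times per-token prices. The model names and per-1K-token prices below are illustrative assumptions, not real provider rates.

```python
# Sketch: estimate per-request LLM cost from token counts.
# Prices are illustrative placeholders (USD per 1K tokens), not real rates.
PRICES = {
    "small-model": {"input": 0.0005, "output": 0.0015},
    "large-model": {"input": 0.0100, "output": 0.0300},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost scales linearly with token counts, as described above."""
    p = PRICES[model]
    return (input_tokens / 1000) * p["input"] + (output_tokens / 1000) * p["output"]

# A 2,000-token prompt with a 500-token reply on the large model:
cost = estimate_cost("large-model", 2000, 500)
```

Tracking this estimate per request is the foundation for the attribution and budgeting outputs described later in this profile.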
Why This Matters
- Cost Visibility: Understand LLM cost drivers
- Optimization: Reduce token waste
- Performance: Balance cost and quality
- Architecture: Design cost-effective AI systems
Core Concepts & Rules
1. Core Principles
- Follow established patterns and conventions
- Maintain consistency across codebase
- Document decisions and trade-offs
2. Implementation Guidelines
- Start with the simplest viable solution
- Iterate based on feedback and requirements
- Test thoroughly before deployment
Inputs / Outputs / Contracts
- Inputs:
- LLM usage data
- Token counts and costs
- Prompt and response logs
- RAG system metrics
- Entry Conditions:
- LLM provider selected
- Usage tracking enabled
- Cost monitoring configured
- Outputs:
- Cost analysis and reports
- Optimization recommendations
- Budget alerts
- Attribution data
- Artifacts Required (Deliverables):
- Cost dashboards
- Optimization plans
- Budget controls
- Attribution reports
- Acceptance Evidence:
- LLM costs are tracked
- Optimization implemented
- Budgets are enforced
- Success Criteria:
- Cost savings > 30%
- Token usage reduced by > 20%
- Budget compliance > 95%
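The attribution data and budget alerts listed above can be sketched from per-request usage records. The record fields, team names, and the 95% alert threshold (mirroring the budget-compliance criterion) are illustrative assumptions.

```python
from dataclasses import dataclass
from collections import defaultdict

# Sketch: aggregate usage records into per-team cost attribution and
# flag teams near their budget. Field names are illustrative assumptions.
@dataclass
class UsageRecord:
    team: str
    model: str
    input_tokens: int
    output_tokens: int
    cost_usd: float

def attribute_costs(records):
    """Sum spend per team (the 'Attribution data' output above)."""
    totals = defaultdict(float)
    for r in records:
        totals[r.team] += r.cost_usd
    return dict(totals)

def budget_alerts(totals, budgets, threshold=0.95):
    """Alert once spend reaches `threshold` of a team's budget."""
    return [team for team, spent in totals.items()
            if spent >= threshold * budgets.get(team, float("inf"))]
```

In practice these records would come from the provider's usage API or a logging middleware, not be constructed by hand.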
Skill Composition
- Depends on: Cloud Cost Models, Cost Observability
- Compatible with: Budget Guardrails, Cost Modeling
- Conflicts with: Systems without LLM cost tracking
- Related Skills:
Quick Start / Implementation Example
- Review requirements and constraints
- Set up development environment
- Implement core functionality following patterns
- Write tests for critical paths
- Run tests and fix issues
- Document any deviations or decisions
# Example: trim chat history to a token budget (count_tokens is a stand-in)
def trim_to_budget(messages, max_tokens, count_tokens=len):
    kept, used = [], 0
    for msg in reversed(messages):  # walk newest to oldest
        used += count_tokens(msg)
        if used <= max_tokens:
            kept.append(msg)
    return list(reversed(kept))
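One optimization this process often yields is tiered model routing, applying the core principle of spending on intelligence only where it matters. The model names and the keyword heuristic below are illustrative assumptions, not a recommended production classifier.

```python
# Sketch: route requests to a cheap model by default, escalating only
# when a simple heuristic flags the task as complex. Names are illustrative.
CHEAP_MODEL = "small-model"
EXPENSIVE_MODEL = "large-model"

COMPLEX_HINTS = ("prove", "multi-step", "analyze", "refactor")

def choose_model(prompt: str) -> str:
    """Spend on intelligence where it matters, optimize where it doesn't."""
    if any(hint in prompt.lower() for hint in COMPLEX_HINTS):
        return EXPENSIVE_MODEL
    return CHEAP_MODEL
```

A real router would use a classifier or confidence signal rather than keyword matching, but the cost structure is the same: most traffic lands on the cheap tier.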
Assumptions / Constraints / Non-goals
- Assumptions:
- Development environment is properly configured
- Required dependencies are available
- Team has basic understanding of domain
- Constraints:
- Must follow existing codebase conventions
- Time and resource limitations
- Compatibility requirements
- Non-goals:
- This skill does not cover edge cases outside scope
- Not a replacement for formal training
Compatibility & Prerequisites
- Supported Versions:
- Python 3.8+
- Node.js 16+
- Modern browsers (Chrome, Firefox, Safari, Edge)
- Required AI Tools:
- Code editor (VS Code recommended)
- Testing framework appropriate for language
- Version control (Git)
- Dependencies:
- Language-specific package manager
- Build tools
- Testing libraries
- Environment Setup:
.env.example keys: API_KEY, DATABASE_URL (no values)
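Under the convention above, a `.env.example` might look like this (keys only, values left blank):

```shell
# .env.example -- keys only, never commit real values
API_KEY=
DATABASE_URL=
```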
Test Scenario Matrix (QA Strategy)
| Type | Focus Area | Required Scenarios / Mocks |
| --- | --- | --- |
| Unit | Core Logic | Must cover primary logic and at least 3 edge/error cases; target minimum 80% coverage |
| Integration | DB / API | All external API calls or database connections must be mocked during unit tests |
| E2E | User Journey | Critical user flows to test |
| Performance | Latency / Load | Benchmark requirements |
| Security | Vuln / Auth | SAST/DAST or dependency audit |
| Frontend | UX / A11y | Accessibility checklist (WCAG), Performance Budget (Lighthouse score) |
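The Integration row requires mocking external calls. A minimal sketch with Python's `unittest.mock`, where the `client` object and its response shape are hypothetical rather than a real provider SDK:

```python
from unittest.mock import Mock

# Sketch: unit-testing cost logic with the LLM client mocked out, per the
# Integration row above. `client` and its response shape are hypothetical.
def completion_cost(client, prompt, in_price=0.01, out_price=0.03):
    resp = client.complete(prompt)  # external call -- mocked in tests
    return (resp["input_tokens"] / 1000) * in_price + \
           (resp["output_tokens"] / 1000) * out_price

def test_completion_cost_uses_reported_tokens():
    client = Mock()
    client.complete.return_value = {"input_tokens": 2000, "output_tokens": 500}
    assert abs(completion_cost(client, "hi") - 0.035) < 1e-9
    client.complete.assert_called_once_with("hi")
```

Mocking keeps unit tests free of network calls and, notably for this skill, free of real token spend.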
Technical Guardrails & Security Threat Model
1. Security & Privacy (Threat Model)
- Top Threats: injection attacks (including prompt injection), authentication bypass, data exposure
2. Performance & Resources
3. Architecture & Scalability
4. Observability & Reliability
Agent Directives & Error Recovery
(Requirements for how the AI agent should reason and recover when errors occur)
- Thinking Process: Analyze root cause before fixing. Do not brute-force.
- Fallback Strategy: Stop after 3 failed test attempts. Output root cause and ask for human intervention/clarification.
- Self-Review: Check against Guardrails & Anti-patterns before finalizing.
- Output Constraints: Output ONLY the modified code block. Do not explain unless asked.
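The fallback strategy above can be sketched as a bounded fix-and-test loop; `attempt_fix` and `run_tests` are hypothetical callables standing in for the agent's edit and test steps.

```python
# Sketch of the fallback strategy above: retry a fix-and-test loop at most
# 3 times, then stop and surface the last failure for human review.
# `attempt_fix` and `run_tests` are hypothetical callables.
def fix_with_fallback(attempt_fix, run_tests, max_attempts=3):
    last_error = None
    for _ in range(max_attempts):
        attempt_fix()
        ok, last_error = run_tests()
        if ok:
            return "fixed"
    return f"needs human intervention: {last_error}"
```

Bounding the loop prevents brute-force retry spirals, which waste both tokens and wall-clock time.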
Definition of Done (DoD) Checklist
- LLM costs are tracked and attributed per team
- Optimizations are implemented and measured against the success criteria above
- Budget controls and alerts are enforced
- Test scenario matrix is satisfied
Anti-patterns / Pitfalls
- ⛔ Don't: log PII, swallow errors with catch-all exceptions, issue N+1 queries
- ⚠️ Watch out for: unbounded context growth and retries that re-send full prompts, both of which silently inflate token spend
- 💡 Instead: use structured error handling, pagination, and redacted logging
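Prompt logs are a common PII leak in LLM systems. A minimal sketch that logs only metadata plus a redacted excerpt; the email regex is a simplistic illustration, not a complete PII filter.

```python
import re

# Sketch: log token/cost metadata rather than raw prompt text, and redact
# obvious PII (here just email addresses, as an illustration) from any
# short excerpt kept for debugging.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def loggable(prompt: str, max_chars: int = 40) -> dict:
    excerpt = EMAIL_RE.sub("[REDACTED]", prompt)[:max_chars]
    return {"chars": len(prompt), "excerpt": excerpt}
```

Production systems would typically layer a dedicated PII scrubber over this, but the principle stands: raw prompts never reach the logs.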
Reference Links & Examples
- Internal documentation and examples
- Official documentation and best practices
- Community resources and discussions
Versioning & Changelog
- Version: 1.0.0
- Changelog:
- 2026-02-22: Initial version with complete template structure