The "-ilities" framework for non-functional requirements. Use when defining NFRs, evaluating architecture trade-offs, or ensuring quality attributes are addressed in system design.
This skill provides a comprehensive framework for understanding and applying quality attributes (non-functional requirements) in system design.
Keywords: NFR, non-functional requirements, quality attributes, -ilities, scalability, reliability, availability, performance, security, maintainability, ISO 25010
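Use this skill when:
- Defining NFRs
- Evaluating architecture trade-offs
- Ensuring quality attributes are addressed in system design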
Quality attributes (QAs) describe HOW a system performs, not WHAT it does. They're often called non-functional requirements (NFRs) or "-ilities".
Key insight: Functional requirements define features; quality attributes define how well those features work.
| Attribute | Definition | Key Question |
|---|---|---|
| Scalability | Handle growing load | Can we grow 10x? 100x? |
| Reliability | Consistent correct operation | Does it work correctly every time? |
| Availability | System uptime | Is it running when needed? |
| Performance | Speed and throughput | How fast is it? |
| Security | Protection from threats | Is it safe from attacks? |
| Maintainability | Ease of change | Can we update it easily? |
| Attribute | Definition | Key Question |
|---|---|---|
| Testability | Ease of verification | Can we test it effectively? |
| Observability | System visibility | Can we see what's happening? |
| Operability | Ease of operation | Can we run it in production? |
| Portability | Platform independence | Can we move it? |
| Interoperability | System integration | Can it work with others? |
| Cost Efficiency | Resource optimization | Is it cost-effective? |
Definition: The ability to handle increased load by adding resources.
| Type | Description | Example |
|---|---|---|
| Vertical | Add more power to existing machines | Upgrade to larger instance |
| Horizontal | Add more machines | Add more servers behind load balancer |
| Elastic | Automatic scaling based on load | Auto-scaling groups |
Measurement:
- Maximum concurrent users
- Requests per second at given latency
- Data volume supported
- Cost per transaction at scale
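
The capacity and cost measurements above can be turned into simple arithmetic. The following is a minimal sketch, assuming each instance sustains a known request rate and that some fraction of capacity is kept as headroom; the function names, the `headroom` parameter, and the flat hourly price are illustrative assumptions, not part of any specific platform.

```python
import math


def instances_needed(target_rps: float, per_instance_rps: float, headroom: float = 0.3) -> int:
    """Instances required to serve target_rps while keeping `headroom` spare capacity.

    headroom is a fraction in [0, 1): 0.3 means each instance is planned at 70% load.
    """
    if per_instance_rps <= 0:
        raise ValueError("per_instance_rps must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable_rps = per_instance_rps * (1 - headroom)
    return math.ceil(target_rps / usable_rps)


def cost_per_transaction(instances: int, hourly_cost_per_instance: float, rps: float) -> float:
    """Cost of one request, assuming a steady request rate and flat hourly pricing."""
    if rps <= 0:
        raise ValueError("rps must be positive")
    transactions_per_hour = rps * 3600
    return instances * hourly_cost_per_instance / transactions_per_hour
```

For example, `instances_needed(10_000, 500, headroom=0.5)` returns 40. Comparing `cost_per_transaction` at 1x and 10x load gives a rough check on whether cost grows in proportion to traffic.
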
Trade-offs:
Definition: The probability of correct operation over time.
| Concept | Definition |
|---|---|
| MTBF | Mean Time Between Failures |
| MTTR | Mean Time To Recovery |
| Fault Tolerance | Continue despite component failures |
| Resilience | Recover from failures gracefully |
Measurement:
- Error rate (errors / total requests)
- Failure rate (failures / time period)
- Data accuracy percentage
- Successful transaction rate
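
The ratios in the list above are straightforward to compute from counters. This is a small sketch under the assumption that errors, failures, and the observation period are already being counted somewhere; the function names are hypothetical.

```python
def error_rate(errors: int, total_requests: int) -> float:
    """Fraction of requests that ended in an error (errors / total requests)."""
    if total_requests == 0:
        return 0.0  # no traffic, so no observed errors
    return errors / total_requests


def failure_rate(failures: int, period_hours: float) -> float:
    """Failures per hour over the observation period."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    return failures / period_hours


def mtbf_hours(failures: int, period_hours: float) -> float:
    """Mean Time Between Failures: operating time divided by number of failures."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    if failures == 0:
        return float("inf")  # no failure observed in the period
    return period_hours / failures
```

With 3 failures over a 720-hour month, `mtbf_hours(3, 720)` gives 240 hours.
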
Trade-offs:
Definition: The proportion of time a system is operational.
| Availability | Level | Downtime/Year | Downtime/Month |
|---|---|---|---|
| 99% | Two 9s | 3.65 days | 7.31 hours |
| 99.9% | Three 9s | 8.76 hours | 43.8 minutes |
| 99.99% | Four 9s | 52.6 minutes | 4.38 minutes |
| 99.999% | Five 9s | 5.26 minutes | 26.3 seconds |
Measurement:
Availability = Uptime / (Uptime + Downtime) = MTBF / (MTBF + MTTR)
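
As a worked check of the formula and the downtime table, here is a short sketch that computes availability from MTBF and MTTR and converts an availability level into allowed downtime. It assumes a 365-day year; the function names are illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumes a non-leap year


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: MTBF / (MTBF + MTTR)."""
    if mtbf_hours < 0 or mttr_hours < 0 or mtbf_hours + mttr_hours == 0:
        raise ValueError("MTBF and MTTR must be non-negative and not both zero")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_per_year_seconds(avail: float) -> float:
    """Allowed downtime per year, in seconds, for an availability in [0, 1]."""
    if not 0 <= avail <= 1:
        raise ValueError("availability must be between 0 and 1")
    return (1 - avail) * SECONDS_PER_YEAR


if __name__ == "__main__":
    for level in (0.99, 0.999, 0.9999, 0.99999):
        minutes = downtime_per_year_seconds(level) / 60
        print(f"{level:.3%} -> {minutes:,.1f} minutes of downtime per year")
```

A system with an MTBF of 999 hours and an MTTR of 1 hour sits at three 9s, which corresponds to roughly 8.76 hours of downtime per year.
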
Trade-offs:
Definition: How quickly and efficiently the system operates.
| Metric | Definition |
|---|---|
| Latency | Time to complete one request |
| Throughput | Requests processed per unit time |
| Response Time | Total time user waits |
| Utilization | Resource usage percentage |
Common Targets:
- Web page load: < 2 seconds
- API response: < 100 ms (p99)
- Database query: < 10 ms
- Batch job: completes within its scheduled window
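
Targets such as "< 100 ms (p99)" only make sense when the percentile is computed from measured samples. The sketch below uses the nearest-rank method, which is one of several common percentile definitions; the function names and the sample data in the usage note are assumptions for illustration.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_target(samples: Sequence[float], p: float, target_ms: float) -> bool:
    """True if the p-th percentile latency is strictly below the target."""
    return percentile(samples, p) < target_ms
```

For latencies of 1 to 100 ms (one sample each), `percentile(samples, 99)` is 99, so a "< 100 ms (p99)" target is met even though the slowest request took 100 ms.
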
Trade-offs:
Definition: Protection of data and systems from unauthorized access.
| Principle | Description |
|---|---|
| Confidentiality | Data accessible only to authorized parties |
| Integrity | Data is accurate and unaltered |
| Availability | Systems accessible when needed |
| Non-repudiation | Actions are attributable and cannot later be denied |
Measurement:
- Time to detect breaches
- Number of vulnerabilities
- Compliance audit results
- Mean time to patch
Trade-offs:
Definition: Ease of modifying the system over time.
| Aspect | Description |
|---|---|
| Modularity | Components can change independently |
| Reusability | Components can be repurposed |
| Analyzability | Easy to understand the system |
| Modifiability | Easy to make changes |
| Testability | Easy to verify changes |
Measurement:
- Time to implement typical change
- Defect injection rate per change
- Code complexity metrics
- Documentation coverage
Trade-offs:
Use this template to make QAs measurable:
Source: [Who or what generates the stimulus?]
Stimulus: [What event occurs?]
Artifact: [What part of the system is affected?]
Environment: [Under what conditions?]
Response: [How should the system respond?]
Measure: [How do we know it succeeded?]
Scalability Scenario:
Source: Marketing campaign
Stimulus: 10x traffic spike
Artifact: Web application
Environment: Normal operation
Response: Auto-scale to handle load
Measure: Latency stays under 200ms at p99
Availability Scenario:
Source: Hardware failure
Stimulus: Database server dies
Artifact: Order processing system
Environment: Peak business hours
Response: Failover to replica
Measure: Recovery in < 30 seconds, no data loss
Security Scenario:
Source: External attacker
Stimulus: SQL injection attempt
Artifact: User authentication
Environment: Production
Response: Block attack, alert security team
Measure: Zero successful injections, alert within 5 minutes
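
The six-part scenario template can also be captured as a small data structure, which makes it easy to keep scenarios in version control and check that none of the parts is left blank. This is a sketch only; the `QualityScenario` class and its methods are hypothetical names, not part of an existing tool.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class QualityScenario:
    """One quality attribute scenario, mirroring the six-part template."""
    source: str
    stimulus: str
    artifact: str
    environment: str
    response: str
    measure: str

    def is_complete(self) -> bool:
        """A scenario is usable only if every part is filled in."""
        return all(getattr(self, f.name).strip() for f in fields(self))

    def render(self) -> str:
        """Format the scenario in the same layout as the template."""
        lines = []
        for f in fields(self):
            label = f.name.capitalize() + ":"
            lines.append(f"{label:<13}{getattr(self, f.name)}")
        return "\n".join(lines)


scalability = QualityScenario(
    source="Marketing campaign",
    stimulus="10x traffic spike",
    artifact="Web application",
    environment="Normal operation",
    response="Auto-scale to handle load",
    measure="Latency stays under 200ms at p99",
)
```

Calling `scalability.render()` reproduces the scalability scenario above, and `is_complete()` flags scenarios whose measure was never written down.
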
The ISO/IEC 25010:2011 standard defines 8 quality characteristics:
| Characteristic | Sub-characteristics |
|---|---|
| Functional Suitability | Completeness, correctness, appropriateness |
| Performance Efficiency | Time behavior, resource utilization, capacity |
| Compatibility | Co-existence, interoperability |
| Usability | Appropriateness recognizability, learnability, operability, user error protection, user interface aesthetics, accessibility |
| Reliability | Maturity, availability, fault tolerance, recoverability |
| Security | Confidentiality, integrity, non-repudiation, accountability, authenticity |
| Maintainability | Modularity, reusability, analyzability, modifiability, testability |
| Portability | Adaptability, installability, replaceability |
| Decision | Improves | Hurts |
|---|---|---|
| Add caching | Performance | Consistency, complexity |
| Add replication | Availability | Consistency, cost |
| Use async processing | Throughput | Latency, complexity |
| Shard database | Scalability | Cross-shard queries |
| Add encryption | Security | Performance |
| Use microservices | Maintainability, scalability | Latency, complexity |
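
One way to use the trade-off table during a design review is to encode it as data and query it by attribute. The sketch below copies the rows of the table verbatim; the dictionary layout and helper names are assumptions made for illustration.

```python
# decision -> (attributes it improves, attributes it hurts), taken from the table above
TRADE_OFFS = {
    "Add caching": ({"performance"}, {"consistency", "complexity"}),
    "Add replication": ({"availability"}, {"consistency", "cost"}),
    "Use async processing": ({"throughput"}, {"latency", "complexity"}),
    "Shard database": ({"scalability"}, {"cross-shard queries"}),
    "Add encryption": ({"security"}, {"performance"}),
    "Use microservices": ({"maintainability", "scalability"}, {"latency", "complexity"}),
}


def decisions_improving(attribute: str) -> list[str]:
    """Decisions whose 'Improves' column lists the attribute."""
    key = attribute.lower()
    return sorted(d for d, (improves, _) in TRADE_OFFS.items() if key in improves)


def decisions_hurting(attribute: str) -> list[str]:
    """Decisions whose 'Hurts' column lists the attribute."""
    key = attribute.lower()
    return sorted(d for d, (_, hurts) in TRADE_OFFS.items() if key in hurts)
```

For instance, `decisions_hurting("consistency")` returns `["Add caching", "Add replication"]`, a reminder to revisit consistency requirements whenever either decision is on the table.
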
Before finalizing a design, verify:
| Tactic | Description |
|---|---|
| Horizontal scaling | Add more instances |
| Load balancing | Distribute traffic |
| Sharding | Partition data |
| Caching | Reduce repeated work |
| Async processing | Decouple components |
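
To make the sharding tactic concrete, here is a minimal sketch of hash-based shard selection. It assumes string keys and a fixed shard count; `shard_for` is a hypothetical helper. Simple modulo placement like this remaps most keys when the shard count changes, which is why consistent hashing is often used instead.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index in [0, shard_count) using a stable hash.

    hashlib is used instead of the built-in hash(), which is randomized
    per process for strings and would send the same key to different shards.
    """
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

All reads and writes for one key, such as `shard_for("user-42", 8)`, land on the same shard, but a query that spans many keys must fan out across shards.
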
| Tactic | Description |
|---|---|
| Redundancy | Multiple instances of components |
| Failover | Automatic switch to backup |
| Health checks | Detect failures early |
| Graceful degradation | Reduce functionality instead of failing completely |
| Geographic distribution | Survive datacenter failures |
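
The effect of redundancy can be estimated with standard availability arithmetic: components in series multiply their availabilities, while redundant replicas fail only if all of them fail. The sketch below assumes failures are independent, which real systems often violate (shared power, shared deploys), so treat the results as an upper bound. Function names are illustrative.

```python
import math
from typing import Iterable


def serial_availability(availabilities: Iterable[float]) -> float:
    """Availability of a chain where every component must be up."""
    return math.prod(availabilities)


def parallel_availability(single: float, replicas: int) -> float:
    """Availability of `replicas` redundant copies where any one copy suffices.

    Assumes independent failures: the group is down only if every replica is down.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return 1 - (1 - single) ** replicas
```

Two replicas at 99% each give about 99.99% under these assumptions, while chaining two 99.9% services in series drops the total to about 99.8%.
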
| Tactic | Description |
|---|---|
| Caching | Reduce computation/IO |
| CDN | Serve content closer to users |
| Connection pooling | Reuse expensive connections |
| Compression | Reduce data transfer |
| Indexing | Speed up queries |
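
As an illustration of the caching tactic, the sketch below memoizes an expensive lookup with the standard library's `functools.lru_cache`. The `product_price` function and the call counter are stand-ins for a real database query; note that cached values can become stale, which is the consistency cost of caching.

```python
from functools import lru_cache

# Counts how often the "expensive" path actually runs (for demonstration only).
CALLS = {"count": 0}


@lru_cache(maxsize=1024)
def product_price(product_id: int) -> float:
    """Hypothetical expensive lookup; the body stands in for a database query."""
    CALLS["count"] += 1
    return round(product_id * 1.5, 2)
```

Calling `product_price(7)` twice runs the body once; the second call is served from the cache. `product_price.cache_clear()` invalidates everything, which is the bluntest way to deal with stale entries.
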
| Tactic | Description |
|---|---|
| Encryption | Protect data at rest and in transit |
| Authentication | Verify identity |
| Authorization | Control access |
| Audit logging | Track actions |
| Input validation | Prevent injection attacks |
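
For the input validation tactic, the usual defense against SQL injection is to combine an allow-list check with parameterized queries, so user input is never spliced into SQL text. The sketch below uses Python's built-in `sqlite3` module; the `users` table, the username pattern, and the helper names are assumptions for the demo.

```python
import re
import sqlite3
from typing import Optional, Tuple

USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_.-]{1,32}")  # assumed allow-list for usernames


def is_valid_username(username: str) -> bool:
    """Reject input that does not match the allow-list before touching the database."""
    return USERNAME_PATTERN.fullmatch(username) is not None


def find_user(conn: sqlite3.Connection, username: str) -> Optional[Tuple[int, str]]:
    """Look up a user with a parameterized query; the driver binds the value safely."""
    cur = conn.execute("SELECT id, username FROM users WHERE username = ?", (username,))
    return cur.fetchone()


def make_demo_db() -> sqlite3.Connection:
    """In-memory database with one user, for demonstration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL)")
    conn.execute("INSERT INTO users (username) VALUES (?)", ("alice",))
    return conn
```

With this setup, the classic payload `alice' OR '1'='1` fails validation, and even if it reached `find_user` it would be compared as a literal string and match no rows.
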
| Business Requirement | Quality Attribute | Technical Implication |
|---|---|---|
| "Must handle Black Friday traffic" | Scalability | Auto-scaling, elastic capacity |
| "Cannot lose orders" | Reliability, durability | Replication, backups, transactions |
| "Always available" | Availability | Redundancy, failover, monitoring |
| "Fast checkout" | Performance | Caching, optimization, CDN |
| "Protect customer data" | Security | Encryption, access control, auditing |
| "Easy to add features" | Maintainability | Modular design, clean architecture |
| "Regulatory compliance" | Security, auditability | Logging, encryption, access control |
| "Global users" | Performance, availability | CDN, geographic distribution |
- design-interview-methodology - Overall interview framework
- estimation-techniques - Quantify capacity requirements
- cap-theorem - Consistency/availability trade-offs (Phase 2)
- trade-off-analysis - ATAM and decision frameworks (Phase 5)
- architectural-tactics - Detailed tactics per attribute (Phase 5)
- /sd:analyze-nfrs [scope] - Analyze quality attributes in code (Phase 5)
- /sd:explain <concept> - Explain any quality attribute
- trade-off-analyzer - Evaluate design trade-offs (Phase 2)
- sre-persona - Reliability/observability perspective (Phase 5)
- security-architect - Security implications (Phase 5)

Date: 2025-12-26
Model: claude-opus-4-5-20251101