    spacholski1225/characterization-testing
    Create tests that describe what legacy code actually does (not what it should do) as a safety net before refactoring

    Characterization Testing

    Overview

    Characterization tests capture current behavior of legacy code, warts and all. They're a safety net before refactoring, not a specification of correctness.

    Core principle: Document what IS, not what SHOULD BE. Fix behavior later, after the safety net exists.

    This is NOT unit testing. Unit tests specify desired behavior. Characterization tests document actual behavior.

    When to Use

    Use characterization testing when:

    • Legacy code has no automated tests
    • It is unclear what the code is supposed to do
    • You are about to refactor a risky or critical area
    • Documentation doesn't match reality
    • You need a safety net but don't yet understand every edge case

    Don't use when:

    • Code already has comprehensive tests
    • You're implementing new features (use TDD instead)
    • Code is so simple that testing is unnecessary

    The Iron Law

    NO REFACTORING WITHOUT CHARACTERIZATION TESTS FIRST
    

    Refactoring without tests = gambling with production. Always create the safety net first.

    The Process

    Step 1: Identify Target

    Choose the smallest meaningful unit to characterize:

    • Single function/method (best starting point)
    • Single class (if functions are tightly coupled)
    • Module (if class boundaries unclear)

    Start small. You can always expand coverage later.

    Step 2: Write Failing Test

    Write a test whose expectation is still unknown:

    test('processes user data', () => {
      const result = processUserData({ name: 'John', age: 30 });
      expect(result).toEqual(/* ??? what does it return? */);
    });
    

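    Don't guess. Leave the expectation blank or use a placeholder. With the argument left out, toEqual() compares against undefined, so the test fails for any other return value.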

    Step 3: Run and Capture

    Run the test. It will fail. Copy the actual output exactly:

    $ npm test
    FAIL: expected ???, received { fullName: 'John', isAdult: true, category: 'standard' }
    

    This is the characterization: what the code actually does right now.

    Step 4: Lock In Behavior

    Update the test with the actual output:

    test('processes user data', () => {
      const result = processUserData({ name: 'John', age: 30 });
      expect(result).toEqual({
        fullName: 'John',
        isAdult: true,
        category: 'standard'
      });
    });
    

    Run test again → should pass. You've characterized the behavior.

    Step 5: Add Edge Cases

    Find weird inputs and capture outputs:

    test('handles missing age', () => {
      const result = processUserData({ name: 'John' });
      // Run test, see what happens, lock it in
      expect(result).toEqual({
        fullName: 'John',
        isAdult: false,
        category: 'unknown'
      });
    });
    
    test('handles negative age (current behavior - BUG)', () => {
      const result = processUserData({ name: 'John', age: -5 });
      // This is wrong but it's what code does now
      expect(result).toEqual({
        fullName: 'John',
        isAdult: true,  // BUG: negative age treated as adult!
        category: 'standard'
      });
    });
    
    test('handles empty name', () => {
      const result = processUserData({ name: '', age: 30 });
      expect(result).toEqual({
        fullName: '',
        isAdult: true,
        category: 'standard'
      });
    });
    
    test('handles null input', () => {
      // Might throw error, might return null - capture what happens.
      // Match on the error type: the message text differs between engine versions.
      expect(() => processUserData(null)).toThrow(TypeError);
    });
    

    Key insight: You're documenting bugs, not fixing them. Tests show what code does, including incorrect behavior.
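When there are many odd inputs, the capture step can be run as a sweep. This is a sketch under stated assumptions: `processUserData` is again a hypothetical stand-in that mimics the outputs in this guide, and `observe` is an illustrative helper, not part of any library. It records either the return value or the error type for each input, so the results can be pasted into individual tests.

```javascript
// Hypothetical stand-in for the legacy function (an assumption, not the real code).
function processUserData(user) {
  const fullName = user.name;
  if (user.age === undefined) {
    return { fullName, isAdult: false, category: 'unknown' };
  }
  const isAdult = !(user.age >= 0 && user.age < 18);
  return { fullName, isAdult, category: 'standard' };
}

// Records the return value, or the error type if the call throws.
function observe(fn, input) {
  try {
    return { input, returned: fn(input) };
  } catch (err) {
    return { input, threw: err.constructor.name };
  }
}

const weirdInputs = [
  { name: 'John' },            // missing age
  { name: 'John', age: -5 },   // negative age
  { name: '', age: 30 },       // empty name
  null,                        // no input at all
];

const observations = weirdInputs.map((input) => observe(processUserData, input));

// One line per input: copy each observation into its own locked-in test.
for (const observation of observations) {
  console.log(JSON.stringify(observation));
}
```

Each observation still deserves its own named test; the sweep only speeds up the capture step.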

    Step 6: Document Known Issues

    Mark tests for known bugs:

    test.skip('FIXME: should reject negative age', () => {
      // This is what SHOULD happen (not what happens now)
      expect(() => processUserData({ name: 'John', age: -5 }))
        .toThrow('Invalid age: must be non-negative');
    });
    
    test('handles negative age (CURRENT BEHAVIOR - BUG)', () => {
      // This is what ACTUALLY happens now
      const result = processUserData({ name: 'John', age: -5 });
      expect(result.isAdult).toBe(true); // Wrong! But it's current behavior
    });
    

    Why both tests?

    • .skip test shows desired behavior (for future)
    • Active test locks in current behavior (prevents regressions during refactoring)

    Step 7: Verify Coverage

    Ensure the main execution paths are covered:

    • Happy path (valid inputs)
    • Edge cases (empty, null, undefined, zero, negative)
    • Boundary values (max/min for your domain)
    • Error cases (invalid inputs, external failures)

    The goal is not 100% code coverage. Focus on behavior coverage: the scenarios that matter.
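Boundary values are easiest to cover by probing just below, at, and just above each threshold. The sketch below assumes a hypothetical `legacyIsAdult` check, written to mirror the negative-age bug described earlier; `aroundBoundary` is an illustrative helper name.

```javascript
// Hypothetical legacy check (an assumption), mirroring the negative-age bug.
function legacyIsAdult(age) {
  return !(age >= 0 && age < 18);
}

// Values just below, at, and just above a threshold.
function aroundBoundary(threshold) {
  return [threshold - 1, threshold, threshold + 1];
}

// Probe both thresholds the domain suggests: zero and the adult cutoff.
const probes = [...aroundBoundary(0), ...aroundBoundary(18)];
const observed = probes.map((age) => [age, legacyIsAdult(age)]);

console.log(observed);
// [ [-1, true], [0, false], [1, false], [17, false], [18, true], [19, true] ]
```

The probe at -1 surfaces the bug without anyone having to think of it in advance, which is the point of boundary coverage.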

    Checklist

    • Identified smallest testable unit
    • Wrote test with unknown expectation (???)
    • Ran test and captured actual output
    • Locked in current behavior (test passes)
    • Added edge cases (empty, null, invalid, boundary values)
    • Documented known bugs with comments
    • Created .skip tests for desired behavior (future fixes)
    • All tests pass (green for current behavior)
    • Tests cover main execution paths

    Example: Full Workflow

    Legacy code we need to refactor:

    function calculateDiscount(user, cart) {
      let total = 0;
      for (let i = 0; i < cart.items.length; i++) {
        total += cart.items[i].price * cart.items[i].quantity;
      }
    
      if (user.isPremium) {
        total = total * 0.9;
      }
    
      if (cart.items.length > 5) {
        total = total * 0.95;
      }
    
      return Math.round(total * 100) / 100;
    }
    

    Characterization tests:

    describe('calculateDiscount - characterization', () => {
      test('standard user, small cart', () => {
        const user = { isPremium: false };
        const cart = {
          items: [
            { price: 10, quantity: 2 },
            { price: 5, quantity: 1 }
          ]
        };
    
        const result = calculateDiscount(user, cart);
        expect(result).toBe(25); // 10*2 + 5*1 = 25
      });
    
      test('premium user gets 10% discount', () => {
        const user = { isPremium: true };
        const cart = { items: [{ price: 100, quantity: 1 }] };
    
        const result = calculateDiscount(user, cart);
        expect(result).toBe(90); // 100 * 0.9 = 90
      });
    
      test('more than 5 items gets additional 5% discount', () => {
        const user = { isPremium: false };
        const cart = {
          items: Array(6).fill({ price: 10, quantity: 1 })
        };
    
        const result = calculateDiscount(user, cart);
        expect(result).toBe(57); // 60 * 0.95 = 57
      });
    
      test('premium + bulk discounts stack (CURRENT BEHAVIOR)', () => {
        const user = { isPremium: true };
        const cart = {
          items: Array(6).fill({ price: 10, quantity: 1 })
        };
    
        const result = calculateDiscount(user, cart);
        expect(result).toBe(51.3); // 60 * 0.9 * 0.95 = 51.3
      });
    
      test('empty cart returns 0', () => {
        const user = { isPremium: false };
        const cart = { items: [] };
    
        const result = calculateDiscount(user, cart);
        expect(result).toBe(0);
      });
    
      test('missing isPremium field (CURRENT BEHAVIOR - BUG?)', () => {
        const user = {}; // no isPremium field
        const cart = { items: [{ price: 100, quantity: 1 }] };
    
        const result = calculateDiscount(user, cart);
        expect(result).toBe(100); // Falsy check treats missing as non-premium
      });
    
      test('null user throws error', () => {
        const cart = { items: [{ price: 100, quantity: 1 }] };
    
        // Assert the error type; the exact message varies across JavaScript engine versions
        expect(() => calculateDiscount(null, cart)).toThrow(TypeError);
      });
    });
    

    Now safe to refactor! If refactoring breaks these tests, you've changed behavior (maybe accidentally).

    Anti-Patterns

    ❌ Fixing Bugs While Characterizing

    Bad:

    test('negative price should be rejected', () => {
      expect(() => calculateDiscount(user, { items: [{ price: -10, quantity: 1 }] }))
        .toThrow('Invalid price');
    });
    

    This is what SHOULD happen, not what DOES happen. You're writing a specification, not a characterization.

    Good:

    test('negative price (CURRENT BEHAVIOR - BUG)', () => {
      const user = { isPremium: false };
      const cart = { items: [{ price: -10, quantity: 1 }] };
    
      const result = calculateDiscount(user, cart);
      expect(result).toBe(-10); // Bug: negative total! But this is current behavior
    });
    
    test.skip('FIXME: negative price should be rejected', () => {
      // This is desired future behavior
      const user = { isPremium: false };
      expect(() => calculateDiscount(user, { items: [{ price: -10, quantity: 1 }] }))
        .toThrow('Invalid price');
    });
    

    ❌ Refactoring Before Tests

    Bad:

    1. Look at legacy code
    2. "This is messy, let me clean it up"
    3. Refactor
    4. Add tests
    

    Good:

    1. Look at legacy code
    2. Add characterization tests
    3. Verify tests pass
    4. Refactor with confidence
    5. Tests still pass → safe refactoring
    

    ❌ Mocking Everything

    Bad:

    test('calls database with correct params', () => {
      const mockDB = jest.fn();
      const user = { id: 123 };
      processUserData(mockDB, user);
      expect(mockDB).toHaveBeenCalledWith('users', { id: 123 });
    });
    

    This tests interactions, not behavior. You don't know what the function returns.

    Good:

    test('processes user data from database', () => {
      // Use real database or test database
      const result = processUserData({ id: 123 });
      expect(result).toEqual({ name: 'John', email: 'john@example.com' });
    });
    

    Characterization tests should test real behavior with real dependencies when possible.

    ❌ Skipping "Embarrassing" Bugs

    Bad:

    // I found this bug but I'm not going to test it because it's embarrassing
    

    Good:

    test('allows XSS in user input (CURRENT BEHAVIOR - SECURITY BUG)', () => {
      const result = renderUserProfile({ name: '<script>alert("xss")</script>' });
      expect(result).toContain('<script>alert("xss")</script>');
      // Bug exists! But test documents it so we can fix it later
    });
    

    Document all bugs, especially security issues. Better to know than to be surprised.

    Common Rationalizations

    | Excuse | Reality |
    | --- | --- |
    | "Code is too complex to test" | Characterization tests don't need full understanding. Capture behavior empirically. |
    | "I'll refactor, then add tests" | Refactoring without tests = hoping you didn't break anything. Tests first. |
    | "Tests will take too long" | Hours of characterization vs days of debugging production. Tests are faster. |
    | "I know what the code should do" | Great! But what does it actually do? They might differ. |
    | "I'll just be careful" | You will miss edge cases. Tests catch what you forget. |
    | "Bugs are embarrassing to test" | Documented bugs can be fixed. Hidden bugs cause incidents. |

    After Characterization

    Now you have a safety net. Next steps:

    1. Refactor with confidence - Tests catch if you break something
    2. Fix bugs one at a time - Update characterization test to desired behavior
    3. Add unit tests - For new features, use TDD going forward
    4. Remove characterization tests - Once you have proper unit tests covering behavior

    Characterization tests are temporary. They're scaffolding for refactoring, not a permanent test suite.

    Integration with Other Skills

    • skills/analysis/code-archaeology - Understand code before characterizing
    • skills/refactoring/strangler-fig-pattern - Replace characterized code safely
    • skills/testing/test-driven-development - Add new features with TDD after characterization
    • skills/refactoring/seam-finding - Find boundaries for characterization
    • skills/safety/approval-testing - Alternative for complex outputs

    Remember

    • Characterization tests document what IS, not what SHOULD BE
    • Run test → capture output → lock it in
    • Document bugs, don't fix them (yet)
    • Tests are a safety net for refactoring
    • NO REFACTORING without characterization tests first
    • Characterization tests are temporary scaffolding
    Repository
    spacholski1225/cc-config