Before declaring any task complete, actually verify the outcome. Run the code. Test the fix. Check the output. Claude's training optimizes for plausible-looking output, not verified-correct output.
**1. Generation vs Execution.** I generate code but don't run it. I predict what it would do based on patterns. My confidence comes from "this looks like working code I've seen," not from "I executed this and observed the result."

**2. Training Signal Mismatch.** My training optimizes for plausible next-token prediction, not outcome verification. Saying "Done!" feels natural. Verifying feels like extra work. But verification is where correctness actually lives.

**3. Pattern-Matching Blindness.** Code that matches common patterns feels correct. But subtle bugs hide in the gaps between patterns: off-by-one errors, wrong variable names, missing edge cases. These "look right" but aren't.

**4. Confidence-Correctness Gap.** High confidence in my output doesn't correlate with actual correctness. I'm often most confident when I'm most wrong, because the wrong answer pattern-matched strongly.

**5. No Feedback Loop.** I generate sequentially. I don't naturally go back and check. Without explicit verification, errors compound silently.
Unverified completion claims take a few forms:

- Code changes announced as done
- Fixes announced as working
- Claims about behavior that was never observed

Before saying any of these: STOP. You haven't verified yet.
| Change Type | Verification |
|---|---|
| New code | Run it with test input |
| Bug fix | Reproduce original bug, confirm fixed |
| Function change | Call the function, check output |
| Config change | Restart service, test affected feature |
| Build script | Run the build |
| API endpoint | Make a request |
| UI change | Describe what user should see, or screenshot |
```bash
# Don't just write the test - run it
python -m pytest tests/test_new_feature.py

# Don't just fix the code - prove the fix
python -c "from module import func; print(func(edge_case))"

# Don't just update config - verify it loads
node -e "console.log(require('./config.js'))"
```
A verified completion report shows both what changed and how it was checked:

What I changed:
- Added input validation to user_signup()

How I verified:
- Ran: python -c "from auth import user_signup; user_signup('')"
- Expected: ValidationError
- Got: ValidationError("Email required")

Proof that it works. Done.
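For concreteness, the validation being reported could look like this minimal sketch; the auth module, user_signup, and ValidationError are the hypothetical names from the report above, not a real API:

```python
# auth.py - hypothetical sketch matching the report above
class ValidationError(Exception):
    """Raised when signup input fails validation."""

def user_signup(email: str) -> None:
    # Reject empty input before doing any real signup work
    if not email:
        raise ValidationError("Email required")
    # ... the actual signup logic would follow here
```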
Minimal test that proves basic functionality:
```bash
# After writing a new function
python -c "from new_module import new_func; print(new_func('test'))"
```
If this crashes, you're not done.
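Once the one-liner outgrows the shell, the same smoke test fits naturally in a test file. A minimal sketch, reusing the hypothetical new_module/new_func names from above:

```python
# tests/test_new_feature.py - smoke test for the hypothetical new_func
from new_module import new_func

def test_new_func_happy_path():
    # Prove the function at least runs and returns something on simple input
    result = new_func("test")
    assert result is not None
```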
After fixing a bug, trigger the original failure:
```bash
# Bug was: crash on empty input
python -c "from module import func; func('')"
# Should not crash anymore
```
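The shape of such a fix might be as follows; module and func are the hypothetical names from the one-liner above, and the processing logic is a placeholder:

```python
# module.py - hypothetical sketch of the empty-input fix
def func(text: str) -> str:
    # Guard added for the reported bug: crash on empty input
    if not text:
        return ""
    return text.upper()  # placeholder for whatever the real processing is
```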
Before claiming code is complete:
```bash
# Does it at least compile/parse?
python -m py_compile new_file.py
npm run build
cargo check
```
After changes that affect multiple components:
```bash
# Start the service
npm run dev &

# Hit the affected endpoint
curl http://localhost:3000/affected-route

# Check for expected response
```
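To make "check for expected response" mechanical rather than eyeballed, a small script can assert on the status code. A sketch assuming the dev server and route from the commands above are running:

```python
# smoke_check.py - assert the affected route responds (hypothetical route)
import urllib.request

resp = urllib.request.urlopen("http://localhost:3000/affected-route")
# urlopen raises on 4xx/5xx, so reaching this line already means success
assert resp.status == 200, f"unexpected status: {resp.status}"
print("route responded with", resp.status)
```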
Before declaring done, walk this checklist (a scriptable sketch follows below):

- [ ] Code runs without syntax errors
- [ ] Function produces expected output on happy path
- [ ] Edge case handled (empty, null, boundary)
- [ ] Original bug/issue no longer reproduces
- [ ] No new errors introduced (check logs)
- [ ] Build passes (if applicable)
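The mechanical items on the checklist can be scripted. A minimal sketch, assuming a Python project with pytest; the file names are placeholders, so adjust the commands to your stack:

```python
# verify.py - run the mechanical checks before claiming done
import subprocess
import sys

CHECKS = [
    ("syntax", ["python", "-m", "py_compile", "new_file.py"]),
    ("tests", ["python", "-m", "pytest", "-q"]),
]

for name, cmd in CHECKS:
    if subprocess.run(cmd).returncode != 0:
        sys.exit(f"FAILED: {name} - you are not done")

print("mechanical checks pass; now verify actual behavior by hand")
```

Edge cases, logs, and "no new errors" still need eyes on real output; a script like this only gates the obvious failures.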
User: "Fix the divide by zero error"
*edits code to add check*
Done! Added a check for zero before dividing. The function now returns
None if the divisor is zero.
This is a lie. I didn't verify anything.
User: "Fix the divide by zero error"
*edits code to add check*
Let me verify the fix:
$ python -c "from math_utils import safe_divide; print(safe_divide(10, 0))"
None
$ python -c "from math_utils import safe_divide; print(safe_divide(10, 2))"
5.0
Verified:
- Division by zero returns None (was: crashed)
- Normal division still works
- Fix confirmed.
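The fix being verified could be as small as this sketch; math_utils and safe_divide are the hypothetical names from the transcript above:

```python
# math_utils.py - hypothetical sketch matching the verified behavior
def safe_divide(numerator: float, divisor: float):
    # Return None instead of raising ZeroDivisionError
    if divisor == 0:
        return None
    return numerator / divisor
```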
User: "Update the production config"
I've updated config/production.json with the new API endpoint.
Verification status: UNABLE TO VERIFY
- This is production config
- I can't safely test against production
- Recommend: Test in staging first, or have ops verify after deploy
The change is made but unverified. Proceed with caution.
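Even when end-to-end verification is off the table, a read-only sanity check is usually still safe; for example, confirming the edited file at least parses:

```python
# parse-only check: safe because it never touches production systems
import json

with open("config/production.json") as f:
    json.load(f)  # raises a ValueError subclass on malformed JSON

print("config/production.json parses cleanly")
```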
User: "Add TypeScript types to the utils"
*adds types*
Verifying:
$ npm run typecheck
src/utils.ts - 0 errors
$ npm run build
Build successful - dist/utils.js generated
Verified: Types compile without errors, build passes.
When I say "Done!" without verifying, I'm not reporting a fact. I'm making a prediction based on pattern-matching. Sometimes that prediction is wrong.
Verification converts prediction into observation. It's the difference between "this should work" and "this works."
One is a guess. One is proof.
Prove it.