📊 Understand Confusion Matrices, Precision, and Recall
Stop reaching for accuracy by reflex — read a confusion matrix in seconds, compute precision and recall by hand, and pick the right metric for spam, fraud, and cancer-screening problems without second-guessing.
Phase 1: Reading the Confusion Matrix
See why one accuracy number hides the real story
A 99% accurate cancer detector that catches zero cancers (6 min)
Every classifier prediction fits in one of four boxes, and two of them are mistakes (6 min)
False positives and false negatives don't cost the same thing (6 min)
Precision asks 'when I said yes, was I right?' Recall asks 'did I catch them all?' (7 min)
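As a rough illustration of the Phase 1 ideas, the sketch below counts the four confusion-matrix boxes by hand and then asks the two questions from the last lesson. The dataset is hypothetical (1,000 screened patients, 10 of whom have cancer), and the helper names `confusion_counts` and `summarize` are made up for this example. The "detector" simply predicts negative for everyone.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # said yes, was yes
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # said yes, was no
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # said no, was yes
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # said no, was no
    return tp, fp, fn, tn


def summarize(y_true, y_pred):
    """Compute accuracy, precision, and recall from the four counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Precision is undefined when the model never says "yes".
        "precision": tp / (tp + fp) if (tp + fp) else None,
        # Recall is undefined when there are no actual positives.
        "recall": tp / (tp + fn) if (tp + fn) else None,
    }


# Hypothetical screening data: 10 cancers among 1,000 patients.
y_true = [1] * 10 + [0] * 990
# A "detector" that always predicts negative.
always_negative = [0] * 1000

report = summarize(y_true, always_negative)
print(report)
```

On this made-up data the accuracy comes out at 0.99 while recall is 0.0 and precision is undefined (`None`), which is the gap a single accuracy number hides.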
Phase 2: Computing the Metrics by Hand
Compute precision, recall, and F1 on tiny datasets
Twenty emails, four minutes, every metric on paper (7 min)
F1 is the harmonic mean: it punishes the weaker number (7 min)
Move the threshold and watch precision and recall trade places (7 min)
You can rarely have perfect precision and perfect recall at the same time (7 min)
When 1% of your data is the class you care about, accuracy is poison (7 min)
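The two central Phase 2 ideas, F1 as a harmonic mean and the precision/recall trade-off under a moving threshold, can be sketched as follows. The eight scored emails are invented for illustration, and `f1_score` and `precision_recall_at` are hypothetical helper names, not part of any library.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_at(scores, labels, threshold):
    """Predict positive when score >= threshold, then compute (precision, recall)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    # Assumption for this sketch: report 0.0 when a ratio is undefined.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical spam scores (higher = more spam-like) and true labels (1 = spam).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

for threshold in (0.85, 0.5, 0.2):
    p, r = precision_recall_at(scores, labels, threshold)
    print(f"threshold={threshold}: precision={p:.2f} recall={r:.2f} f1={f1_score(p, r):.2f}")

# The harmonic mean sits close to the weaker number; the arithmetic mean does not.
lopsided_f1 = f1_score(0.9, 0.1)
lopsided_mean = (0.9 + 0.1) / 2
print(f"F1={lopsided_f1:.2f} vs arithmetic mean={lopsided_mean:.2f}")
```

On this toy data, lowering the threshold from 0.85 to 0.5 to 0.2 moves precision from 1.00 to 0.60 to 0.57 while recall climbs from 0.50 to 0.75 to 1.00. For a precision of 0.9 and a recall of 0.1, F1 is 0.18 where the arithmetic mean would be 0.50.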
Phase 3: Choosing Metrics in Real Domains
Choose precision or recall by the cost of mistakes
The product manager wants 'fewer spam emails'. What do you actually optimize? (7 min)
The screen you can't afford to miss tells a different story (7 min)
When false alarms have a real customer-facing cost, the math gets cost-weighted (7 min)
Pick the curve that doesn't lie about the rare class (7 min)
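One way to read "the math gets cost-weighted" is to price each false positive and each false negative, then pick the threshold with the lowest total cost. The sketch below assumes that reading. The scores, labels, and cost figures are all invented, and `total_cost` and `cheapest_threshold` are hypothetical helpers.

```python
def total_cost(scores, labels, threshold, cost_fp, cost_fn):
    """Total cost of mistakes when predicting positive for score >= threshold."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return fp * cost_fp + fn * cost_fn


def cheapest_threshold(scores, labels, thresholds, cost_fp, cost_fn):
    """Return the candidate threshold with the lowest total cost."""
    return min(
        thresholds,
        key=lambda t: total_cost(scores, labels, t, cost_fp, cost_fn),
    )


# Hypothetical model scores and true labels (1 = the class we care about).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
candidates = [0.85, 0.5, 0.2]

# Screening-style costs (assumed): a miss is 10x worse than a false alarm.
recall_leaning = cheapest_threshold(scores, labels, candidates, cost_fp=1, cost_fn=10)

# Spam-style costs (assumed): blocking a real email is 10x worse than a miss.
precision_leaning = cheapest_threshold(scores, labels, candidates, cost_fp=10, cost_fn=1)

print(recall_leaning, precision_leaning)
```

With these assumed costs, the miss-averse setting picks the low threshold (0.2, favoring recall) and the false-alarm-averse setting picks the high one (0.85, favoring precision). The same model ends up at different operating points depending only on what each kind of mistake costs.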
Phase 4: Picking the Right Metric for Three Real Problems
Pick the right metric for three real problems
Defend a metric for spam, fraud, and screening, in writing (8 min)
Frequently asked questions
- What is the difference between precision and recall?
- When should I optimize for precision vs recall?
- Why is accuracy a bad metric for imbalanced data?
- What does the F1 score actually measure?
- When should I use a PR curve instead of an ROC curve?
Each of these questions is covered in the “Understand Confusion Matrices, Precision, and Recall” learning path. Start with daily 5-minute micro-lessons that build from fundamentals to hands-on application.