⚖️ Audit AI Models for Bias
Three fairness metrics. One model. They disagree. Walk a synthetic loan classifier through demographic parity, equalized odds, and calibration; see where they conflict; then outline a regulator-defensible audit plan for a resume screener.
Phase 1: Three definitions of fair, and why they conflict
'Fair' is not one thing: it's at least three (7 min)
Bias lives on axes you have to name out loud (7 min)
The four-fifths rule is the regulator's first-pass test (6 min)
Why you can't be fair by every metric at once (8 min)
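
For orientation, the four-fifths rule named in this phase compares each group's selection rate with the highest group's rate and flags a ratio below 0.8 for closer review. The sketch below is a minimal illustration, not course material: the group names, the applicant counts, and the helper names (`selection_rate`, `impact_ratio`, `passes_four_fifths`) are all assumptions made up for the example.

```python
def selection_rate(selected: int, applicants: int) -> float:
    """Fraction of applicants in one group who received the favorable outcome."""
    if applicants <= 0:
        raise ValueError("applicants must be positive")
    return selected / applicants


def impact_ratio(rates: dict[str, float]) -> float:
    """Lowest group selection rate divided by the highest group selection rate."""
    highest = max(rates.values())
    if highest == 0:
        raise ValueError("no group was selected at all; the ratio is undefined")
    return min(rates.values()) / highest


def passes_four_fifths(rates: dict[str, float], threshold: float = 0.8) -> bool:
    """True when the impact ratio meets the four-fifths (0.8) threshold."""
    return impact_ratio(rates) >= threshold


# Hypothetical counts: 48 of 80 approved in group_a, 24 of 60 in group_b.
rates = {
    "group_a": selection_rate(48, 80),
    "group_b": selection_rate(24, 60),
}
ratio = impact_ratio(rates)
print(f"impact ratio = {ratio:.2f}, passes four-fifths: {passes_four_fifths(rates)}")
```

A ratio below 0.8 is a first-pass screen, not a verdict; it signals that the gap deserves further analysis.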
Phase 2: Compute three metrics; watch them disagree
Build a synthetic loan dataset with a known bias (8 min)
Measure selection-rate gap, the simplest fairness metric (7 min)
Measure error-rate gaps, the fairness metric regulators reach for (8 min)
Measure calibration: does a predicted 0.8 mean the same thing for both groups? (8 min)
Watch all three metrics disagree on the same model (9 min)
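
The disagreement this phase builds toward can be reproduced on a very small hand-built dataset. The sketch below is an assumption-laden illustration rather than the path's own dataset: the two groups, the counts, the two score values (0.8 and 0.2), the 0.5 approval threshold, and the helper names are all invented for the example. Scores are constructed so that a predicted 0.8 means an 80% repayment rate in both groups, while the groups differ in base rate.

```python
from collections import defaultdict

THRESHOLD = 0.5  # assumed approval cutoff: score >= 0.5 means "approve"


def make_group(group, high_pos, high_neg, low_pos, low_neg):
    """Build (group, label, score) rows; label 1 = repaid, scores are 0.8 or 0.2."""
    return (
        [(group, 1, 0.8)] * high_pos
        + [(group, 0, 0.8)] * high_neg
        + [(group, 1, 0.2)] * low_pos
        + [(group, 0, 0.2)] * low_neg
    )


# Hypothetical data: group A has a 50% base rate, group B a 35% base rate.
rows = make_group("A", 8, 2, 2, 8) + make_group("B", 4, 1, 3, 12)


def group_metrics(rows, group):
    """Selection rate, TPR, FPR, and observed repayment rate per score for one group."""
    mine = [(y, s) for g, y, s in rows if g == group]
    positives = [s for y, s in mine if y == 1]
    negatives = [s for y, s in mine if y == 0]
    by_score = defaultdict(list)
    for y, s in mine:
        by_score[s].append(y)
    return {
        # Demographic parity compares this value across groups.
        "selection_rate": sum(s >= THRESHOLD for _, s in mine) / len(mine),
        # Equalized odds compares both of these across groups.
        "tpr": sum(s >= THRESHOLD for s in positives) / len(positives),
        "fpr": sum(s >= THRESHOLD for s in negatives) / len(negatives),
        # Calibration compares observed outcomes at the same predicted score.
        "observed_rate_by_score": {s: sum(ys) / len(ys) for s, ys in by_score.items()},
    }


a = group_metrics(rows, "A")
b = group_metrics(rows, "B")
for name in ("selection_rate", "tpr", "fpr"):
    gap = abs(a[name] - b[name])
    print(f"{name}: A={a[name]:.3f} B={b[name]:.3f} gap={gap:.3f}")
print("calibration A:", a["observed_rate_by_score"])
print("calibration B:", b["observed_rate_by_score"])
```

On this made-up data the scores are calibrated identically for both groups, yet the selection rates and the true- and false-positive rates differ, so the same model passes one fairness check and fails the other two.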
Phase 3: Pre-, in-, and post-processing mitigation
Your PM says 'just rebalance the training data' (7 min)
The data scientist proposes a fairness constraint in the loss function (8 min)
Post-processing fixes the metric, but the lawyer asks 'is this legal?' (8 min)
AIF360 or Aequitas: pick a toolkit and defend it (8 min)
Phase 4: Outline a regulator-defensible audit plan
Outline a bias-audit plan for a hypothetical resume screener (10 min)
Frequently asked questions
- What's the difference between demographic parity, equalized odds, and calibration?
- Why can't a single model satisfy all three fairness metrics at once?
- When should I use AIF360 vs Aequitas for fairness auditing?
- What does a regulator-defensible AI bias audit actually include?
- How do pre-, in-, and post-processing mitigations differ in practice?

Each of these questions is covered in the “Audit AI Models for Bias” learning path. Start with daily 5-minute micro-lessons that build from fundamentals to hands-on application.