🧭 Understand ANN Algorithms: HNSW, IVF, PQ
Stop tuning ef and M by trial and error — see HNSW, IVF, and PQ as physical structures (a multilayer skip graph, a coarse cluster index, and a vector compressor) so you can predict which one fits a 100M-vector workload before you benchmark anything.
Phase 1: Why Exact Nearest Neighbor Breaks
See why exact NN dies past 100K vectors
Exact nearest neighbor doesn't scale, and that's the whole story (5 min)
Recall is a knob, not a guarantee (6 min)
Graph, cluster, compress: three ways to dodge O(n) (5 min)
The three-axis budget every ANN tuning fight is really about (6 min)
Phase 2: Sketching HNSW, IVF, and PQ by Hand
Sketch HNSW, IVF, and PQ by hand
HNSW is a skip list pretending to be a graph (7 min)
Trace a query from top to bottom, by hand (7 min)
IVF is just k-means with an inverted file (6 min)
PQ shrinks vectors 8x with almost no recall loss (7 min)
Real systems compose; pure HNSW or pure IVF is a starting point (7 min)
Phase 3: Choosing Indexes for Real Workloads
Compare recall, RAM, and update cost in real systems
Your HNSW recall dropped after a re-shard, and nobody knows why (7 min)
Your CFO wants a 60% cloud spend cut and your 50M-vector HNSW lives in RAM (8 min)
Your e-commerce vectors update every minute and your IVF index is drifting (7 min)
Your queries are 'find similar items under $50 in stock' and recall just collapsed (7 min)
Phase 4: Designing for 100M Vectors with Updates
Pick the right index for a 100M-vector workload
Pick and defend an ANN design for a 100M-vector workload with updates (8 min)
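As a rough illustration of the kind of prediction this path aims for, the sketch below estimates RAM for 100M vectors under three layouts. Everything in it is an assumption made for illustration, not a figure from the lessons: 128-dimensional float32 vectors, an HNSW graph with M = 16 that stores about 2·M four-byte neighbor ids per vector on its bottom layer, and PQ codes of 64 bytes per vector. The helper name `index_ram_gib` is hypothetical, and the estimate ignores upper graph layers, codebooks, and allocator overhead.

```python
def index_ram_gib(n_vectors, dim, m_links=0, pq_bytes=0):
    """Back-of-envelope RAM estimate in GiB (illustrative assumptions only)."""
    # Vectors are stored as raw float32 (4 bytes per dimension)
    # unless PQ codes of pq_bytes per vector replace them.
    vector_bytes = pq_bytes if pq_bytes else dim * 4
    # Assumption: an HNSW-style graph keeps about 2*M neighbor ids
    # (4 bytes each) per vector on its bottom layer.
    link_bytes = 2 * m_links * 4
    return n_vectors * (vector_bytes + link_bytes) / 2**30


n, d = 100_000_000, 128

flat = index_ram_gib(n, d)               # raw vectors, no graph
hnsw = index_ram_gib(n, d, m_links=16)   # raw vectors plus graph links
pq = index_ram_gib(n, d, pq_bytes=64)    # compressed codes only

print(f"flat float32 : {flat:6.1f} GiB")
print(f"HNSW (M=16)  : {hnsw:6.1f} GiB")
print(f"PQ (64 B)    : {pq:6.1f} GiB")
print(f"flat / PQ    : {flat / pq:.0f}x")
```

Under these assumed numbers, a 64-byte code is one eighth the size of a 512-byte raw vector, which is one way to arrive at an 8x reduction. Different dimensions or code sizes give different ratios.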
Frequently asked questions
- What does 'approximate' actually trade away in ANN search?
- When should I pick HNSW over IVF for a vector database?
- How do ef and M parameters change HNSW recall and latency?
- Why does product quantization barely lose recall while cutting RAM 8x?
- Which ANN index fits a 100M-vector workload with frequent updates?
Each of these is covered in the “Understand ANN Algorithms: HNSW, IVF, PQ” learning path. Start with daily 5-minute micro-lessons that build from fundamentals to hands-on application.
Related paths
🐍 Python Decorators Introduction
Build one mental model for Python decorators that covers closures, argument passing, functools.wraps, and stacking — then ship a working caching or logging decorator from scratch in under 30 lines.
🦀 Rust Lifetimes Explained
Stop reading `'a` as line noise and start reading it as scope arithmetic — one failing snippet at a time — until you can thread lifetimes through a small parser or iterator adapter without fighting the borrow checker.
☸️ Kubernetes Core Concepts
Stop drowning in 30+ resource types. Build the mental model one primitive at a time -- pods, deployments, services, ingress, config -- then deploy a real app with rolling updates and health checks.
📈 Big O Intuition
Stop treating Big O as math you memorized for an interview — build the intuition to spot O(n²) disasters, pick the right data structure without thinking, and rewrite a slow function from O(n²) to O(n) in under five minutes.