Understand Temperature, Top-P, and Sampling
See exactly what temperature and top-p do to a model's probability distribution, then justify the sampling settings for your real tasks instead of guessing. Stop tweaking knobs and start engineering output behavior.
Phase 1: The Distribution Behind Every Token
See next-token prediction as a real probability distribution
The model never picks one word; it ranks all of them (6 min)
Logits are the model's raw vote; softmax is the ballot (7 min)
Temperature 0 isn't deterministic; it's just greedy (6 min)
Every sampling parameter answers one question: which tail to trust (7 min)
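As a rough illustration of the Phase 1 ideas, the sketch below turns raw logits into a probability distribution with softmax and then takes the greedy pick. The token names and logit values are made up for the example, and `softmax` here is a hand-rolled helper, not any particular library's API.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities. Temperature must be > 0."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for four candidate next tokens (illustrative numbers).
logits = {"cat": 2.0, "dog": 1.0, "fish": 0.5, "xylophone": -1.0}

# Every token gets a probability; none is "picked" yet.
probs = dict(zip(logits, softmax(list(logits.values()))))

# Greedy decoding: always take the single most likely token.
# This is what "temperature 0" amounts to, since dividing by 0 is undefined.
greedy = max(probs, key=probs.get)

for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{token:10s} {p:.3f}")
print("greedy pick:", greedy)
```

Greedy selection is deterministic in this toy setting. Run-to-run differences in a real deployment come from outside this arithmetic, which is the point of the "isn't deterministic" lesson.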
Phase 2: Watching Softmax Bend Under Heat
Walk softmax through temperatures and watch curves flatten
Cold temperatures crush the tail and worship the peak (6 min)
Temperature 0.7 keeps the shape but lets the tail breathe (6 min)
High temperatures flatten the distribution into noise (7 min)
Top-k draws a fixed line, and that's both its strength and its flaw (6 min)
Top-p adapts to confidence: keeps a few tokens or many (7 min)
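The Phase 2 lessons can be sketched in a few lines: the same logits at three temperatures, then top-k and top-p applied to a confident and an unsure distribution. All logits below are invented for illustration, and `top_k_filter` and `top_p_filter` are hypothetical helper names, written as one plausible reading of each technique and not as any library's implementation.

```python
import math

def softmax(logits, temperature=1.0):
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_filter(probs, k):
    """Keep the k most likely tokens, then renormalize."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, mass = [], 0.0
    for i in order:
        keep.append(i)
        mass += probs[i]
        if mass >= p:
            break
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

# Same made-up logits at three temperatures: the peak shrinks as heat rises.
logits = [4.0, 2.0, 1.0, 0.5, 0.0, -1.0]
cold, warm, hot = (softmax(logits, t) for t in (0.2, 0.7, 2.0))
print("top-token probability:", round(cold[0], 3), round(warm[0], 3), round(hot[0], 3))

# Two made-up situations: the model is confident, or it is unsure.
confident = softmax([8.0, 1.0, 0.5, 0.0, -1.0, -2.0])
unsure = softmax([1.0, 0.9, 0.8, 0.7, 0.6, 0.5])

# Top-k keeps 3 tokens in both cases; top-p keeps 1 when confident, 6 when unsure.
print("top-k sizes:", len(top_k_filter(confident, 3)), len(top_k_filter(unsure, 3)))
print("top-p sizes:", len(top_p_filter(confident, 0.9)), len(top_p_filter(unsure, 0.9)))
```

The fixed cutoff in top-k ignores how confident the model is, while the top-p cutoff moves with the shape of the distribution.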
Phase 3: Picking the Right Sampler for the Job
Choose between top-k, top-p, and min-p deliberately
You're extracting an email, but the model returns three different ones across runs (7 min)
You're brainstorming product names and getting the same five every time (7 min)
Your generated code is syntactically valid and subtly wrong (7 min)
Min-p cuts based on the peak and fixes top-p's edge cases (7 min)
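A minimal sketch of the min-p idea, assuming the common definition: keep every token whose probability is at least `min_p` times the top token's probability. The distribution below is invented to show one edge case where top-p lets a long tail through, and both helper names are hypothetical.

```python
def min_p_filter(probs, min_p):
    """Keep tokens with probability >= min_p * (peak probability), then renormalize."""
    threshold = min_p * max(probs)
    keep = [i for i, q in enumerate(probs) if q >= threshold]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, mass = [], 0.0
    for i in order:
        keep.append(i)
        mass += probs[i]
        if mass >= p:
            break
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

# Made-up distribution: one strong token, one plausible runner-up,
# and a long tail of 50 unlikely tokens that together hold 20% of the mass.
probs = [0.50, 0.30] + [0.004] * 50

nucleus = top_p_filter(probs, 0.95)   # must dip into the tail to reach 95%
peak_cut = min_p_filter(probs, 0.10)  # threshold is 0.10 * 0.50 = 0.05

print("top-p keeps:", len(nucleus), "tokens")
print("min-p keeps:", len(peak_cut), "tokens")
```

Here top-p keeps 40 tokens because the two strong candidates only cover 80% of the mass, while min-p keeps just those two because every tail token falls below the peak-relative threshold.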
Phase 4: Defending Your Sampling Choices
Lock in sampling choices for three real tasks
Pick three real tasks and lock in defensible sampling configs (20 min)
Frequently asked questions
- What is the difference between temperature and top-p in LLMs?
- Should I use temperature or top-p for creative writing?
- Why does temperature 0 not always give deterministic output?
- What does top-p (nucleus sampling) actually do?
- When should I use min-p instead of top-p?

Each of these questions is covered in the "Understand Temperature, Top-P, and Sampling" learning path. Start with daily 5-minute micro-lessons that build from fundamentals to hands-on application.