🎲 Understand Temperature, Top-P, and Sampling

See exactly what temperature and top-p do to a model's probability distribution, then justify the sampling settings for your real tasks instead of guessing. Stop tweaking knobs and start engineering output behavior.

Applied · 14 drops · ~2-week path · 5–8 min/day · technology

Phase 1: The Distribution Behind Every Token

See next-token prediction as a real probability distribution

4 drops

The model never picks one word — it ranks all of them
6 min
Logits are the model's raw vote — softmax is the ballot
7 min
Temperature 0 isn't deterministic — it's just greedy
6 min
Every sampling parameter answers one question: which tail to trust
7 min
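
As a rough illustration of the Phase 1 ideas, the sketch below turns a handful of logits into a probability distribution with softmax and shows greedy decoding as a plain argmax over that distribution. The token names and logit values are made up for the example and do not come from any real model.

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical vocabulary and logits, for illustration only.
tokens = ["cat", "dog", "car", "sky"]
logits = [2.0, 1.0, 0.5, -1.0]

probs = softmax(logits)  # every token gets a probability, not just the winner

# Greedy decoding simply takes the highest-probability token.
greedy_token = tokens[probs.index(max(probs))]

for token, prob in zip(tokens, probs):
    print(f"{token}: {prob:.3f}")
print("greedy pick:", greedy_token)
```

Greedy picking is deterministic only for a fixed set of logits; if the logits themselves vary slightly between runs, the argmax can change too.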

Phase 2: Watching Softmax Bend Under Heat

Walk softmax through temperatures and watch curves flatten

5 drops

Cold temperatures crush the tail and worship the peak
6 min
Temperature 0.7 keeps the shape but lets the tail breathe
6 min
High temperatures flatten the distribution into noise
7 min
Top-k draws a fixed line — and that's both its strength and its flaw
6 min
Top-p adapts to confidence — keeps a few tokens or many
7 min
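
The Phase 2 topics can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code of any particular inference library: the function names and the logit values are hypothetical, temperature is applied by dividing logits before softmax, top-k keeps a fixed number of tokens, and top-p keeps the smallest set of top tokens whose cumulative probability reaches p.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Softmax over logits divided by a positive temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0; treat 0 as greedy argmax")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_filter(probs, k):
    """Keep the k most likely token indices and renormalize."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

logits = [4.0, 2.0, 1.0, 0.0, -1.0]  # hypothetical values

cold = softmax_with_temperature(logits, 0.2)  # tail crushed, peak dominates
warm = softmax_with_temperature(logits, 0.7)  # same ranking, softer peak
hot = softmax_with_temperature(logits, 2.0)   # much flatter distribution

fixed_cut = top_k_filter(hot, 2)    # always two tokens, whatever the shape
confident = top_p_filter(cold, 0.9) # peaked distribution: few tokens survive
uncertain = top_p_filter(hot, 0.9)  # flat distribution: more tokens survive
```

With these example values, top-p keeps a single token for the cold distribution and several for the hot one, while top-k keeps exactly two in both cases.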

Phase 3: Picking the Right Sampler for the Job

Choose between top-k, top-p, and min-p deliberately

4 drops

You're extracting an email — but the model returns three different ones across runs
7 min
You're brainstorming product names and getting the same five every time
7 min
Your code generation is technically syntactic — and subtly wrong
7 min
Min-p cuts based on the peak — fixes top-p's edge cases
7 min
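
One way to picture min-p, assuming the common formulation in which the cutoff is a fraction of the top token's probability: a token survives only if its probability is at least min_p times the peak. The function name and both example distributions below are hypothetical.

```python
def min_p_filter(probs, min_p):
    """Keep tokens whose probability is at least min_p * (top probability)."""
    threshold = min_p * max(probs)  # the cutoff scales with the peak
    keep = [i for i, prob in enumerate(probs) if prob >= threshold]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

# Hypothetical distributions, for illustration only.
peaked = [0.90, 0.05, 0.03, 0.01, 0.01]  # model is confident
flat = [0.30, 0.25, 0.20, 0.15, 0.10]    # model is unsure

kept_when_confident = min_p_filter(peaked, 0.1)  # cutoff 0.09
kept_when_unsure = min_p_filter(flat, 0.1)       # cutoff 0.03
```

Because the cutoff follows the peak, the filter is strict when the model is confident and permissive when the distribution is flat.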

Phase 4: Defending Your Sampling Choices

Lock in sampling choices for three real tasks

1 drop

Pick three real tasks and lock in defensible sampling configs
20 min

Frequently asked questions

The "Understand Temperature, Top-P, and Sampling" learning path on Droplet covers each of the questions below through daily 5-minute micro-lessons that build from fundamentals to hands-on application.

What is the difference between temperature and top-p in LLMs?
Should I use temperature or top-p for creative writing?
Why does temperature 0 not always give deterministic output?
What does top-p (nucleus sampling) actually do?
When should I use min-p instead of top-p?

Related paths

🐍 Python Decorators Introduction

🦀 Rust Lifetimes Explained

☸️ Kubernetes Core Concepts

📈 Big O Intuition