
🌀 Understand Hallucinations in LLMs

Stop treating LLM hallucinations as one bug — see the three distinct failure modes, force each one on purpose, then add one guardrail to a prompt you actually use and measure whether it worked.

Foundations · 14 drops · ~2-week path · 5–8 min/day · Technology

Phase 1: Why the Model Always Says Something

See why next-token prediction always produces something

4 drops
  1. The model isn't lying — it doesn't know what truth is

    6 min

    An LLM picks the next most likely token given everything before it. Truth isn't a variable in that equation, so the model produces something every time — even when the right answer is silence. A toy sampling sketch follows this phase's drop list.

  2. Hallucinations come in three flavors — name them

    7 min

    Most 'the model made it up' bugs reduce to one of three modes — knowledge gap, context drift, or overconfident interpolation — and each has a different fix.

  3. Confidence isn't calibration — fluency lies

    6 min

    A model that sounds certain isn't more likely to be right. Output fluency and factual accuracy are separately controlled, and the model can't tell you the difference.

  4. Saying 'I don't know' is harder than it looks

    7 min

    Models rarely abstain because abstention isn't well-represented in training data — most human text answers questions, so the model learns to answer too.
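
The first drop's claim, that generation never checks truth, is easier to see in a toy sketch. The code below is a pure-Python stand-in for next-token sampling; the probability table is invented for illustration and plays the role a trained model's output distribution would play.

```python
import random

# Invented toy distribution standing in for a trained model's next-token
# probabilities. Every step asks "what token is likely next?"; no step
# ever asks "is this true?".
NEXT_TOKEN_PROBS = {
    ("The", "capital"): {"of": 1.0},
    ("capital", "of"): {"Australia": 1.0},
    ("of", "Australia"): {"is": 1.0},
    ("Australia", "is"): {"Canberra.": 0.6, "Sydney.": 0.4},  # both fluent, one wrong
}

def generate(prompt_tokens, max_new_tokens=5):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        dist = NEXT_TOKEN_PROBS.get(tuple(tokens[-2:]))  # tiny 2-token context
        if dist is None:
            break
        choices, weights = zip(*dist.items())
        tokens.append(random.choices(choices, weights=weights)[0])
    return " ".join(tokens)

print(generate(["The", "capital"]))  # sometimes right, sometimes fluently wrong
```

A real model differs in scale, not in kind: the distribution is far richer, but the loop still contains no truth check.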

Phase 2: Force the Three Failure Modes

Force three hallucination modes on purpose to spot them

5 drops
  1. Make the model invent a paper that doesn't exist

    7 min

    Asking for citations without grounding is the fastest way to get plausible-looking fabrications — the model has learned what citations look like, not which ones are real.

  2. Watch the model do arithmetic and lose the thread

    7 min

    LLMs don't compute — they predict. On multi-step math, errors compound silently because nothing checks the answer against actual arithmetic.

  3. Make the assistant 'forget' it's an assistant

    7 min

    Persona and grounding are both maintained by the system prompt — and both can drift the further you get from it. Long sessions and adversarial prompts erode whatever the model 'is.'

  4. Detect what the model doesn't know — without asking it

    7 min

    You can spot knowledge-gap hallucinations by varying the question and watching the answers diverge. If the same fact comes back differently each time, the model is sampling, not retrieving. A small resampling sketch follows this phase's drop list.

  5. Build the dataset before you build the fix

    7 min

    You can't measure a hallucination rate you don't log. The cheap first move on any anti-hallucination project is capturing a sample of real failures. A minimal logging sketch follows this list.
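
Drop 4's trick of re-asking the question and watching for divergence is mechanical enough to sketch. The ask_model() helper below is a hypothetical placeholder for however you actually call your model; the rest is standard library.

```python
from collections import Counter

def ask_model(question: str) -> str:
    """Hypothetical placeholder for your real model call (API client,
    local model, etc.). Sampling should be enabled (temperature > 0)."""
    raise NotImplementedError

def answer_divergence(question: str, n_samples: int = 5) -> float:
    """Ask the same question several times and score disagreement:
    0.0 means every run agreed, values near 1.0 mean the answer changed
    almost every time, which points at sampling from a knowledge gap."""
    answers = [ask_model(question).strip().lower() for _ in range(n_samples)]
    top_count = Counter(answers).most_common(1)[0][1]
    return 1.0 - top_count / n_samples

# Usage sketch: flag questions whose answers keep changing.
# if answer_divergence("When was the Model X-200 discontinued?") > 0.4:
#     print("Likely knowledge gap: treat this answer as suspect")
```

As the last drop of Phase 3 points out, the converse doesn't hold: agreement across runs catches sampling-based gaps but says nothing about confident-prior errors.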
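
Drop 5's "build the dataset first" step can start as a single append-only log. The sketch below assumes a JSONL file and the three failure-mode names from Phase 1; the path and field names are illustrative, not prescribed.

```python
import datetime
import json
import pathlib

LOG_PATH = pathlib.Path("hallucination_failures.jsonl")  # illustrative path

def log_failure(prompt: str, response: str, failure_mode: str, note: str = "") -> None:
    """Append one observed failure. failure_mode is one of the three modes
    from Phase 1: knowledge_gap, context_drift, overconfident_interpolation."""
    record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "prompt": prompt,
        "response": response,
        "failure_mode": failure_mode,
        "note": note,
    }
    with LOG_PATH.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Usage sketch:
# log_failure("What's our refund window?", "30 days for all items",
#             failure_mode="knowledge_gap", note="actual policy is 14 days")
```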

Phase 3: Pick the Right Grounding Tool

Pick the right grounding, citation, or abstention fix

4 drops
  1. Your medical bot keeps citing imaginary studies

    8 min

    Fine-tuning teaches an LLM how to answer — RAG gives it the facts to answer with. Citation hallucinations are a facts problem, so retrieval, not training, is the right tool.

  2. Your support chatbot quotes a refund policy that doesn't exist

    8 min

    Retrieval gets the right facts into context, but only enforced citations against retrieved IDs prevent the model from inventing source numbers anyway. A citation-check sketch follows this phase's drop list.

  3. Your finance bot keeps answering questions outside its scope

    8 min

    Telling the model 'don't answer X' is a soft instruction. Forcing structured output with an explicit out-of-scope field makes abstention the structurally easiest path. A structured-output sketch follows this phase's drop list.

  4. The agent agreed with itself and was still wrong

    8 min

    Self-consistency catches one class of hallucination — sampling-based — but misses confident-prior hallucinations, where every run fluently agrees on the same wrong answer.
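
Drop 2's "enforced citations against retrieved IDs" can be a few lines of validation sitting between the model and the user. The [doc:<id>] citation format below is an assumption; match whatever format your prompt instructs the model to use.

```python
import re

# Assumed citation format: the prompt tells the model to cite as [doc:<id>].
CITATION_PATTERN = re.compile(r"\[doc:([\w.-]+)\]")

def check_citations(answer: str, retrieved_ids: set[str]) -> tuple[bool, set[str]]:
    """Pass only if the answer cites at least one source and every cited ID
    exists among the chunks that were actually retrieved."""
    cited = set(CITATION_PATTERN.findall(answer))
    invented = cited - retrieved_ids
    return bool(cited) and not invented, invented

retrieved = {"refund-policy-v3", "shipping-faq-v1"}
answer = "Refunds are issued within 14 days of delivery [doc:refund-policy-v3]."
ok, invented = check_citations(answer, retrieved)
if not ok:
    # Reject or regenerate rather than shipping an invented source to the user.
    print("Citation check failed. Invented IDs:", invented or "none cited")
```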
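
Drop 3's structured-output idea is to make "out of scope" a field the model fills in rather than a rule it has to remember. The schema below is an assumption about what the system prompt enforces; the handler simply refuses to surface anything that doesn't parse.

```python
import json

# Assumed schema the system prompt enforces on every reply:
# {"out_of_scope": bool, "answer": str | null, "reason": str | null}

def handle_reply(raw_model_output: str) -> str:
    try:
        reply = json.loads(raw_model_output)
    except json.JSONDecodeError:
        # Malformed output never reaches the user as if it were an answer.
        return "Sorry, something went wrong. Please try again."
    if reply.get("out_of_scope"):
        return "That question is outside what this assistant can help with."
    return reply.get("answer") or "Sorry, no answer was produced."

print(handle_reply('{"out_of_scope": true, "answer": null, "reason": "tax advice"}'))
```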

Phase 4: Ship a Guardrail and Measure It

Ship one guardrail and measure what changed

1 drop
  1. Add one guardrail to a real prompt and measure the lift

    25 min

    Real anti-hallucination work is a loop: identify the failure mode, baseline the rate, ship one matched guardrail, re-measure. Skip any step and you're guessing. A minimal measurement sketch follows below.
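
A minimal sketch of the baseline/re-measure step, assuming you have hand-labeled the same prompt set before and after the guardrail. The label lists are invented purely for illustration.

```python
def hallucination_rate(labels: list[bool]) -> float:
    """labels[i] is True when a reviewer marked response i as hallucinated."""
    return sum(labels) / len(labels) if labels else 0.0

# Invented labels for illustration: the same prompts, reviewed by hand,
# before and after the guardrail shipped.
baseline_labels = [True, False, True, True, False, False, True, False]
guardrail_labels = [False, False, True, False, False, False, False, False]

baseline = hallucination_rate(baseline_labels)
after = hallucination_rate(guardrail_labels)
print(f"baseline: {baseline:.0%}  after: {after:.0%}  change: {after - baseline:+.0%}")
```

With a sample this small the difference could easily be noise; the point is the shape of the loop, not the statistics.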

Frequently asked questions

What are LLM hallucinations and why do they happen?
An LLM picks the next most likely token given everything that came before it; truth isn't a variable in that equation, so the model produces a fluent answer every time, even when the honest response would be silence. When that prediction is wrong, you get a hallucination. Most cases reduce to one of three modes: knowledge gap, context drift, or overconfident interpolation.
Why do language models confidently make up citations and facts?
Fluency and factual accuracy are controlled separately, so a model that sounds certain isn't more likely to be right, and it can't tell you the difference. With citations specifically, the model has learned what citations look like, not which ones are real, and abstention is rare in training data, so it answers rather than declining.
Does RAG actually fix hallucinations or just hide them?
RAG addresses the facts problem: retrieval puts the right material into context, which is the correct tool when the failure is a knowledge gap, such as imaginary studies or invented policies. It doesn't stop the model from inventing source IDs on its own, so you still need enforced citations against retrieved IDs, and you still need to measure whether the hallucination rate actually dropped.
What's the difference between a knowledge gap and a context drift hallucination?
A knowledge-gap hallucination happens when the model simply doesn't have the fact: it samples a plausible answer, so asking the same question several times returns diverging answers. Context drift happens when grounding that was in the prompt (policy text, persona, instructions) erodes over a long or adversarial session and the model fills in something else. The two call for different fixes: retrieval or abstention for gaps, tighter grounding and enforced structure for drift.
How do I measure whether my anti-hallucination prompt actually works?
Treat it as a loop: capture a sample of real failures, label them to establish a baseline hallucination rate, ship one guardrail matched to the failure mode, then re-run the same prompt set and compare rates. Skip the baseline or the re-measure and you're guessing.