
🐛 Debug Code with LLMs

Stop chasing the first plausible theory the AI offers. By the end you'll run a real debugging loop — hypothesis, counter-evidence, smallest test — with the LLM as your partner instead of your oracle.

Foundations · 14 drops · ~2-week path · 5–8 min/day · technology

Phase 1: Why LLMs Are Confidently Wrong About Bugs

See why AI is confidently wrong about your bugs

4 drops
  1. The LLM's first theory is the trap

    6 min

    An LLM trained on Stack Overflow answers will pattern-match your stack trace to the most common cause — which is almost never your cause.

  2. Symptoms are noise — state is signal

    6 min

    A stack trace tells the model where it failed; the actual variable values tell the model why it failed. Most pastes are 90% noise, 10% signal.

  3. Ask for hypotheses, never for a fix

    6 min

    When you ask 'what's wrong?' the model commits to one answer. When you ask 'give me three competing hypotheses,' it surfaces the space and you stay in control.
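The "three competing hypotheses" ask can be packaged as a reusable prompt. A minimal sketch; the wording and field names are invented for illustration, not taken from any particular tool:

```python
# Hypothetical helper: a prompt that asks for competing hypotheses,
# not a fix. All wording here is illustrative.
def hypothesis_prompt(error, state, ruled_out):
    tried = "\n".join("- " + item for item in ruled_out) or "- nothing yet"
    return (
        "Do not propose a fix yet. Give me exactly three competing "
        "hypotheses for this bug, ordered by likelihood, and for each "
        "one the cheapest observation that would rule it out.\n\n"
        "Error:\n" + error + "\n\n"
        "Relevant state:\n" + state + "\n\n"
        "Already ruled out:\n" + tried
    )

print(hypothesis_prompt(
    "TypeError: 'NoneType' object is not subscriptable",
    "user is None after a cache miss",
    ["stale cache entry", "wrong user id"],
))
```

The "already ruled out" field matters as much as the error itself: it is what pushes the model off its default answer.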

  4. The smallest test beats the smartest theory

    6 min

    Once you have three hypotheses, the bottleneck isn't more thinking — it's running the cheapest experiment that distinguishes them.
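A toy sketch of what "the cheapest distinguishing experiment" means in practice, with a bug planted for the demo:

```python
# Illustrative only: a planted bug with two live hypotheses.
# H1: the loop drops the last element (off-by-one).
# H2: non-numeric inputs are silently skipped upstream.
def total(items):
    out = 0
    for i in range(len(items) - 1):  # planted off-by-one
        out += items[i]
    return out

# Cheapest distinguishing experiment: a clean all-int input takes H2
# out of play entirely, so the result speaks only to H1.
result = total([1, 2, 3])
print("clean input ->", result)  # 3 confirms H1; 6 would rule it out
```

One tiny input, one run, and one hypothesis is dead: that is the whole point of choosing the experiment before asking for more theory.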

Phase 2: Forcing Hypotheses Instead of Fixes

Force three hypotheses, then run the cheapest test

5 drops
  1. Lead with what you already ruled out

    7 min

    Telling the model what you've already tried is more valuable than telling it what's broken — it forces the model out of its default answer space.

  2. Force the model to disagree with itself

    7 min

    Asking for three hypotheses 'ordered by likelihood, with one that contradicts the obvious explanation' breaks the model out of its default mode.

  3. Ask the model to argue against its own answer

    7 min

    After the model picks a hypothesis, asking 'what evidence would make this wrong?' converts confidence into testable claims.

  4. The smallest broken example does the work

    8 min

    Reducing your bug to a 20-line repro forces you to find the cause yourself — and gives the LLM something it can actually reason about.
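A hedged sketch of what a repro at that scale looks like; the classic mutable-default bug below stands in for whatever your real cause turns out to be:

```python
# Shared state leaks between calls -- reduced until the cause is
# visible in one screenful.
def add_tag(tag, tags=[]):  # the bug: one list is shared across calls
    tags.append(tag)
    return tags

first = add_tag("a")
second = add_tag("b")  # expected ["b"], got ["a", "b"]
print(first, second)
```

Notice that shrinking the program this far often reveals the bug before the model ever sees it; when it doesn't, the LLM can now reason about every line instead of guessing across thousands.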

  5. One question, one test, one update

    7 min

    The AI debugging loop runs on a tight cadence: one prompt, one runnable test, then update the model with what you saw. Long conversations without intermediate evidence are a smell.

Phase 3: Grounding the Model in Real Evidence

Ground the model with repros, profilers, and git blame

4 drops
  1. A 3 AM 500 error nobody can repro

    7 min

    When you can't reproduce a bug locally, the AI has nothing to ground on — and your job is to make the model carry the production state, not your code.

  2. The endpoint got 4× slower in last week's release

    8 min

    Performance bugs need profiler output, not source code — the model can't guess where the time goes, but it can read a flame graph.
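A minimal sketch of producing profiler output worth pasting, using Python's built-in cProfile; the functions are invented stand-ins for your real code path:

```python
# Capture cProfile stats as text -- this is what you hand the model,
# not the source of handle_request().
import cProfile
import io
import pstats
import time

def parse_row(row):
    return row.split(",")

def slow_lookup(key):
    time.sleep(0.001)  # stands in for the real hot spot
    return key

def handle_request():
    for i in range(50):
        parse_row("a,b,c")
        slow_lookup(i)

profiler = cProfile.Profile()
profiler.enable()
handle_request()
profiler.disable()

buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
print(buf.getvalue())  # slow_lookup dominates; paste this table
```

The stats table names the hot function and its cumulative time, which is exactly the grounding a model needs to explain where a 4× regression went.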

  3. The bug was introduced six months ago and nobody noticed

    8 min

    Some bugs hide for months because they only fire under conditions that just started occurring — and git blame plus a bisect tells the model when the latent fault landed.
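The bisect half of that workflow can be sketched end-to-end in a throwaway repo; the files, commit messages, and "test" command below are invented for the demo:

```shell
# Self-contained demo: plant a regression, then let `git bisect run`
# walk the history to the first bad commit automatically.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
printf 'v1 ok\n'  > app.txt && git add app.txt && git commit -qm "c1: fine"
printf 'v2 ok\n'  > app.txt && git commit -qam "c2: fine"
printf 'v3 BUG\n' > app.txt && git commit -qam "c3: the latent fault lands"
printf 'v4 BUG\n' > app.txt && git commit -qam "c4: unrelated change"
# HEAD is bad, three commits back is known good; the test command must
# exit 0 on a good commit and nonzero on a bad one:
git bisect start HEAD HEAD~3
git bisect run sh -c '! grep -q BUG app.txt'
```

The run ends by naming "c3: the latent fault lands" as the first bad commit; that commit, plus its diff, is what you paste to the model.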

  4. It only fails in CI, never locally

    8 min

    Bugs that fire only in CI are environmental fingerprints — the AI needs both environments described, not just the failing one, to spot the diff that matters.
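"Describe both environments" can be as simple as capturing the same variables on each side and diffing them. A sketch with invented values; in practice you would dump `os.environ` plus tool versions from both machines:

```python
# Hypothetical captures from a laptop and a CI runner -- the diff is
# the environmental fingerprint worth showing the model.
local = {"TZ": "America/New_York", "LANG": "en_US.UTF-8", "CI": ""}
ci    = {"TZ": "UTC",              "LANG": "C",           "CI": "true"}

fingerprint = {
    key: (local.get(key), ci.get(key))
    for key in sorted(local.keys() | ci.keys())
    if local.get(key) != ci.get(key)
}
for key, (here, there) in fingerprint.items():
    print(f"{key}: local={here!r} ci={there!r}")
```

Handing the model this three-line diff beats pasting the failing CI log alone: the log says what broke, the diff says what is different.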

Phase 4: Solving a Planted Bug End-to-End

Solve a planted bug end-to-end with an AI bisect loop

1 drop
  1. Run the full bisect/print/repro loop on a planted bug

    8 min

Pull every technique together on one planted bug: bisect to the commit that broke it, add targeted prints, shrink it to a minimal repro, and verify the fix with the model in the loop.

Frequently asked questions

Why does ChatGPT confidently suggest the wrong fix for my bug?
This is covered in the “Debug Code with LLMs” learning path. Start with daily 5-minute micro-lessons that build from fundamentals to hands-on application.
How do I ask an LLM for debugging help without getting hallucinated answers?
This is covered in the “Debug Code with LLMs” learning path. Start with daily 5-minute micro-lessons that build from fundamentals to hands-on application.
What information should I paste alongside a stack trace?
This is covered in the “Debug Code with LLMs” learning path. Start with daily 5-minute micro-lessons that build from fundamentals to hands-on application.
Should I let the AI write the fix or just explain the bug?
This is covered in the “Debug Code with LLMs” learning path. Start with daily 5-minute micro-lessons that build from fundamentals to hands-on application.
How is AI-augmented debugging different from rubber-duck debugging?
This is covered in the “Debug Code with LLMs” learning path. Start with daily 5-minute micro-lessons that build from fundamentals to hands-on application.