
Understand Image Segmentation with SAM

Separate semantic, instance, and promptable segmentation so you can pick the right tool — then plan a tiny SAM-powered pipeline that crops product photos for an ecommerce catalog before you write a line of code.

Applied · 14 drops · ~2-week path · 5–8 min/day · Technology

Phase 1 · Three Flavors of Segmentation and Where Each Fails

Tell semantic, instance, and panoptic segmentation apart

4 drops
  1. Segmentation isn't one task — it's three with very different bills

    6 min

  2. 'Things' have edges, 'stuff' doesn't — and that breaks half your metrics

    6 min

  3. Every segmentation model fails — the question is how it fails

    6 min

  4. SAM isn't a feature — it's a tokenizer for images

    7 min

Phase 2 · Click-Prompt SAM with Points, Boxes, and Masks

Click-prompt SAM with points, boxes, and masks

5 drops
  1. One click is a prompt — and SAM treats it like one

    6 min

  2. A bounding box is a stronger prompt than ten clicks

    6 min

  3. You can prompt SAM with another mask — and that's how refinement loops work

    6 min

  4. Text-to-mask isn't built into SAM — it's bolted on with CLIP

    7 min

  5. SAM gives you three masks when you ask for one — pick the right one

    7 min
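The last drop above hinges on one mechanic: for a single prompt, a SAM-style predictor returns three candidate masks plus a predicted-quality score for each, and the simplest selection heuristic is to keep the mask the model itself rates highest. A minimal numpy sketch of that heuristic (the mask shapes and scores below are made-up stand-ins, not real SAM output):

```python
import numpy as np

def pick_best_mask(masks: np.ndarray, scores: np.ndarray) -> np.ndarray:
    """Given (3, H, W) candidate masks and (3,) predicted-IoU scores
    for one prompt, keep the mask the model rates highest."""
    return masks[int(np.argmax(scores))]

# Toy stand-ins for SAM's three candidates on a 4x4 image.
masks = np.zeros((3, 4, 4), dtype=bool)
masks[0, :1] = True   # a sliver of the object
masks[1, :2] = True   # half the image
masks[2, :3] = True   # most of the image
scores = np.array([0.41, 0.93, 0.67])  # hypothetical predicted IoUs

best = pick_best_mask(masks, scores)
print(best.sum())  # the half-image candidate wins: 8 pixels
```

In practice you'd often combine the score with a size or position prior (e.g. prefer the candidate covering the prompt point), which is exactly the judgment call that drop unpacks.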

Phase 3 · Heavy Encoder, Light Decoder — and What That Means in Production

Trace SAM's heavy-encoder, light-decoder production tradeoff

4 drops
  1. SAM's encoder is a ViT-H — and that's where the GPU money goes

    7 min

  2. The 4M-parameter decoder is why SAM feels real-time

    6 min

  3. MobileSAM, FastSAM, EfficientSAM — pick by what you can give up

    7 min

  4. If you only need one mask shape, SAM is overkill

    7 min

Phase 4 · Plan a SAM Pipeline for Ecommerce Product Photos

Plan a SAM pipeline that crops product photos

1 drop
  1. Plan a SAM-powered cropper for product photos, end to end

    22 min
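One building block that cropper plan will need is turning a segmentation mask into a padded crop. A numpy-only sketch, assuming SAM has already produced a boolean mask (the mask below is synthetic, and `pad` is an illustrative parameter of this sketch, not anything SAM defines):

```python
import numpy as np

def crop_to_mask(image: np.ndarray, mask: np.ndarray, pad: int = 8) -> np.ndarray:
    """Crop an (H, W, C) image to the bounding box of a boolean (H, W)
    mask, with `pad` pixels of margin clamped to the frame — the core
    step of a SAM-powered product-photo cropper."""
    ys, xs = np.where(mask)
    y0, y1 = max(ys.min() - pad, 0), min(ys.max() + pad + 1, image.shape[0])
    x0, x1 = max(xs.min() - pad, 0), min(xs.max() + pad + 1, image.shape[1])
    return image[y0:y1, x0:x1]

# A 100x100 "photo" whose product occupies a 20x30 region.
image = np.zeros((100, 100, 3), dtype=np.uint8)
mask = np.zeros((100, 100), dtype=bool)
mask[40:60, 30:60] = True

crop = crop_to_mask(image, mask, pad=8)
print(crop.shape)  # (36, 46, 3)
```

The final drop walks through everything around this step: choosing prompts, rejecting bad masks, and batching the heavy encoder pass.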

Frequently asked questions

What's the difference between semantic, instance, and panoptic segmentation?
Semantic segmentation labels every pixel with a class but doesn't separate objects; instance segmentation delineates each individual object ("things") but ignores amorphous background ("stuff"); panoptic segmentation combines both, giving every pixel a class and, where applicable, an instance ID. Phase 1 of this path covers all three and where each one fails.
What is the Segment Anything Model (SAM) and why is it called a foundation model?
SAM is Meta AI's promptable segmentation model, trained on the SA-1B dataset of over a billion masks across 11 million images. It's called a foundation model because it segments objects it was never explicitly trained on, zero-shot, in response to prompts, rather than being tied to a fixed label set. Phase 1 closes with why that makes it more tokenizer than feature.
How do you prompt SAM — points, boxes, or text?
SAM natively accepts point, box, and mask prompts — and a single bounding box is usually a stronger prompt than many clicks. Text prompting isn't built in; it's typically bolted on by pairing SAM with CLIP-style embeddings. Phase 2 of this path works through each prompt type.
Why is SAM's image encoder so much heavier than its mask decoder?
SAM splits the work deliberately: the ViT-H image encoder (hundreds of millions of parameters) computes an image embedding once, while the ~4M-parameter mask decoder reuses that embedding to answer each prompt in milliseconds. Phase 3 traces what that split means for GPU cost in production.
Can SAM run in real time, and what does it take to deploy it in production?
Yes, with caveats: once the image embedding is computed, the lightweight decoder responds to prompts in near real time, but the heavy encoder is the bottleneck. Production deployments either amortize the encoder cost — embed once, prompt many times — or swap in lighter variants like MobileSAM, FastSAM, or EfficientSAM. Phases 3 and 4 of this path cover both strategies.
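The real-time question above mostly comes down to amortization: the encoder cost is paid once per image, the decoder cost once per prompt. A back-of-envelope sketch, with hypothetical latencies (the 450 ms / 10 ms figures are illustrative, not benchmarks):

```python
def ms_per_mask(encode_ms: float, decode_ms: float, prompts: int) -> float:
    """Average latency per mask when one image embedding (encode_ms,
    paid once) serves `prompts` separate decoder calls."""
    return encode_ms / prompts + decode_ms

# Hypothetical numbers: heavy encoder ~450 ms, light decoder ~10 ms.
print(ms_per_mask(450, 10, prompts=1))   # 460.0 — one-shot use pays full price
print(ms_per_mask(450, 10, prompts=50))  # 19.0 — amortized over 50 prompts
```

This is why interactive annotation tools feel real-time with SAM while single-mask batch jobs often don't — and why, as Phase 3 argues, a fixed-shape task may not need SAM at all.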