Use AI Gateways: OpenRouter, Portkey, Helicone
Stop choosing a gateway because a blog post said so. By the end you can pick OpenRouter, Portkey, Helicone, or self-host for a real multi-region app and defend it on failover, cost, and observability.
Phase 1: What an AI Gateway Actually Solves
See the three problems a gateway actually solves
The three problems a gateway actually solves
6 min
An AI gateway isn't a router; it's three products in one trench coat: failover, cost tracking, and observability. Most teams discover they needed the second two only after an outage forced them to confront the first.
When you don't need a gateway
5 min
If you have one provider, low volume, and no compliance asks, a gateway is overhead, not insurance. The right call is to use the provider SDK directly and revisit when one of the three problems actually appears.
How a gateway sits in the request path
7 min
A gateway is just an HTTP proxy with an LLM-shaped API. Your app calls the gateway like it called OpenAI; the gateway calls the actual provider and returns the response. Everything else (retries, caching, logging) is what it does in between.
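The pass-through idea above can be sketched as a pure function: the proxy receives an OpenAI-shaped body and forwards the same body upstream, doing its extra work in between. A minimal sketch only; the upstream URL and the logging line are illustrative, not any vendor's real internals.

```python
import json

UPSTREAM = "https://api.openai.com/v1/chat/completions"  # illustrative upstream

def gateway_handle(request_body: dict) -> tuple[str, bytes]:
    """Decide where to forward an OpenAI-shaped request, unchanged.

    A real gateway would add retries, caching, and logging here,
    between receiving the request and forwarding it upstream.
    """
    # e.g. record the model for per-request cost and observability
    print(f"routing request for model={request_body.get('model')}")
    return UPSTREAM, json.dumps(request_body).encode()

url, body = gateway_handle(
    {"model": "gpt-4o", "messages": [{"role": "user", "content": "hi"}]}
)
```

The key property is that the body your app sent is the body the provider receives; everything the gateway adds happens around that forwarding step.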
The failover-cost-observability triangle
6 min
Every gateway optimizes for one corner of the triangle: OpenRouter for failover, Portkey for cost and governance, Helicone for observability. Pick by which corner is most painful, not which vendor pitched best.
Phase 2: Wiring OpenRouter Into One Endpoint
Wire OpenRouter into one endpoint with a fallback
OpenRouter in 10 lines: drop-in OpenAI replacement
8 min
OpenRouter speaks the OpenAI Chat Completions API. Swapping in OpenRouter is literally a base URL and an API key change: your existing SDK code keeps working unchanged.
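To show the swap without pulling in an SDK, here is the raw HTTP shape of that request using only the standard library; the two constants at the top are the only things that differ from calling OpenAI directly. (With the official OpenAI SDK the same swap is just the `base_url` and `api_key` arguments on the client.) A sketch, not production code: no error handling, and the request is built but not sent so it needs no live key.

```python
import json
import os
import urllib.request

# The only two things that change when you route through OpenRouter:
BASE_URL = "https://openrouter.ai/api/v1"           # was https://api.openai.com/v1
API_KEY = os.environ.get("OPENROUTER_API_KEY", "")  # was OPENAI_API_KEY

def chat_request(model: str, messages: list[dict]) -> urllib.request.Request:
    """Build a Chat Completions request; OpenRouter speaks the same API."""
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps({"model": model, "messages": messages}).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = chat_request("openai/gpt-4o", [{"role": "user", "content": "hello"}])
# urllib.request.urlopen(req) would send it; omitted so the sketch runs keyless
```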
Configure your first fallback chain
8 min
OpenRouter lets you declare a fallback chain in the request itself: `models: ['anthropic/claude-3.5-sonnet', 'openai/gpt-4o']`. If the first fails or rate-limits, it tries the second automatically, with no app-side retry logic.
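Concretely, the fallback chain from the blurb above is just an extra field in the request body; a minimal sketch of that payload:

```python
import json

# OpenRouter fallback chain: if the first model errors or rate-limits,
# the gateway retries the same request against the next model in the list.
payload = {
    "models": ["anthropic/claude-3.5-sonnet", "openai/gpt-4o"],
    "messages": [{"role": "user", "content": "Summarize this ticket."}],
}
body = json.dumps(payload)
```

Note that `models` is an OpenRouter extension, not a standard Chat Completions field; with the official OpenAI SDK such extra fields typically travel via the `extra_body` parameter. Check OpenRouter's docs for the current mechanism.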
Force a failure and watch failover behave
8 min
The only way to trust failover is to break the primary on purpose. Set the primary to an invalid model name or an expired key, keep the fallback valid, and confirm the request still succeeds: that's the gateway earning its keep.
See per-request cost in the gateway dashboard
7 min
OpenRouter's dashboard surfaces per-request cost in dollars (not tokens) within seconds. The instant you route through it, your finance question 'what does this feature cost?' becomes answerable per endpoint, per user, per day.
Wire gateway env vars into your real app
9 min
The migration from scratch file to real app is one environment variable per provider. Keep the original OpenAI key as a fallback path (or remove it entirely): the gateway becomes the single point of LLM access for your app.
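One way that wiring can look, sketched gateway-first with the direct key as the optional escape hatch. The env var names here are illustrative choices, not a prescribed convention:

```python
import os

def llm_config() -> dict:
    """Resolve the app's single LLM entry point, gateway-first.

    If the gateway key is set, all LLM traffic goes through the gateway;
    the direct OpenAI key is kept (or removed) as a last-resort path.
    """
    gateway_key = os.environ.get("OPENROUTER_API_KEY")
    if gateway_key:
        return {
            "base_url": os.environ.get("GATEWAY_BASE_URL", "https://openrouter.ai/api/v1"),
            "api_key": gateway_key,
        }
    direct_key = os.environ.get("OPENAI_API_KEY")  # may be absent if removed entirely
    if direct_key:
        return {"base_url": "https://api.openai.com/v1", "api_key": direct_key}
    raise RuntimeError("No LLM credentials configured")
```

Because resolution happens in one function, "which endpoint does the app talk to?" has exactly one answer per process, which is the point of making the gateway the single point of access.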
Phase 3: Choosing Among Real Gateway Options
Compare OpenRouter, Portkey, Helicone, and self-host
OpenRouter: when 'simplest path to multi-provider' wins
8 min
OpenRouter is the right pick when your top problem is 'I want failover across providers without managing multiple SDKs.' It's the simplest gateway by a wide margin, and that simplicity is the feature.
Portkey: when governance and policy enforcement matter
9 min
Portkey is the gateway for teams whose top problem is 'we have 8 engineers calling LLMs, we need budgets, role-based access, prompt registries, and audit logs, not just routing.' It's a control plane, not a router.
Helicone: when observability is the headline problem
8 min
Helicone treats every LLM call as a data point: logged, indexed, replayable, diffable. It's the gateway for teams whose top problem is 'production is breaking and we can't tell which prompts caused what outputs.'
Self-hosted gateways: LiteLLM and when DIY pays off
9 min
Self-hosted gateways like LiteLLM or Kong AI Gateway (Cloudflare AI Gateway is the managed sibling) give you full control of the proxy, no third-party vendor in the request path, and zero per-call SaaS cost, at the price of operating another service.
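To give a feel for what you operate when you self-host, here is a rough sketch of the shape of a LiteLLM proxy configuration, rendered as a Python dict (the real file is YAML). The key names are from memory of LiteLLM's docs and should be treated as illustrative; verify against the current documentation before relying on them.

```python
# Rough sketch of a LiteLLM proxy config (normally a YAML file).
# Two deployments sharing one public model_name: requests for
# "default-chat" can be served by either provider.
litellm_config = {
    "model_list": [
        {
            "model_name": "default-chat",  # the name your app requests
            "litellm_params": {
                "model": "openai/gpt-4o",
                "api_key": "os.environ/OPENAI_API_KEY",
            },
        },
        {
            "model_name": "default-chat",  # same public name, second provider
            "litellm_params": {
                "model": "anthropic/claude-3-5-sonnet-20241022",
                "api_key": "os.environ/ANTHROPIC_API_KEY",
            },
        },
    ],
}
```

The trade is visible in the config itself: you get to decide exactly how names map to providers and keys, and in exchange you own deploying, upgrading, and monitoring the proxy process.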
Phase 4: Defending a Gateway Choice at Scale
Defend a gateway choice for a 5-provider multi-region app
Pick a gateway for a 5-provider multi-region app
22 min
A gateway decision at this scale is a one-page memo: which corner of the triangle dominates, which gateway optimizes for it, what the alternatives are, and what the failure mode is if you're wrong. Writing the memo IS the work.
Frequently asked questions
- What is an AI gateway and how is it different from an LLM router?
- An AI gateway is an HTTP proxy with an LLM-shaped API that bundles three products in one: failover routing, cost tracking, and observability. A plain LLM router only does the first; a gateway earns its keep with the other two.
- When do I actually need a gateway versus just calling OpenAI directly?
- If you have one provider, low volume, and no compliance asks, you don't: call the provider SDK directly. Add a gateway when failover, cost tracking, or observability becomes a real problem rather than a hypothetical one.
- OpenRouter vs Portkey vs Helicone: which one should I pick?
- Pick by the most painful corner of the failover-cost-observability triangle: OpenRouter for the simplest multi-provider failover, Portkey for governance and cost controls, Helicone for observability. Self-host (e.g. LiteLLM) when you can't have a third-party vendor in the request path.
- Does an AI gateway add latency to every LLM call?
- Yes, one extra network hop: the gateway is a proxy between your app and the provider. For most apps those extra milliseconds are small next to model inference time, and they buy you retries, caching, and logging.
- Can I self-host an AI gateway, and is it worth the operational cost?
- Yes: LiteLLM and similar proxies give you full control, no third-party vendor in the request path, and zero per-call SaaS cost. The DIY route pays off when compliance or traffic volume justifies operating another service yourself.
Related paths
Python Decorators Introduction
Build one mental model for Python decorators that covers closures, argument passing, functools.wraps, and stacking, then ship a working caching or logging decorator from scratch in under 30 lines.
Rust Lifetimes Explained
Stop reading `'a` as line noise and start reading it as scope arithmetic, one failing snippet at a time, until you can thread lifetimes through a small parser or iterator adapter without fighting the borrow checker.
Kubernetes Core Concepts
Stop drowning in 30+ resource types. Build the mental model one primitive at a time -- pods, deployments, services, ingress, config -- then deploy a real app with rolling updates and health checks.
Big O Intuition
Stop treating Big O as math you memorized for an interview; build the intuition to spot O(n²) disasters, pick the right data structure without thinking, and rewrite a slow function from O(n²) to O(n) in under five minutes.