How do I move off LLM Gateway without getting stuck?

Choosing an LLM gateway/router is consequential: LiteLLM (open-source, self-hosted), Portkey (managed, per-log pricing), and OpenRouter (credit-based marketplace) have fundamentally different pricing structures, routing capabilities, observability depth, and lock-in profiles. The wrong choice locks you into a billing model that scales poorly, or a routing layer that becomes a single point of failure.

Choose LiteLLM if you expect scale or care about cost and control; choose Portkey only when managed compliance and audit UX matter more than avoiding gateway fees.

Candidates

LiteLLM (open-source gateway)

Open-source Python SDK and proxy server (Apache 2.0) that translates requests to 100+ LLM providers using OpenAI-compatible input/output format. Self-hosted is free; enterprise adds SSO, RBAC, and team-level budgets at usage-based pricing (contact sales). As of 2026-03-13, the latest stable release is v1.81.14, with the proxy load-tested at 1,000 requests per second. SOC-2 Type 2 and ISO 27001 certified for enterprise deployments.

When to choose

Best for enterprise + high-scale or cost-sensitive + small-team environments that want full control over the routing layer, zero per-request platform fees, and the ability to self-host on existing infrastructure. Choose this when you need application-level load balancing, retry/fallback logic across multiple deployments (e.g., Azure/OpenAI), per-user rate limiting, and multi-tenant spend tracking — all without paying a gateway vendor.
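Because the proxy speaks the OpenAI wire format, migrating on or off it is typically a one-line base-URL change in client code. A stdlib-only sketch of the request shape; the port, path, and virtual key here are placeholder assumptions for a local deployment:

```python
import json

def chat_request(model: str, prompt: str,
                 proxy_url: str = "http://localhost:4000",  # assumed local proxy
                 virtual_key: str = "sk-proxy-demo"):        # placeholder key
    """Build an OpenAI-compatible chat request aimed at a LiteLLM proxy.

    The same payload works against api.openai.com directly, which is why
    existing client code needs no changes beyond the base URL and key.
    """
    url = f"{proxy_url}/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {virtual_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,  # resolved against the proxy's model list, not one provider
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return url, headers, body

url, headers, body = chat_request("gpt-4o", "ping")
print(url)
```

The symmetry cuts both ways: moving off the proxy later is the same one-line change back.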

Tradeoffs

Zero software cost for self-hosted. OpenAI-compatible format means existing client code works without changes. Supports routing, fallbacks, load balancing, spend tracking, rate limiting, and guardrails. Observability callbacks for Lunary, MLflow, Langfuse, Helicone, and custom logging. The tradeoff is operational responsibility: you manage the proxy infrastructure, database (PostgreSQL required for enterprise), uptime, and upgrades. Enterprise pricing is usage-based but not publicly listed — requires sales contact.
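The retry/fallback behavior is driven by the proxy's config file: multiple deployments can share one logical model name, and the router fails over between them. A sketch under assumed deployment names and environment variables, not a complete production config:

```yaml
model_list:
  - model_name: gpt-4o                      # one logical name, two deployments
    litellm_params:
      model: azure/my-gpt4o-deployment      # assumed Azure deployment name
      api_base: https://example.openai.azure.com/
      api_key: os.environ/AZURE_API_KEY
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY

router_settings:
  num_retries: 2                 # retry a deployment before failing over
  routing_strategy: simple-shuffle
```

Clients request "gpt-4o" and never learn which deployment served them, which is what makes cross-cloud failover transparent.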

Cautions

Self-hosted means you own the proxy's availability — a proxy outage blocks all LLM traffic. Budget $100-$400/month for production infrastructure (managed database with backups, replication, HA). Enterprise SSO is free for up to 5 users only; beyond that requires enterprise license. Releases are frequent (every ~17 hours for nightly) — pin to stable releases and test before upgrading. The proxy adds a hop (1-5ms latency overhead).

Portkey (managed AI gateway)

Managed AI gateway supporting 250+ models across providers with built-in observability, cost tracking, guardrails, and semantic caching. Dev plan is free (10,000 logs/month, 30-day retention). Pro plan is custom-priced with $9 per 100K additional logs up to 3M/month. Enterprise starts at an estimated $2,000-$10,000+/month for 10M+ logs with VPC deployment. As of 2026-03-17, the platform processes 400B+ tokens for 200+ enterprises daily.

When to choose

Best for enterprise + compliance or small-team + low-ops environments that want managed observability, per-request cost visibility, and a policy UI without building infrastructure. Choose this when audit logs, budget limits, guardrails, and compliance certifications matter more than minimizing platform fees. Good when starting with few requests and scaling incrementally.
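Portkey is also OpenAI-compatible at the wire level, with routing controlled by Portkey-specific headers; a virtual key maps to a provider credential stored in Portkey, so application code never holds raw provider keys. A minimal sketch (header names follow Portkey's documented `x-portkey-*` pattern, but treat the exact names as assumptions to verify against current docs):

```python
def portkey_headers(portkey_api_key: str, virtual_key: str) -> dict:
    """Headers for routing an OpenAI-compatible request through Portkey.

    The virtual key selects an upstream provider credential held by
    Portkey, so rotating provider keys needs no application deploy.
    """
    return {
        "Content-Type": "application/json",
        "x-portkey-api-key": portkey_api_key,    # authenticates to Portkey
        "x-portkey-virtual-key": virtual_key,    # selects the provider credential
    }

h = portkey_headers("pk-demo", "vk-openai-prod")
print(sorted(h))
```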

Tradeoffs

Zero infrastructure to manage — Portkey handles the gateway. Per-request cost visibility across all providers. Built-in semantic caching (Pro+) reduces redundant LLM calls. Guardrail violations logged alongside cost data. The tradeoff is per-log pricing that scales with volume: 500K requests cost ~$36/month on top of base, 1M requests ~$81, 2M ~$171. The free tier covers only ~330 requests/day. No semantic caching on the Dev tier.

Cautions

Per-log pricing means costs grow linearly with request volume — at high scale, self-hosted LiteLLM may be significantly cheaper. Pro tier caps at 3M logs/month; exceeding means logs stop being recorded, not that requests fail, but you lose observability. Log retention is only 30 days on Pro — enterprise required for 90+ days. Evaluate the PDP (Policy Decision Point) deployment model: sidecar mode minimizes latency but requires container infrastructure; cloud-only PDP adds network latency to every authorization check.
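The per-log figures above turn into a quick break-even estimate against self-hosted LiteLLM infrastructure. A sketch assuming the $9-per-100K overage quoted earlier, billed in whole 100K blocks after an included first 100K (block billing is inferred from the ~$36/500K and ~$81/1M figures; base plan fees are excluded):

```python
import math

def portkey_overage(logs: int, included: int = 100_000,
                    per_block: float = 9.0, block: int = 100_000) -> float:
    """Estimated Pro-tier log overage in USD/month (assumed block billing)."""
    extra = max(0, logs - included)
    return math.ceil(extra / block) * per_block

def breakeven_logs(monthly_infra_usd: float) -> int:
    """Smallest monthly log volume at which the overage alone exceeds a
    given self-hosted infrastructure budget (hypothetical comparison)."""
    logs = 100_000
    while portkey_overage(logs) < monthly_infra_usd:
        logs += 100_000
    return logs

print(portkey_overage(500_000))  # reproduces the ~$36 figure above
print(breakeven_logs(100))       # volume where overage tops a $100/mo infra budget
```

Against the $100–$400/month infrastructure budget cited for LiteLLM, the overage alone crosses $100 at roughly 1.3M logs/month, well inside the 3M Pro cap; the comparison ignores base fees and your own ops time, so treat it as a starting point, not a verdict.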

OpenRouter (credit-based model marketplace)

Credit-based LLM API marketplace providing access to 400+ models from 60+ providers through a single API. Zero markup on model inference pricing — providers are paid at their listed rates. Platform charges a 5.5% fee ($0.80 minimum) on credit purchases via Stripe. BYOK (Bring Your Own Key) mode: first threshold per month free, then 0.5% fee on subsequent usage. Credits denominated in USD, expire after one year of non-use.

When to choose

Best for small-team + cost-sensitive or low-ops environments that want instant access to the widest model catalog without provider-by-provider API key management. Choose this when model variety and rapid experimentation matter, when you want automatic fallback across providers without configuration, and when the 5.5% credit fee is acceptable versus managing multiple provider relationships.

Tradeoffs

Widest model catalog (400+ models, 60+ providers) through one API key. Automatic fallback: if a provider returns an error, OpenRouter falls back to the next provider transparently. Dynamic routing variants: :nitro (sort by throughput), :floor (sort by price), :exacto (sort by quality). Zero inference markup — you pay provider rates. The tradeoff is the 5.5% credit purchase fee, limited observability compared to Portkey/LiteLLM, and less granular routing control (no custom retry policies, no per-user budgets, no guardrails).

Cautions

The 5.5% credit fee applies to all Stripe purchases regardless of volume — no volume discounts. Credits expire after one year of non-use. BYOK mode reduces fees to 0.5% but means you manage API keys yourself (defeating a key benefit). Refunds available within 24 hours for unused credits but platform fees are non-refundable; crypto purchases are never refundable. OpenRouter is a passthrough — you depend on their availability for all LLM traffic with no self-hosted option. Rate limits on free models without credits are restrictive.
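The $0.80 minimum makes small top-ups proportionally expensive, which is easy to sanity-check numerically (figures taken from this section; verify current pricing before relying on them):

```python
def purchase_fee(credit_usd: float) -> float:
    """Stripe credit-purchase fee: 5.5% with a $0.80 minimum (per above)."""
    return max(0.055 * credit_usd, 0.80)

def effective_rate(credit_usd: float) -> float:
    """Fee as a fraction of the credit purchased."""
    return purchase_fee(credit_usd) / credit_usd

print(round(effective_rate(10) * 100, 1))   # a $10 top-up pays the $0.80 floor: 8.0%
print(round(effective_rate(100) * 100, 1))  # at $100 the 5.5% rate applies: 5.5%
```

Buying credits in larger, less frequent batches keeps the effective rate at the 5.5% floor — balanced against the one-year expiry on unused credits.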

Facts updated: 2026-03-17
Published: 2026-04-03

Try with your AI agent

$ npm install -g pocketlantern
$ pocketlantern init
# Restart Claude Code, Cursor, or your MCP client, then ask:
# "How do I move off LLM Gateway without getting stuck?"