How do I move off LLM Gateway without getting stuck?

Choosing an LLM gateway/router is consequential: LiteLLM (open-source, self-hosted), Portkey (managed, per-log pricing), and OpenRouter (credit-based marketplace) have fundamentally different pricing structures, routing capabilities, observability depth, and lock-in profiles. The wrong choice locks you into a billing model that scales poorly, or a routing layer that becomes a single point of failure.

Choose LiteLLM if you expect scale or care about cost and control; choose Portkey only when managed compliance and audit UX matter more than avoiding gateway fees.

Candidates

LiteLLM (open-source gateway)

Open-source Python SDK and proxy server (Apache 2.0) that translates requests to 100+ LLM providers using OpenAI-compatible input/output format. Self-hosted is free; enterprise adds SSO, RBAC, and team-level budgets at usage-based pricing (contact sales). As of 2026-03-13, the latest stable release is v1.81.14, with the proxy load-tested at 1,000 requests per second. SOC-2 Type 2 and ISO 27001 certified for enterprise deployments.

When to choose

Best for enterprise + high-scale or cost-sensitive + small-team environments that want full control over the routing layer, zero per-request platform fees, and the ability to self-host on existing infrastructure. Choose this when you need application-level load balancing, retry/fallback logic across multiple deployments (e.g., Azure/OpenAI), per-user rate limiting, and multi-tenant spend tracking — all without paying a gateway vendor.
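Because the proxy speaks the OpenAI wire format, migrating on or off it is typically a one-line base-URL change in client code. A stdlib-only sketch of the request shape; the port, path, and virtual key here are placeholder assumptions for a local deployment:

```python
import json

def chat_request(model: str, prompt: str,
                 proxy_url: str = "http://localhost:4000",  # assumed local proxy
                 virtual_key: str = "sk-proxy-demo"):        # placeholder key
    """Build an OpenAI-compatible chat request aimed at a LiteLLM proxy.

    The same payload works against api.openai.com directly, which is why
    existing client code needs no changes beyond the base URL and key.
    """
    url = f"{proxy_url}/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {virtual_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,  # resolved against the proxy's model list, not one provider
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return url, headers, body

url, headers, body = chat_request("gpt-4o", "ping")
print(url)
```

The symmetry cuts both ways: moving off the proxy later is the same one-line change back.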

Tradeoffs

Zero software cost for self-hosted. OpenAI-compatible format means existing client code works without changes. Supports routing, fallbacks, load balancing, spend tracking, rate limiting, and guardrails. Observability callbacks for Lunary, MLflow, Langfuse, Helicone, and custom logging. The tradeoff is operational responsibility: you manage the proxy infrastructure, database (PostgreSQL required for enterprise), uptime, and upgrades. Enterprise pricing is usage-based but not publicly listed — requires sales contact.
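The retry/fallback behavior is driven by the proxy's config file: multiple deployments can share one logical model name, and the router fails over between them. A sketch under assumed deployment names and environment variables, not a complete production config:

```yaml
model_list:
  - model_name: gpt-4o                      # one logical name, two deployments
    litellm_params:
      model: azure/my-gpt4o-deployment      # assumed Azure deployment name
      api_base: https://example.openai.azure.com/
      api_key: os.environ/AZURE_API_KEY
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY

router_settings:
  num_retries: 2                 # retry a deployment before failing over
  routing_strategy: simple-shuffle
```

Clients request "gpt-4o" and never learn which deployment served them, which is what makes cross-cloud failover transparent.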

Cautions

Self-hosted means you own the proxy's availability — a proxy outage blocks all LLM traffic. Budget $100-$400/month for production infrastructure (managed database with backups, replication, HA). Enterprise SSO is free for up to 5 users only; beyond that requires enterprise license. Releases are frequent (every ~17 hours for nightly) — pin to stable releases and test before upgrading. The proxy adds a hop (1-5ms latency overhead).

Portkey (managed AI gateway)

Managed AI gateway supporting 250+ models across providers with built-in observability, cost tracking, guardrails, and semantic caching. Dev plan is free (10,000 logs/month, 30-day retention). Pro plan is custom-priced with $9 per 100K additional logs up to 3M/month. Enterprise starts at an estimated $2,000-$10,000+/month for 10M+ logs with VPC deployment. As of 2026-03-17, the platform processes 400B+ tokens for 200+ enterprises daily.

When to choose

Best for enterprise + compliance or small-team + low-ops environments that want managed observability, per-request cost visibility, and a policy UI without building infrastructure. Choose this when audit logs, budget limits, guardrails, and compliance certifications matter more than minimizing platform fees. Good when starting with few requests and scaling incrementally.
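Portkey is also OpenAI-compatible at the wire level, with routing controlled by Portkey-specific headers; a virtual key maps to a provider credential stored in Portkey, so application code never holds raw provider keys. A minimal sketch (header names follow Portkey's documented `x-portkey-*` pattern, but treat the exact names as assumptions to verify against current docs):

```python
def portkey_headers(portkey_api_key: str, virtual_key: str) -> dict:
    """Headers for routing an OpenAI-compatible request through Portkey.

    The virtual key selects an upstream provider credential held by
    Portkey, so rotating provider keys needs no application deploy.
    """
    return {
        "Content-Type": "application/json",
        "x-portkey-api-key": portkey_api_key,    # authenticates to Portkey
        "x-portkey-virtual-key": virtual_key,    # selects the provider credential
    }

h = portkey_headers("pk-demo", "vk-openai-prod")
print(sorted(h))
```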

Tradeoffs

Zero infrastructure to manage — Portkey handles the gateway. Per-request cost visibility across all providers. Built-in semantic caching (Pro+) reduces redundant LLM calls. Guardrail violations logged alongside cost data. The tradeoff is per-log pricing that scales with volume: 500K requests cost ~$36/month on top of base, 1M requests ~$81, 2M ~$171. The free tier covers only ~330 requests/day. No semantic caching on the Dev tier.

Cautions

Per-log pricing means costs grow linearly with request volume — at high scale, self-hosted LiteLLM may be significantly cheaper. Pro tier caps at 3M logs/month; exceeding means logs stop being recorded, not that requests fail, but you lose observability. Log retention is only 30 days on Pro — enterprise required for 90+ days. Evaluate the PDP (Policy Decision Point) deployment model: sidecar mode minimizes latency but requires container infrastructure; cloud-only PDP adds network latency to every authorization check.
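The per-log figures above turn into a quick break-even estimate against self-hosted LiteLLM infrastructure. A sketch assuming the $9-per-100K overage quoted earlier, billed in whole 100K blocks after an included first 100K (block billing is inferred from the ~$36/500K and ~$81/1M figures; base plan fees are excluded):

```python
import math

def portkey_overage(logs: int, included: int = 100_000,
                    per_block: float = 9.0, block: int = 100_000) -> float:
    """Estimated Pro-tier log overage in USD/month (assumed block billing)."""
    extra = max(0, logs - included)
    return math.ceil(extra / block) * per_block

def breakeven_logs(monthly_infra_usd: float) -> int:
    """Smallest monthly log volume at which the overage alone exceeds a
    given self-hosted infrastructure budget (hypothetical comparison)."""
    logs = 100_000
    while portkey_overage(logs) < monthly_infra_usd:
        logs += 100_000
    return logs

print(portkey_overage(500_000))  # reproduces the ~$36 figure above
print(breakeven_logs(100))       # volume where overage tops a $100/mo infra budget
```

Against the $100–$400/month infrastructure budget cited for LiteLLM, the overage alone crosses $100 at roughly 1.3M logs/month, well inside the 3M Pro cap; the comparison ignores base fees and your own ops time, so treat it as a starting point, not a verdict.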

OpenRouter (credit-based model marketplace)

Credit-based LLM API marketplace providing access to 400+ models from 60+ providers through a single API. Zero markup on model inference pricing — providers are paid at their listed rates. Platform charges a 5.5% fee ($0.80 minimum) on credit purchases via Stripe. BYOK (Bring Your Own Key) mode: first threshold per month free, then 0.5% fee on subsequent usage. Credits denominated in USD, expire after one year of non-use.

When to choose

Best for small-team + cost-sensitive or low-ops environments that want instant access to the widest model catalog without provider-by-provider API key management. Choose this when model variety and rapid experimentation matter, when you want automatic fallback across providers without configuration, and when the 5.5% credit fee is acceptable versus managing multiple provider relationships.

Tradeoffs

Widest model catalog (400+ models, 60+ providers) through one API key. Automatic fallback: if a provider returns an error, OpenRouter falls back to the next provider transparently. Dynamic routing variants: :nitro (sort by throughput), :floor (sort by price), :exacto (sort by quality). Zero inference markup — you pay provider rates. The tradeoff is the 5.5% credit purchase fee, limited observability compared to Portkey/LiteLLM, and less granular routing control (no custom retry policies, no per-user budgets, no guardrails).

Cautions

The 5.5% credit fee applies to all Stripe purchases regardless of volume — no volume discounts. Credits expire after one year of non-use. BYOK mode reduces fees to 0.5% but means you manage API keys yourself (defeating a key benefit). Refunds available within 24 hours for unused credits but platform fees are non-refundable; crypto purchases are never refundable. OpenRouter is a passthrough — you depend on their availability for all LLM traffic with no self-hosted option. Rate limits on free models without credits are restrictive.
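The $0.80 minimum makes small top-ups proportionally expensive, which is easy to sanity-check numerically (figures taken from this section; verify current pricing before relying on them):

```python
def purchase_fee(credit_usd: float) -> float:
    """Stripe credit-purchase fee: 5.5% with a $0.80 minimum (per above)."""
    return max(0.055 * credit_usd, 0.80)

def effective_rate(credit_usd: float) -> float:
    """Fee as a fraction of the credit purchased."""
    return purchase_fee(credit_usd) / credit_usd

print(round(effective_rate(10) * 100, 1))   # a $10 top-up pays the $0.80 floor: 8.0%
print(round(effective_rate(100) * 100, 1))  # at $100 the 5.5% rate applies: 5.5%
```

Buying credits in larger, less frequent batches keeps the effective rate at the 5.5% floor — balanced against the one-year expiry on unused credits.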

Facts updated: 2026-03-17
Published: 2026-04-03

Try with your AI agent

$ npm install -g pocketlantern
$ pocketlantern init
# Restart Claude Code, Cursor, or your MCP client, then ask:
# "How do I move off LLM Gateway without getting stuck?"