Rolling, blue-green, or canary — which zero-downtime strategy?
Deploying application updates without interrupting live traffic
Blockers
- Lock-in via framework/kubernetes
- requires_version: capability/rolling-deployment → capability/backward-compatible-api-schema
- requires_version: capability/rolling-deployment → capability/backward-compatible-db-migrations
- Lock-in via capability/readiness-probes
- requires_version: capability/blue-green-deployment → capability/double-infrastructure
- requires_version: capability/blue-green-deployment → capability/database-handles-both-versions
- Lock-in via capability/connection-draining
- requires_version: capability/canary-deployment → capability/traffic-splitting-infrastructure
- requires_version: capability/canary-deployment → capability/observability
- Lock-in via capability/rollback-criteria
- Lock-in via capability/representative-traffic
Who this is for
- high-scale
- enterprise
- real-time
Candidates
Rolling Deployment
Replace instances one by one. New version gradually takes over while old instances still serve traffic.
When to choose
Standard web applications; rolling update is also the default strategy for Kubernetes Deployments. Best fit for low-ops + small-team or microservices + enterprise where simplicity and native orchestrator support are priorities.
Tradeoffs
Simple and well supported. During rollout, old and new versions serve traffic side by side, so the API and schema must be backward compatible.
Cautions
Database migrations must be backward compatible. Use readiness probes to avoid routing to unready instances.
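The one-by-one replacement described above can be sketched as a small simulation. This is an illustrative model, not how any particular orchestrator implements it: the `Instance` class, the `rolling_update` function, and the `is_ready` callback (standing in for a readiness probe) are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    version: str
    ready: bool = True


def rolling_update(instances, new_version, is_ready):
    """Replace instances one at a time; halt if a replacement is not ready.

    `is_ready` is a hypothetical readiness probe: a callable that takes an
    Instance and returns True once it can serve traffic.
    Returns the versions serving traffic after each successful step.
    """
    history = []
    for i, old in enumerate(instances):
        candidate = Instance(old.name, new_version, ready=False)
        candidate.ready = is_ready(candidate)
        if not candidate.ready:
            # Leave the old instance in place and stop the rollout.
            break
        instances[i] = candidate
        history.append([inst.version for inst in instances if inst.ready])
    return history


fleet = [Instance(f"web-{n}", "v1") for n in range(3)]
steps = rolling_update(fleet, "v2", is_ready=lambda inst: True)
for versions in steps:
    print(versions)
```

Every intermediate step except the last shows both versions serving traffic at once, which is why the API, schema, and migrations have to stay backward compatible for the duration of the rollout.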
Blue-Green Deployment
Run two identical environments (blue=current, green=new). Switch traffic all at once after green is verified.
When to choose
When you need instant rollback and can afford double infrastructure during deploy. Best fit for enterprise + compliance or real-time + high-scale where guaranteed rollback and zero-gap cutover are required.
Tradeoffs
Clean cutover, with instant rollback by switching traffic back to blue. Requires 2x infrastructure during deployment. The database must be compatible with both versions.
Cautions
Long-running requests on blue may be dropped at switchover. Use connection draining so in-flight requests can finish before blue is taken out of service.
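To make the switchover and draining behaviour concrete, here is a minimal in-memory sketch. The `BlueGreenRouter` class and its methods are hypothetical; a real setup would do this in a load balancer or service mesh rather than in application code.

```python
class BlueGreenRouter:
    """Toy router: tracks the active environment and in-flight requests."""

    def __init__(self):
        self.active = "blue"
        self.in_flight = {"blue": 0, "green": 0}

    def start_request(self):
        env = self.active
        self.in_flight[env] += 1
        return env

    def finish_request(self, env):
        self.in_flight[env] -= 1

    def switch(self):
        """Send new requests to the other environment; return the old one."""
        old = self.active
        self.active = "green" if old == "blue" else "blue"
        return old

    def is_drained(self, env):
        # Safe to shut an environment down only when nothing is in flight.
        return self.in_flight[env] == 0


router = BlueGreenRouter()
long_request = router.start_request()   # lands on blue
old = router.switch()                   # new traffic now goes to green
new_request = router.start_request()
print(long_request, new_request, router.is_drained(old))
router.finish_request(long_request)
print(router.is_drained(old))
```

The point of the sketch is the ordering: traffic is switched first, and blue is retired only after `is_drained` reports that its in-flight requests have completed. Calling `switch()` again is the rollback path.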
Canary Deployment
Route a small percentage of traffic to the new version first. Gradually increase if metrics are healthy.
When to choose
High-traffic services where you want to detect issues before full rollout. Best fit for high-scale + microservices or enterprise + real-time where progressive rollout reduces blast radius.
Tradeoffs
Minimizes blast radius of bad deploys. Requires traffic splitting infrastructure and good observability to detect issues.
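One common way to split traffic is to hash a stable identifier into a bucket and send the lowest buckets to the canary. The sketch below assumes requests carry some stable ID (the `user-N` strings are made up) and the `route` function is a hypothetical helper, not part of any specific proxy or mesh.

```python
import hashlib


def route(request_id: str, canary_percent: int) -> str:
    """Map a request (or user) ID to 'canary' or 'stable' deterministically."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100   # bucket in 0..99
    return "canary" if bucket < canary_percent else "stable"


ids = [f"user-{n}" for n in range(10_000)]
share = sum(route(i, 5) == "canary" for i in ids) / len(ids)
print(f"canary share at 5%: {share:.3f}")
```

Because the bucket depends only on the ID, a given user sees the same version on every request, and anyone routed to the canary at 5% stays on it as the percentage is raised.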
Cautions
Define clear rollback criteria (error rate, latency) before the rollout starts. Ensure the canary receives representative production traffic, not only internal requests.
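Rollback criteria are easiest to enforce when they are written down as an explicit check. The following is a sketch only: the `Metrics` fields, the `canary_decision` function, and every threshold (minimum request count, error-rate delta, latency ratio) are illustrative assumptions that would need tuning for a real service.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    requests: int
    errors: int
    p99_latency_ms: float


def canary_decision(canary, baseline, min_requests=500,
                    max_error_rate_delta=0.01, max_latency_ratio=1.2):
    """Return 'wait', 'rollback', or 'promote' for one evaluation window."""
    if canary.requests < min_requests:
        return "wait"   # too little traffic to judge either way
    canary_rate = canary.errors / canary.requests
    baseline_rate = baseline.errors / max(baseline.requests, 1)
    if canary_rate - baseline_rate > max_error_rate_delta:
        return "rollback"
    if canary.p99_latency_ms > baseline.p99_latency_ms * max_latency_ratio:
        return "rollback"
    return "promote"


baseline = Metrics(requests=20_000, errors=40, p99_latency_ms=180.0)
print(canary_decision(Metrics(1_000, 3, 190.0), baseline))    # healthy
print(canary_decision(Metrics(1_000, 30, 190.0), baseline))   # error spike
print(canary_decision(Metrics(1_000, 3, 400.0), baseline))    # slow
print(canary_decision(Metrics(100, 0, 150.0), baseline))      # low traffic
```

Comparing the canary against a baseline from the stable version, rather than against fixed absolute numbers, keeps the check meaningful when overall traffic conditions change during the rollout.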
Try with your AI agent
$ npm install -g pocketlantern
$ pocketlantern init
# Restart Claude Code, Cursor, or your MCP client, then ask:
# "Rolling, blue-green, or canary — which zero-downtime strategy?"