Zero-Downtime Deployment Playbook

Blockers

! Lock-in via framework/kubernetes
! requires_version: capability/rolling-deployment → capability/backward-compatible-api-schema
! requires_version: capability/rolling-deployment → capability/backward-compatible-db-migrations
! Lock-in via capability/readiness-probes
! requires_version: capability/blue-green-deployment → capability/double-infrastructure
! requires_version: capability/blue-green-deployment → capability/database-handles-both-versions
! Lock-in via capability/connection-draining
! requires_version: capability/canary-deployment → capability/traffic-splitting-infrastructure
! requires_version: capability/canary-deployment → capability/observability
! Lock-in via capability/rollback-criteria
! Lock-in via capability/representative-traffic

Who this is for

high-scale
enterprise
real-time

Candidates

Rolling Deployment

Replace instances one by one. The new version gradually takes over while the remaining old instances keep serving traffic.
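The replacement loop described above can be sketched as a small simulation. This is only an illustration: `Instance`, `rolling_update`, and the `start_instance` callback are hypothetical names, not part of any orchestrator's API, and a real rollout would usually be driven by the orchestrator itself.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    version: str
    ready: bool = True


def rolling_update(fleet, new_version, start_instance):
    """Replace instances one at a time; return the version mix after each step.

    start_instance is a hypothetical callback that boots a replacement and
    reports whether it became ready to serve.
    """
    history = []
    for i, old in enumerate(fleet):
        replacement = start_instance(old.name, new_version)
        if not replacement.ready:
            # Halt the rollout; the untouched old instances keep serving.
            raise RuntimeError(f"{replacement.name} never became ready")
        fleet[i] = replacement
        history.append([inst.version for inst in fleet])
    return history


def start_ok(name, version):
    # Stand-in for booting a new instance that passes its readiness check.
    return Instance(name=name, version=version, ready=True)


fleet = [Instance(f"web-{n}", "v1") for n in range(3)]
history = rolling_update(fleet, "v2", start_ok)
for step in history:
    print(step)
```

Every intermediate step contains both versions, which is why the compatibility requirements in this section matter.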

When to choose

Standard web applications; a rolling update is the default strategy for Kubernetes Deployments. Best fit for low-ops + small-team or microservices + enterprise, where simplicity and native orchestrator support are priorities.

Tradeoffs

Simple and well supported. During rollout, old and new versions coexist, so the API and schema must be backward compatible.
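To make the compatibility requirement concrete, here is a minimal sketch of an additive API change. The order payload, the `currency` field, and its `"USD"` default are all assumptions invented for the example. The old parser ignores fields it does not know, and the new parser tolerates their absence, so either version can read what the other produces.

```python
import json


def parse_order_v1(payload: str) -> dict:
    """Old version: reads only the fields it knows and ignores the rest."""
    data = json.loads(payload)
    return {"id": data["id"], "total": data["total"]}


def parse_order_v2(payload: str) -> dict:
    """New version: the added field is optional, with an assumed default."""
    data = json.loads(payload)
    return {
        "id": data["id"],
        "total": data["total"],
        "currency": data.get("currency", "USD"),
    }


old_payload = '{"id": 1, "total": 9.5}'
new_payload = '{"id": 2, "total": 3.0, "currency": "EUR"}'

# Both directions work while the two versions coexist.
print(parse_order_v1(new_payload))
print(parse_order_v2(old_payload))
```

Renaming or removing a field would break one of the two directions, so such changes are usually deferred until no old instances remain.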

Cautions

Database migrations must be backward compatible. Use readiness probes to avoid routing to unready instances.
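One common way to keep a migration backward compatible is to make it purely additive first (often called the expand step of an expand/contract migration). That pattern is not prescribed by this playbook; the sketch below assumes it, using an in-memory SQLite database and a hypothetical `users` table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (name) VALUES ('ada')")

# Expand step: add a nullable column. Code that never mentions it keeps working.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")

# Old version still writes and reads only the original columns.
conn.execute("INSERT INTO users (name) VALUES ('grace')")
old_rows = conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()

# New version writes the new column and falls back when it is NULL.
conn.execute("INSERT INTO users (name, display_name) VALUES ('linus', 'Linus T.')")
new_rows = conn.execute(
    "SELECT COALESCE(display_name, name) FROM users ORDER BY id"
).fetchall()

print(old_rows)
print(new_rows)
```

Dropping or renaming the old column would wait until the rollout has finished and no old instances remain.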

Sources

kubernetes.io/docs/concepts/workloads/controllers/deployment/

Blue-Green Deployment

Run two identical environments (blue=current, green=new). Switch traffic all at once after green is verified.
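The all-at-once switch can be pictured as a single pointer that selects the live environment. The sketch below is illustrative only: `BlueGreenRouter` and the `verify` callback are hypothetical, and in practice the pointer would be a load balancer target, a DNS record, or a similar routing rule.

```python
class BlueGreenRouter:
    """Sends every request to whichever environment is currently live."""

    def __init__(self, blue, green):
        self.environments = {"blue": blue, "green": green}
        self.live = "blue"

    def handle(self, request):
        return self.environments[self.live](request)

    def _idle(self):
        return "green" if self.live == "blue" else "blue"

    def switch(self, verify):
        """Cut over to the idle environment only if it passes verification."""
        target = self._idle()
        if not verify(self.environments[target]):
            return False
        self.live = target
        return True

    def rollback(self):
        """Point traffic back at the previous environment."""
        self.live = self._idle()


def blue(request):
    return f"v1:{request}"


def green(request):
    return f"v2:{request}"


router = BlueGreenRouter(blue, green)
before = router.handle("/orders")
switched = router.switch(lambda env: env("/health") == "v2:/health")
after = router.handle("/orders")
router.rollback()
after_rollback = router.handle("/orders")
```

Because the old environment stays up, rollback is the same one-step pointer change as the cutover.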

When to choose

When you need instant rollback and can afford double infrastructure during the deployment. Best fit for enterprise + compliance or real-time + high-scale, where guaranteed rollback and zero-gap cutover are required.

Tradeoffs

Clean cutover, with instant rollback by switching back. Requires 2x infrastructure during deployment. The database must handle both versions.

Cautions

Long-running requests on blue may be dropped at switchover. Use connection draining.
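Connection draining can be sketched as "stop accepting new requests, then wait for in-flight ones to finish, up to a deadline". The `DrainingBackend` class below is a hypothetical in-process illustration of that idea, not the API of any particular load balancer.

```python
import threading


class DrainingBackend:
    """Tracks in-flight requests so the backend can be drained before shutdown."""

    def __init__(self):
        self._cond = threading.Condition()
        self._in_flight = 0
        self._accepting = True

    def begin_request(self):
        with self._cond:
            if not self._accepting:
                return False  # caller should route this request elsewhere
            self._in_flight += 1
            return True

    def end_request(self):
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def drain(self, timeout):
        """Refuse new requests, then wait for in-flight ones to complete.

        Returns True if everything finished before the timeout.
        """
        with self._cond:
            self._accepting = False
            return self._cond.wait_for(lambda: self._in_flight == 0, timeout=timeout)


blue = DrainingBackend()
blue.begin_request()                                # a long-running request starts
finisher = threading.Timer(0.05, blue.end_request)  # it completes shortly after
finisher.start()
drained = blue.drain(timeout=2.0)
finisher.join()
accepted_after_drain = blue.begin_request()
```

The timeout is a judgment call: too short and long-running requests are still cut off, too long and the switchover stalls.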

Sources

docs.aws.amazon.com/whitepapers/latest/overview-deployment-options/bluegreen-deployments.html

Canary Deployment

Route a small percentage of traffic to the new version first. Gradually increase if metrics are healthy.
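One way to split traffic by percentage is to hash a stable identifier into a bucket from 0 to 99 and compare it to the current canary percentage. This is a sketch under assumptions: the `bucket` and `route` helpers and the user-ID key are hypothetical, and real setups often do the split in a load balancer or service mesh instead.

```python
import hashlib


def bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def route(user_id: str, canary_percent: int) -> str:
    """Send the user to the canary if their bucket falls below the percentage."""
    return "canary" if bucket(user_id) < canary_percent else "stable"


users = [f"user-{n}" for n in range(1000)]
at_5 = {u for u in users if route(u, 5) == "canary"}
at_25 = {u for u in users if route(u, 25) == "canary"}

# Raising the percentage only adds users; nobody flips back to stable.
print(at_5 <= at_25)
```

Hashing rather than picking randomly per request keeps each user on one version, so a session does not bounce between old and new behavior as the percentage grows.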

When to choose

High-traffic services where you want to detect issues before full rollout. Best fit for high-scale + microservices or enterprise + real-time, where progressive rollout reduces blast radius.

Tradeoffs

Minimizes blast radius of bad deploys. Requires traffic splitting infrastructure and good observability to detect issues.

Cautions

Define clear rollback criteria (error rate, latency). Ensure the canary gets representative traffic, not just internal traffic.
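Rollback criteria are easier to enforce when written down as an explicit check. The sketch below compares canary metrics against the stable baseline. The thresholds, the `Metrics` fields, and the `evaluate_canary` function are illustrative assumptions, not recommended values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    requests: int
    errors: int
    p99_latency_ms: float

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def evaluate_canary(
    canary: Metrics,
    baseline: Metrics,
    max_error_delta: float = 0.01,   # assumed: at most 1 point above baseline
    max_latency_ratio: float = 1.5,  # assumed: p99 at most 1.5x baseline
    min_requests: int = 500,         # assumed: minimum sample before judging
) -> str:
    """Return "wait", "rollback", or "promote" for the current canary step."""
    if canary.requests < min_requests:
        return "wait"  # too little traffic to be representative
    if canary.error_rate - baseline.error_rate > max_error_delta:
        return "rollback"
    if canary.p99_latency_ms > baseline.p99_latency_ms * max_latency_ratio:
        return "rollback"
    return "promote"


baseline = Metrics(requests=100_000, errors=100, p99_latency_ms=200.0)
print(evaluate_canary(Metrics(1_000, 30, 210.0), baseline))
print(evaluate_canary(Metrics(1_000, 1, 210.0), baseline))
```

The `min_requests` guard reflects the representative-traffic caution: a canary that has seen only a handful of requests should be neither promoted nor rolled back on that evidence.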

Sources

martinfowler.com/bliki/CanaryRelease.html

Facts updated: 2026-03-14

Published: 2026-03-29

Rolling, blue-green, or canary — which zero-downtime strategy?
