Zero-downtime deploys checklist

One-page checklist for zero-downtime deploys.

Confirm service mesh/ingress supports weighted routing and is configured in both blue and green environments.
Validate Helm chart: readiness/liveness probes defined, config values stored in Git, image tag immutable, and chart version bumped.
Sync the Argo CD blue and green applications to the same baseline before introducing a new release.
Execute automated smoke, contract, and migration verification suites against green; block promotion on failures.
Annotate deploy in observability platform; pre-load dashboards for latency, error rate, and saturation.
Shift traffic following agreed weights (e.g., 10% → 30% → 60% → 100%) with hold times defined for each stage.
Monitor SLOs and business KPIs during each weight step; abort and roll back automatically if thresholds are exceeded.
Once green is stable at 100%, decommission or repurpose blue resources per cost policy while retaining snapshots/logs.
Capture deployment notes, metrics, and follow-up items in the release log; schedule improvement actions with owners.
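
The traffic-shift and monitoring steps above can be sketched as a small control loop. This is a minimal illustration, not a definitive implementation: `Stage`, `progressive_shift`, `set_green_weight`, and `slo_ok` are hypothetical names, and the two callables stand in for whatever your mesh/ingress and observability platform actually expose.

```python
import time
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Stage:
    green_weight: int    # percent of traffic routed to green
    hold_seconds: float  # how long to observe before the next step


def progressive_shift(
    stages: Sequence[Stage],
    set_green_weight: Callable[[int], None],  # hypothetical: updates mesh/ingress weights
    slo_ok: Callable[[], bool],               # hypothetical: True while SLOs/KPIs are within thresholds
    sleep: Callable[[float], None] = time.sleep,
    poll_seconds: float = 5.0,
) -> bool:
    """Walk through the agreed weights; return False if a rollback was triggered."""
    for stage in stages:
        set_green_weight(stage.green_weight)
        remaining = stage.hold_seconds
        while True:
            if not slo_ok():
                # Threshold exceeded: send all traffic back to blue and stop.
                set_green_weight(0)
                return False
            if remaining <= 0:
                break
            step = min(poll_seconds, remaining)
            sleep(step)
            remaining -= step
    return True
```

Each stage is checked at least once before moving on, so a hold time of zero still gates promotion on a healthy reading. Passing `sleep` as a parameter keeps the loop easy to exercise in tests without waiting out real hold times.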

Pitfalls

Forgetting to mirror feature flags, secrets, or third-party callbacks in green.
Lacking automated rollback scripts, forcing manual kubectl commands under incident pressure.
Shipping backward-incompatible database migrations that break blue while it is still serving traffic.
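
The first pitfall can be caught before any traffic moves with a simple parity check. The sketch below assumes you can export the names of feature flags, secrets, and callback settings from each environment as key/value mappings; `parity_gaps` and the example key names are hypothetical. It compares key names only, so secret values never need to be printed or logged.

```python
from typing import Dict, List, Mapping


def parity_gaps(blue: Mapping[str, str], green: Mapping[str, str]) -> Dict[str, List[str]]:
    """Report configuration keys present in one environment but not the other."""
    blue_keys, green_keys = set(blue), set(green)
    return {
        # Keys blue relies on that were never mirrored to green.
        "missing_in_green": sorted(blue_keys - green_keys),
        # Keys only green has; worth reviewing before blue is repurposed.
        "extra_in_green": sorted(green_keys - blue_keys),
    }
```

For example, if blue defines `PAYMENT_CALLBACK_URL` and green does not, that key shows up under `missing_in_green`, and a pipeline step could block promotion until the list is empty.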

Need help hardening zero-downtime pipelines? Book a working session via /contact.