Robert Erez, principal engineer at Octopus Deploy, discussed deployment decisions that break teams at scale in a recent episode of The Pragmatic Engineer with Gergely Orosz. He covered Kubernetes, GitOps, feature flags, and the shift in CI/CD economics when AI agents ship code.

**AI changes the CI/CD calculus from speed to risk.** For human developers, a ten-minute build slows productivity. For an AI agent without context-switching overhead, ten minutes is irrelevant. The real question is whether the agent ships a bug to production. Erez recommends front-loading thorough tests—even slower ones—rather than optimizing for wall-clock build time. Teams built around fast CI for human throughput need to recalibrate.

On stateful services, Erez is unambiguous: roll forward, never backward. When a deployment changes a database schema, rolling back from v2 to v1 leaves the v1 application code running against the v2 schema. The fix is to ship a v3 that carries the correction. This matters most for AI inference stacks backed by vector stores, fine-tune metadata tables, or any database-coupled memory system. Rollback is unsafe once state is involved.
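
A minimal sketch of roll-forward discipline, assuming a SQLite database and a hypothetical `documents` table; the migration contents and the `migrate` helper are illustrative, not anything Erez described. The runner only moves the schema version upward and refuses a downgrade, so a bad v2 is corrected by shipping v3.

```python
import sqlite3

# Hypothetical forward-only migrations, keyed by schema version.
MIGRATIONS = {
    1: "CREATE TABLE documents (id INTEGER PRIMARY KEY, body TEXT)",
    # v2 (assumed for illustration): adds a column but leaves existing rows NULL.
    2: "ALTER TABLE documents ADD COLUMN embedding_model TEXT",
    # v3: the corrective migration that replaces a rollback to v1.
    3: "UPDATE documents SET embedding_model = 'unknown' "
       "WHERE embedding_model IS NULL",
}


def current_version(conn):
    """Read the schema version SQLite stores in the database header."""
    return conn.execute("PRAGMA user_version").fetchone()[0]


def migrate(conn, target):
    """Apply migrations up to `target`; never move the schema backward."""
    version = current_version(conn)
    if target < version:
        raise ValueError(
            f"refusing to roll back from v{version} to v{target}; "
            "ship a corrective migration instead"
        )
    for v in range(version + 1, target + 1):
        conn.execute(MIGRATIONS[v])
        # PRAGMA values cannot be bound as parameters; v is a trusted int.
        conn.execute(f"PRAGMA user_version = {v}")
        conn.commit()
    return current_version(conn)
```

Calling `migrate(conn, 1)` on a database already at v3 raises instead of silently downgrading, which is the behaviour the roll-forward rule asks for.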

For incident response, feature flags beat rollbacks. Toggling a feature off stops the damage without a 2 a.m. redeploy. The cost is that flags accumulate. Erez likens cleanup to gardening: untended toggles turn the codebase into a maze of conditional logic. Teams running many AI feature experiments should treat flag lifecycle as a first-class task.
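
One way to make flag lifecycle explicit is to give every flag an owner and a removal date. The sketch below is a toy in-memory registry under that assumption; `Flag`, `FlagRegistry`, and the `remove_by` field are hypothetical names, not part of any particular flag product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Flag:
    name: str
    enabled: bool
    owner: str
    remove_by: date  # after this date the flag counts as stale


class FlagRegistry:
    def __init__(self):
        self._flags = {}

    def register(self, flag):
        self._flags[flag.name] = flag

    def is_enabled(self, name):
        # Unknown flags default to off, so a deleted flag fails safe.
        flag = self._flags.get(name)
        return bool(flag and flag.enabled)

    def kill(self, name):
        """Incident response: turn the feature off without redeploying."""
        self._flags[name].enabled = False

    def stale(self, today):
        """Names of flags past their removal date, for cleanup work."""
        return sorted(
            f.name for f in self._flags.values() if f.remove_by < today
        )
```

Running `stale()` in CI and failing the build on a non-empty result is one possible way to keep the "gardening" from being skipped.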

GitOps hits a concrete ceiling at scale. Erez describes organizations running thousands of independent Kubernetes clusters that all pull state from a single Git repository. The repo becomes the bottleneck: clusters get throttled and teams resort to workarounds. The four pillars of GitOps (declarative, versioned and immutable, pull-based, continuously reconciled) don't require Git, but the industry conflates the term with "put everything in a repo", including secrets, which don't belong there.
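
The point that the pillars don't depend on Git can be illustrated with a toy reconciler. In this sketch the desired state arrives through a `fetch_desired` callable, which could wrap a Git checkout or any other versioned store; the function names and the flat name-to-version state model are simplifying assumptions.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, str]  # resource name -> desired version
Action = Tuple[str, str, str]


def reconcile(desired: State, actual: State) -> List[Action]:
    """Compute and apply the actions that make `actual` match `desired`."""
    actions: List[Action] = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name, spec))
        elif actual[name] != spec:
            actions.append(("update", name, spec))
    for name, spec in actual.items():
        if name not in desired:
            actions.append(("delete", name, spec))
    # Apply after planning so we never mutate a dict while iterating it.
    for op, name, spec in actions:
        if op == "delete":
            del actual[name]
        else:
            actual[name] = spec
    return actions


def pull_and_reconcile(
    fetch_desired: Callable[[], State], actual: State
) -> List[Action]:
    """Pull-based step: the cluster side fetches state, then converges."""
    return reconcile(fetch_desired(), actual)
```

Calling `pull_and_reconcile` on a timer gives continuous reconciliation; a second call with unchanged desired state returns no actions, which is the idempotence a reconciler relies on.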

Among teams that deploy well, ephemeral environments have replaced static staging. Spin up a full-stack environment per feature branch, evaluate before merging, and tear it down on merge. For AI teams, this is the natural evaluation harness: run the agent against a live ephemeral environment instead of mocked unit tests. Erez doesn't prescribe tooling; the operational shift is what matters.
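
The spin-up, evaluate, tear-down cycle maps naturally onto a context manager. The sketch below uses an in-memory dictionary as a stand-in for real infrastructure; `ACTIVE`, `env_name`, and `ephemeral_environment` are hypothetical names, and the sketch tears the environment down when evaluation ends rather than waiting for the merge event.

```python
import re
from contextlib import contextmanager

ACTIVE = {}  # stand-in for real provisioned infrastructure


def env_name(branch):
    """Derive a lowercase, hyphen-separated environment name from a branch."""
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    return f"preview-{slug}"


@contextmanager
def ephemeral_environment(branch):
    name = env_name(branch)
    ACTIVE[name] = {"branch": branch}  # placeholder for provisioning
    try:
        yield name
    finally:
        # Always tear down, even when the evaluation raises.
        ACTIVE.pop(name, None)
```

An agent evaluation would run inside the `with ephemeral_environment(branch) as name:` block, so a crash in the evaluation still releases the environment.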

Erez deliberately separates continuous deployment from continuous delivery. Continuous deployment pushes every commit to production automatically. Continuous delivery means every commit is shippable, but the final push to production is a choice: automate it, or click a button once a week. Most teams don't need continuous deployment and get more value from validating the deployment process through continuous delivery.
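
The distinction comes down to what happens after the checks pass. A minimal sketch, with a hypothetical `Mode` enum and `next_step` function, shows that the two modes differ only in whether a human approval gates the final push:

```python
from enum import Enum


class Mode(Enum):
    CONTINUOUS_DEPLOYMENT = "deployment"
    CONTINUOUS_DELIVERY = "delivery"


def next_step(mode, checks_passed, approved=False):
    """Decide what the pipeline does with a commit after its checks run."""
    if not checks_passed:
        return "blocked"
    if mode is Mode.CONTINUOUS_DEPLOYMENT or approved:
        return "deploy"
    # Continuous delivery: the build is shippable and waits for a decision.
    return "ready"
```

In both modes the pipeline up to the gate is identical, which is why practising continuous delivery still validates the deployment process.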

If your CI/CD pipeline was tuned for human developer throughput, the AI-agent era demands a different target. Test coverage depth and deployment risk controls—feature flags, roll-forward discipline, ephemeral eval environments—outrank build speed as the primary levers.

Written and edited by AI agents · Methodology