AI Changes CI/CD From Speed to Risk Control

Robert Erez, principal engineer at Octopus Deploy, discussed deployment decisions that break teams at scale in a recent episode of The Pragmatic Engineer with Gergely Orosz. He covered Kubernetes, GitOps, feature flags, and the shift in CI/CD economics when AI agents ship code.

**AI changes the CI/CD calculus from speed to risk.** For human developers, a ten-minute build is a productivity cost because it blocks them while they wait. For an AI agent that writes most of the code and watches the pipeline without context-switching overhead, the same ten minutes hardly matter. The real question is whether the agent ships a bug to production. Erez recommends front-loading thorough tests, even slower ones, rather than optimizing for wall-clock build time. Teams that built fast CI for human throughput need to recalibrate.

On stateful services, Erez is unambiguous: roll forward, never backward. When a deployment changes a database schema, rolling back from v2 to v1 leaves the application code talking to a schema it no longer matches. The fix is to ship v3 with the correction. This matters for AI inference stacks backed by vector stores, fine-tune metadata tables, or any database-coupled memory system, because rollback stops being safe once state is involved.
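The mismatch is easy to reproduce. The following is a minimal sketch using Python's built-in `sqlite3` module; the `documents` table, the `model` column, and the function names are hypothetical stand-ins, not anything Erez describes. It shows v1 code failing against the migrated schema and a v3 fix written against the schema as it now exists.

```python
import sqlite3


def v1_insert(conn, doc_id, body):
    # v1 code assumes the original two-column schema.
    conn.execute("INSERT INTO documents VALUES (?, ?)", (doc_id, body))


def migrate_to_v2(conn):
    # v2 ships a schema change: the table gains a third column.
    conn.execute(
        "ALTER TABLE documents ADD COLUMN model TEXT NOT NULL DEFAULT 'unknown'"
    )


def v3_insert(conn, doc_id, body, model="unknown"):
    # v3 is the roll-forward fix: it targets the current (post-v2) schema.
    conn.execute(
        "INSERT INTO documents (id, body, model) VALUES (?, ?, ?)",
        (doc_id, body, model),
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, body TEXT)")
v1_insert(conn, 1, "written by v1")
migrate_to_v2(conn)

# Rolling the code back to v1 does not roll the schema back with it.
try:
    v1_insert(conn, 2, "v1 code after rollback")
    rollback_ok = True
except sqlite3.OperationalError:
    rollback_ok = False

# Rolling forward works because v3 matches the schema that is actually deployed.
v3_insert(conn, 2, "written by v3", model="example-model")
row_count = conn.execute("SELECT COUNT(*) FROM documents").fetchone()[0]
```

Here the rolled-back v1 insert raises because the table now has three columns, while the v3 insert succeeds. A real migration would be more involved, but the failure mode has the same shape.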

For incident response, feature flags beat rollbacks. Toggling a feature off stops the damage without a redeploy at 2 a.m. The cost is that flags accumulate. Erez likens cleanup to gardening: rolled-out toggles that are never weeded turn the codebase into a maze of conditional logic. Teams running many AI feature experiments should treat flag lifecycle as a first-class task.
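One way to make both halves concrete, the kill switch and the weeding, is a small in-memory sketch. Everything here is an assumption for illustration: the `FlagStore` class, the flag names, the dates, and the 30-day threshold are hypothetical and do not come from the episode or from any particular flag service.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Flag:
    name: str
    enabled: bool
    fully_rolled_out: bool
    created: date


class FlagStore:
    """In-memory flag store; a real system would back this with a flag service."""

    def __init__(self):
        self._flags = {}

    def register(self, flag):
        self._flags[flag.name] = flag

    def is_enabled(self, name):
        # Unknown flags default to off so a typo cannot enable a feature.
        flag = self._flags.get(name)
        return flag.enabled if flag else False

    def kill(self, name):
        # Incident response: switch the feature off without redeploying.
        self._flags[name].enabled = False

    def weeding_candidates(self, today, max_age_days=30):
        # Fully rolled-out flags older than the threshold are cleanup candidates.
        cutoff = today - timedelta(days=max_age_days)
        return sorted(
            f.name
            for f in self._flags.values()
            if f.fully_rolled_out and f.created <= cutoff
        )


store = FlagStore()
store.register(Flag("agent_reranker", True, False, date(2025, 1, 10)))
store.register(Flag("new_checkout", True, True, date(2024, 11, 1)))

store.kill("agent_reranker")
candidates = store.weeding_candidates(today=date(2025, 2, 1))
```

Running a report like `weeding_candidates` on a schedule is one possible way to turn the gardening metaphor into a recurring task rather than an occasional cleanup.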

GitOps hits a concrete ceiling at scale. Erez describes organizations running thousands of independent Kubernetes clusters that pull state from a single Git repository. The repo becomes the bottleneck: clusters get throttled by it and teams resort to workarounds. The four pillars of GitOps (declarative, versioned and immutable, pulled rather than pushed, continuously reconciled) don't require Git, but the industry conflates the term with "put everything in a repo," including secrets that don't belong there.
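To see why the four pillars do not depend on Git, consider a minimal reconcile loop over an append-only store. This is a sketch under stated assumptions: `VersionedStore`, `plan`, and `reconcile_once` are hypothetical names, the store is an in-memory list, and the cluster is a plain dictionary rather than a real Kubernetes API.

```python
class VersionedStore:
    """Append-only store of desired state; stands in for Git or any versioned source."""

    def __init__(self):
        self._versions = []

    def publish(self, desired):
        # Versions are only appended, never edited in place (versioned and immutable).
        self._versions.append(dict(desired))
        return len(self._versions)

    def latest(self):
        return len(self._versions), dict(self._versions[-1])


def plan(desired, actual):
    """Diff the declarative desired state against the actual state."""
    actions = []
    for name in sorted(desired):
        if actual.get(name) != desired[name]:
            actions.append(("apply", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


def reconcile_once(store, cluster):
    # The cluster side pulls the latest version; nothing pushes into the cluster.
    version, desired = store.latest()
    actions = plan(desired, cluster)
    for verb, name in actions:
        if verb == "apply":
            cluster[name] = desired[name]
        else:
            del cluster[name]
    return version, actions


store = VersionedStore()
store.publish({"web": "image:1.0", "worker": "image:1.0"})
store.publish({"web": "image:1.1"})

cluster = {"web": "image:1.0", "worker": "image:1.0"}
version, actions = reconcile_once(store, cluster)
# A second pass finds nothing to do: reconciliation is continuous and idempotent.
second_version, second_actions = reconcile_once(store, cluster)
```

Nothing in the loop knows what kind of store it is reading from. The sketch also hints at the scaling problem: every cluster calls `latest()` on the same source, so the source carries the combined polling load.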

Ephemeral environments have replaced static staging for teams executing this well. Spin up a full-stack environment per feature branch, evaluate pre-merge, tear it down on merge. For AI teams, this is the natural evaluation harness—run the agent against a live ephemeral environment instead of mocked unit tests. Erez doesn't prescribe tooling; the operational shift is what matters.
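Since Erez doesn't prescribe tooling, the following is only a sketch of the lifecycle, with a fake provisioner standing in for whatever actually creates environments. The `FakeProvisioner` interface, `ephemeral_environment`, and `evaluate` are hypothetical, and the sketch scopes the environment to a single evaluation run, which is a simplification of tearing it down at merge time.

```python
from contextlib import contextmanager


class FakeProvisioner:
    """Stand-in for real environment tooling (hypothetical interface)."""

    def __init__(self):
        self.live = set()

    def create(self, branch):
        env = f"env-{branch}"
        self.live.add(env)
        return env

    def destroy(self, env):
        self.live.discard(env)


@contextmanager
def ephemeral_environment(provisioner, branch):
    env = provisioner.create(branch)
    try:
        yield env
    finally:
        # Tear down even if the evaluation raises, so environments never leak.
        provisioner.destroy(env)


def evaluate(env):
    # Placeholder check; a real harness would run the agent or test suite against env.
    return env.startswith("env-")


provisioner = FakeProvisioner()
with ephemeral_environment(provisioner, "feature-login") as env:
    live_during = set(provisioner.live)
    passed = evaluate(env)
live_after = set(provisioner.live)
```

The `try`/`finally` is the part worth copying: teardown is guaranteed by the structure of the code, so a failed evaluation cannot leave an environment running.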

Erez separates continuous deployment from continuous delivery deliberately. Continuous deployment pushes every commit to production automatically. Continuous delivery means every commit is shippable but the final push is optional—automate it or click a button once a week. Most teams don't need continuous deployment and get more value from validating the deployment process through continuous delivery.
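The distinction comes down to a single gate at the end of the pipeline. Here is a minimal sketch, assuming hypothetical stage names and an `approved` flag that represents either an automated rule or a person clicking a button:

```python
from enum import Enum


class Mode(Enum):
    CONTINUOUS_DELIVERY = "delivery"
    CONTINUOUS_DEPLOYMENT = "deployment"


def stages_for_commit(mode, approved=False):
    # Both modes run the same validation, so every commit ends up shippable.
    stages = ["build", "test", "deploy-to-preprod"]
    # Deployment mode always promotes; delivery mode waits for an explicit decision.
    if mode is Mode.CONTINUOUS_DEPLOYMENT or approved:
        stages.append("deploy-to-production")
    return stages


auto = stages_for_commit(Mode.CONTINUOUS_DEPLOYMENT)
held = stages_for_commit(Mode.CONTINUOUS_DELIVERY)
released = stages_for_commit(Mode.CONTINUOUS_DELIVERY, approved=True)
```

In both modes the deployment process itself is exercised on every commit; only the production promotion differs.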

If your CI/CD pipeline was tuned for human developer throughput, the AI-agent era demands a different target. Test coverage depth and deployment risk controls—feature flags, roll-forward discipline, ephemeral eval environments—outrank build speed as the primary levers.

Sources

AI shifts the CI/CD calculus from speed to risk — when AI agents write most code and babysit pipelines without context switching, test thoroughness outranks build speed
"AI shifts the CI/CD calculus from speed to risk. Today, shaving ten minutes off the CI build-time matters because a long-running build blocks human devs. But this time saving will be insignificant when an AI agent writes most of the code and 'babysits' a slow pipeline without context switching."
newsletter.pragmaticengineer.com ↗
Roll forward, never backwards — rolling back v2 to v1 leaves code talking to a schema no longer in sync for stateful systems
"When a system has state – which typically means it uses databases – then doing a rollback can leave the code talking to a schema that's no longer in sync. Rob's advice is to not treat a failure in v2 as a trip back to v1, but rather as a push to v3 with the fix in it."
newsletter.pragmaticengineer.com ↗
Feature toggles are a better safety net than rollbacks — toggling a feature off stops the bleeding without a forced redeployment
"When something breaks in production, reaching for a toggle to switch a feature off enables you to 'stop the bleeding' and then calmly diagnose an issue."
newsletter.pragmaticengineer.com ↗
Feature flags accumulate into a hygiene crisis if rolled-out toggles aren't removed — treat cleanup like gardening
"The ease with which feature flags are added can create a hygiene crisis if they're continuously added, but not removed. Treat feature-toggle cleanups like a form of gardening and 'weed' rolled-out toggles from the codebase."
newsletter.pragmaticengineer.com ↗
A Git repo becomes a bottleneck at scale — some companies run thousands of independent Kubernetes clusters pulling state from a single Git repository
"Rob mentions that some companies run thousands of independent Kubernetes clusters that pull state from a Git repository. But such clusters can get throttled by the repo, forcing them into workarounds. Pull-based GitOps doesn't scale infinitely for free."
newsletter.pragmaticengineer.com ↗
GitOps's four pillars — declarative, versioned and immutable, pull-based, continuously reconciled — don't actually require Git
"None of the four pillars of GitOps – 1) declarative, 2) versioned and immutable, 3) pulled, not pushed, 4) continuously reconciled – require Git, although Git can work under these constraints."
newsletter.pragmaticengineer.com ↗
Ephemeral environments per feature branch are replacing static test and staging environments for pre-merge evaluation
"Today, it's trivial to spin up a full environment, per-feature branch, pre-merge. This is an 'ephemeral' environment for evaluating that things work, which is then torn down once something is merged."
newsletter.pragmaticengineer.com ↗
Continuous deployment (every commit to prod) is overkill for most teams; continuous delivery with optional final push is more practical
"Shipping every single change to prod (continuous deployment) is not as necessary as many people think, Rob says, and there's often more value in continuous delivery, where changes flow through testing and the deployment process itself is validated."
newsletter.pragmaticengineer.com ↗

Written and edited by AI agents · Methodology
