Databricks Lakebase's database branching, launched on February 3, creates terabyte-scale Postgres branches as an O(1) operation that completes in about one second and consumes no additional storage at creation time. That makes it practical for ML teams to replace shared staging databases with an isolated environment per pull request.
Lakebase runs a Postgres-compatible compute layer on top of the log-structured, versioned storage engine that Databricks acquired with Neon. A branch is a metadata pointer to the underlying shared data, not a physical copy; new storage is consumed only by subsequent writes. This copy-on-write design supports both long-running and ephemeral feature branches, created with the `databricks postgres create-branch --source production` command. Unity Catalog enforces permissions across branches the same way it does for Delta Lake tables, and time-travel recovery is available by branching from any historical version.
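The copy-on-write behavior described above can be illustrated with a toy model. This is a minimal sketch, not Lakebase's or Neon's actual implementation: the `Branch` class, its append-only `log`, and the branch names are all hypothetical, and real storage engines track pages and log sequence numbers rather than Python tuples. The point it shows is that creating a branch copies no data, and that a branch only accumulates storage for its own writes.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Branch:
    """Toy copy-on-write branch (hypothetical; for illustration only)."""

    name: str
    parent: Optional["Branch"] = None
    parent_lsn: int = 0  # how much of the parent's log this branch can see
    log: list[tuple[str, str]] = field(default_factory=list)  # append-only (key, value)

    @property
    def lsn(self) -> int:
        # The current version is simply the number of writes recorded so far.
        return len(self.log)

    def write(self, key: str, value: str) -> None:
        # Writes land only in this branch's own log; the parent is untouched.
        self.log.append((key, value))

    def read(self, key: str, as_of: Optional[int] = None) -> Optional[str]:
        upto = self.lsn if as_of is None else as_of
        # Newest write wins within this branch's visible log.
        for k, v in reversed(self.log[:upto]):
            if k == key:
                return v
        # Otherwise fall through to the parent, frozen at the branch point.
        if self.parent is not None:
            return self.parent.read(key, as_of=self.parent_lsn)
        return None

    def branch(self, name: str, as_of: Optional[int] = None) -> "Branch":
        # O(1): only a pointer and a version number are stored, no data is copied.
        at = self.lsn if as_of is None else as_of
        return Branch(name=name, parent=self, parent_lsn=at)


prod = Branch("production")
prod.write("user:1", "v1")
prod.write("user:2", "v1")

pr = prod.branch("pr-branch")       # created instantly, owns no data yet
pr.write("user:1", "v2")            # the first write allocates branch-local storage
prod.write("user:2", "v2")          # later production writes stay invisible to the branch

recovery = prod.branch("recovery", as_of=1)  # time travel: branch from an older version
```

In this sketch `pr.read("user:2")` still resolves to the value production held at the branch point, and `recovery` sees the database as it was after the first write only, which mirrors the time-travel recovery described above.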
For ML platform leads, Lakebase is particularly relevant for online feature stores and agent state stores. Training pipelines or schema migrations can now run integration tests against a production-shaped dataset without touching the production feature store or waiting in a DBA queue. Non-production branches scale to zero when idle and restart in milliseconds, which keeps compute costs minimal while they are dormant. Atlassian's 2025 Developer Experience Report (n=3,500) found that 90% of developers lose six or more hours per week to organizational inefficiencies, a tax that shared database environments compound.
Transitioning to per-PR branches requires unlearning workarounds such as mock objects and shared staging instances, and rewriting CI contracts. At a scale of around 50 developers, tier topology design, automated permission enforcement, and redefining the DBA role as a platform engineer become critical. Without governance, branch sprawl and schema drift can emerge. Agents can create branches, apply migrations, and pass tests, but without strict policies they can produce unmaintainable systems, much like undirected junior developers.
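One way to picture automated governance against branch sprawl is a periodic check that flags branches idle beyond a per-tier limit. The sketch below is an assumption-laden illustration: the tier names, the seven-day limit, the `BranchInfo` record, and the `stale_branches` helper are all hypothetical and are not part of any Databricks API. A real policy would read branch metadata from the platform rather than from an in-memory list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical policy: feature branches expire after 7 idle days;
# long-lived tiers (None) are never flagged.
TIER_TTL: dict[str, Optional[timedelta]] = {
    "feature": timedelta(days=7),
    "staging": None,
    "main": None,
}


@dataclass(frozen=True)
class BranchInfo:
    name: str
    tier: str
    last_active: datetime


def stale_branches(
    branches: list[BranchInfo],
    now: datetime,
    ttl_by_tier: dict[str, Optional[timedelta]] = TIER_TTL,
) -> list[str]:
    """Return names of branches that violate the policy.

    A branch is flagged when its tier is unknown (it sits outside the
    declared topology) or when it has been idle longer than its tier allows.
    """
    flagged = []
    for b in branches:
        if b.tier not in ttl_by_tier:
            flagged.append(b.name)
            continue
        ttl = ttl_by_tier[b.tier]
        if ttl is not None and now - b.last_active > ttl:
            flagged.append(b.name)
    return flagged


now = datetime(2025, 1, 31, tzinfo=timezone.utc)  # fixed clock for the example
inventory = [
    BranchInfo("main", "main", now - timedelta(days=90)),
    BranchInfo("pr-old", "feature", now - timedelta(days=10)),
    BranchInfo("pr-new", "feature", now - timedelta(days=1)),
    BranchInfo("scratch", "sandbox", now),  # tier not in the topology
]
to_review = stale_branches(inventory, now)
```

Flagging unknown tiers as well as expired branches is a design choice in this sketch: it treats anything outside the declared tier topology as sprawl to be reviewed rather than silently tolerated.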
The operational model is detailed in an 11-practice Evolutionary Database Development playbook, with seven practices recast from the 2003 original and four new additions enabled by branching. Promotion across tiers is a merge, with the same `pr.yml` and `merge.yml` definitions executing against feature branches, staging, and main. Using shared staging for PR validation is considered an anti-pattern because it reintroduces serialization and sacrifices isolation.
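The idea of one pipeline definition running unchanged against every tier can be sketched as a list of steps parameterized only by the target branch. This illustrates just that one aspect and does not model the merge itself; the `promote` function, the step functions, and the tier names are hypothetical stand-ins, not the contents of the actual `pr.yml` or `merge.yml`.

```python
from typing import Callable

# A step receives the name of the branch it runs against and reports success.
Step = Callable[[str], bool]


def promote(steps: list[Step], tiers: list[str]) -> list[str]:
    """Run the identical steps against each tier in order.

    Stops at the first tier where any step fails and returns the tiers
    that passed, so a failure never reaches a later tier.
    """
    passed = []
    for tier in tiers:
        if not all(step(tier) for step in steps):
            break
        passed.append(tier)
    return passed


# Hypothetical steps; real ones would connect to the named branch.
def apply_migrations(branch: str) -> bool:
    return True


def run_integration_tests(branch: str) -> bool:
    return branch != "staging"  # simulate a failure that appears on staging


pipeline = [apply_migrations, run_integration_tests]
result = promote(pipeline, ["feature", "staging", "main"])
```

Because the steps take the branch as their only input, nothing in the pipeline needs to change when the target moves from a feature branch to staging or main.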
Databases should be treated as versioned, O(1) compute primitives: create a production-faithful Postgres branch for every PR or model experiment, scale it to zero when idle, and govern it through automated tier policies rather than DBA office hours.
Written and edited by AI agents