The Week the AI Stack Got Repriced
Capital, contracts, and cost-per-token all moved at once — the AI stack enterprises buy looks materially different from what it did on Friday.
Transcript
Four days. That's how long it took for OpenAI and Microsoft to tear up the exclusivity agreement that had structured the AI market since 2019, for Google to write a check of up to forty billion dollars into a direct competitor, and for the price of a GPT token to double at the exact moment the first systematic study shows that AI agents consume a thousand times more tokens than code chat. If you have an AI stack architecture diagram in your Confluence, it's already out of date. In this Wire: the end of Azure exclusivity, the capital that reconfigured Anthropic, the sovereign merger that emerged from Europe, the new agentic cost baseline, and the governance gaps regulators have yet to close.
What ties all these stories together is a common logic: the assumptions on which architecture teams built their 2026 plans — around pricing, vendors, and legal liability — all changed at the same time.
Start with the contract. OpenAI and Microsoft rewrote the agreement that had anchored the partnership since 2019. Three structural changes: end of API distribution exclusivity on Azure, removal of the AGI clause, and a reversal of the revenue flow.
The AGI clause was the most problematic. It gave Microsoft broad rights over OpenAI's intellectual property until the company self-declared it had achieved artificial general intelligence: a threshold with no agreed definition, controlled by the party making the declaration, with perverse incentives baked into the structure.
Gone. In its place, Microsoft receives a non-exclusive license to OpenAI's models and products valid through 2032, with no technological contingency. The trigger for the renegotiation was concrete: OpenAI's plan to distribute products via AWS risked violating the Azure exclusivity terms. Sam Altman and Satya Nadella negotiated personally for several weeks.
The result: OpenAI can now distribute models and products through any cloud provider. Azure retains preferred partner status — products still launch there first — but the distribution monopoly is over.
On the financial side: Microsoft no longer pays OpenAI the twenty percent share of the revenue it generated by reselling models through Azure. OpenAI still pays a revenue share to Microsoft, with a total cap and a 2030 expiration, at the same percentage as before. Microsoft's monetization shifts to equity: it profits from OpenAI's overall growth, not from API distribution margin. For the architect: multi-cloud with GPT models is no longer an open contractual question. Procurement teams that deferred AWS or Google Cloud evaluations because of Azure exclusivity risk can reopen those processes now.
Capital. Google will invest up to forty billion dollars in Anthropic — ten billion immediately, at a valuation of three hundred and fifty billion dollars, with up to thirty billion additional dollars contingent on performance milestones. That makes Alphabet the largest individual shareholder in the startup while competing directly with it at the model layer, via Gemini.
The resulting structure has no direct precedent in the industry. Google competes with Anthropic at the model layer, supplies the TPUs that run Claude inference, and now holds the largest individual financial position in the company. Model rival, infrastructure supplier, and largest backer — simultaneously.
The deal includes an additional commitment of five gigawatts of compute capacity on Google Cloud over five years — on top of an earlier agreement with Broadcom that a securities filing quantified at three point five gigawatts. Amazon added another five billion dollars to its own position in Anthropic that same week, within a broader agreement under which Anthropic is expected to commit up to one hundred billion dollars for approximately five gigawatts of compute capacity over time. Anthropic also closed a separate datacenter agreement with CoreWeave.
The context behind the urgency: Anthropic faced widespread user complaints about usage limits in recent weeks. The race for compute capacity is real, expensive, and accelerating. The company now has multi-gigawatt commitments with two of the three hyperscalers simultaneously.
For the architect: when a single hyperscaler can sell you the chip, supply the model competing with the one running on it, and hold the largest equity stake in the competing company — the exposure is no longer just cloud cost. It's vendor governance. "AI multi-vendor strategy" moves out of the preference column and into the procurement policy column.
The third move of the week came from outside Silicon Valley. Cohere, from Canada, and Aleph Alpha, from Germany, announced a merger to form an enterprise AI company valued at twenty billion dollars, according to the Financial Times. The deal is anchored by a six-hundred-million-dollar Series E from the Schwarz Group — Europe's largest retailer, operator of Lidl and Kaufland in thirty-two countries, and already an Aleph Alpha investor.
Aleph Alpha operates a government AI assistant with eighty thousand users in the German public sector. Its models were built from the ground up for European data residency and compliance with the EU AI Act — not retrofitted. Their tagline is "AI Made in Germany. For Europe" — and that is positioning and regulatory fact simultaneously.
The stated strategic rationale is straightforward: giving enterprises and governments an alternative to the American labs that currently dominate commercial AI, with greater independence and control over their data. The deal has not yet closed — it is subject to regulatory review, a non-trivial obstacle for a cross-border merger in a sector under market concentration scrutiny. The twenty-billion figure is FT reporting on a structure still under review.
But the Schwarz Group's six-hundred-million-dollar check is an operational signal, not a speculative one. When Europe's largest retailer writes a nine-figure sum to fund a sovereign AI alternative, that is not a portfolio hedge. That is a supply chain bet.
For teams with data residency constraints or exposure to the EU AI Act, the merger creates a single counterpart with full-stack infrastructure across North American and European jurisdictions, and a governance model designed for the AI Act from inception — not adapted to it afterward.
Now the cost layer. GPT-5.5 has arrived. When the API opens, pricing will be five dollars per million input tokens and thirty dollars per million output tokens — exactly double GPT-5.4, which stands at two dollars fifty and fifteen dollars. A GPT-5.5 Pro variant goes to thirty dollars input and one hundred eighty output. OpenAI positions GPT-5.4 as the capable, cost-efficient option; GPT-5.5 occupies the premium tier for workloads that justify the price.
A twofold increase on the base price would be absorbable if agent consumption patterns were linear. But this week saw the publication of the first systematic study of token consumption in agentic tasks — and the numbers change the calculation entirely.
The paper, from researchers at MIT, Stanford, the University of Michigan, and Salesforce AI Research, analyzed the trajectories of eight frontier LLMs — including GPT-5, Claude Sonnet 4.5, and Kimi-K2 — on the SWE-bench Verified benchmark. The mechanics behind the costs:
It is input tokens — not output tokens — that drive total cost. Agents continuously re-ingest long context windows in planning and error-recovery loops, with code generation itself representing a minimal fraction of spend. Consumption is also highly stochastic: on the same task, total token count can vary by up to thirty times across runs. And that variance does not correlate with outcome — accuracy peaks at intermediate cost levels and plateaus, or drops, at higher costs. Re-running the agent with more tokens is not a reliable solution.
The highest-leverage variable the study maps is model choice. Kimi-K2 and Claude Sonnet 4.5 consumed, on average, more than one point five million additional tokens compared to GPT-5 on the same tasks. For teams running hundreds of parallel sessions or CI pipelines, that difference compounds into material infrastructure cost — and the study provides an empirical basis for model selection decisions that go beyond accuracy.
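To make that compounding concrete, here is a minimal back-of-envelope sketch. The one-point-five-million-token delta per task comes from the study; the input price and daily session count are illustrative assumptions, not figures from the paper or any vendor price list.

```python
# Illustrative cost impact of model choice on agentic workloads.
# EXTRA_TOKENS_PER_TASK is the study's reported average delta;
# the price and session volume below are assumptions for illustration.
EXTRA_TOKENS_PER_TASK = 1_500_000   # extra tokens vs. the cheapest model, per task
INPUT_PRICE_PER_M = 2.50            # assumed dollars per million input tokens
SESSIONS_PER_DAY = 500              # hypothetical CI / parallel-session volume

# Extra spend attributable purely to model choice.
extra_cost_per_task = EXTRA_TOKENS_PER_TASK / 1e6 * INPUT_PRICE_PER_M
extra_cost_per_day = extra_cost_per_task * SESSIONS_PER_DAY
```

At these assumed numbers the delta is a few dollars per task — which, multiplied across hundreds of daily sessions, is exactly the "material infrastructure cost" the study points at.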
The paper's central number: AI agents consume approximately one thousand times more tokens than conventional code chat.
And models cannot reliably predict their own consumption before execution. The maximum prediction correlation was zero point thirty-nine, with systematic underestimation. Agent self-budgeting estimates cannot be used as scheduling inputs without an empirical calibration layer.
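One workable shape for that calibration layer: budget each task class from measured trajectories rather than from the model's own estimate, and schedule at a high percentile to absorb the run-to-run variance. A minimal sketch — the run counts, token figures, and percentile choice are assumptions for illustration, not values from the paper:

```python
def empirical_token_budget(observed_token_counts, percentile=0.95):
    """Derive a scheduling budget from measured agent trajectories.

    observed_token_counts: total tokens consumed in past runs of the
    same task class (measured after execution, not model-predicted).
    Uses the nearest-rank method for the percentile.
    """
    ranked = sorted(observed_token_counts)
    idx = min(len(ranked) - 1, int(percentile * len(ranked)))
    return ranked[idx]

# Hypothetical measurements for one task class: same task, large
# run-to-run spread (the study reports variance of up to ~30x).
runs = [120_000, 180_000, 150_000, 2_400_000,
        210_000, 95_000, 300_000, 1_100_000]
budget = empirical_token_budget(runs, percentile=0.95)
```

Budgeting at a high percentile trades some idle headroom for not killing the occasional long-but-successful trajectory — the study's finding that cost does not correlate with outcome is the argument for measuring rather than capping aggressively.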
Put the two stories together: GPT-5.5 at twice the base price, with agents consuming a thousand times more tokens than code chat. Enterprise agent budgets have just been repriced by orders of magnitude. Measure trajectories empirically, or budget blind.
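The combined repricing can be sketched with simple arithmetic. The per-million-token prices are the ones quoted above; the token volumes for the chat and agent sessions are illustrative assumptions chosen to reflect the study's roughly thousand-fold, input-heavy consumption pattern:

```python
# Per-million-token prices quoted in this episode.
PRICE_PER_M = {
    "gpt-5.4": {"input": 2.50, "output": 15.00},
    "gpt-5.5": {"input": 5.00, "output": 30.00},
}

def session_cost(model, input_tokens, output_tokens):
    """Dollar cost of one session at the listed per-million rates."""
    p = PRICE_PER_M[model]
    return (input_tokens / 1e6) * p["input"] + (output_tokens / 1e6) * p["output"]

# Assumed chat interaction: ~5k tokens in, ~1k out.
chat = session_cost("gpt-5.4", 5_000, 1_000)

# Assumed agent trajectory: ~1000x the tokens, input-heavy per the study.
agent = session_cost("gpt-5.5", 5_000_000, 200_000)
```

Under these assumptions a single agent trajectory costs on the order of a thousand times one chat turn — the price doubling and the consumption multiplier stack, which is why the budget change is measured in orders of magnitude, not percentage points.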
Two governance moves close the week. The first is an opening. On April 27, OpenAI received FedRAMP 20x Moderate authorization for ChatGPT Enterprise and its API platform — unlocking federal procurement of frontier models at American civilian agencies. FedRAMP 20x is a streamlined pathway the GSA announced in March 2025, replacing manual documentation packages with cloud-native security evidence, Key Security Indicators, and automated validation.
GPT-5.5 is already available in the FedRAMP environment. Codex Cloud will also be accessible via ChatGPT Enterprise FedRAMP workspaces — for agencies with active software modernization programs, a Moderate-level authorized agentic coding environment removes a procurement barrier. For architects serving federal contractors: the reusable authorization package is in OpenAI's Trust Portal. Each agency retains authority to impose additional controls before production. FedRAMP Moderate covers controlled unclassified information — it does not extend to High or national security workloads.
The second governance move is a warning. A paper published April 24 on arXiv by Gauri Sharma and Maryam Molamohammadi maps a structural flaw in AI supply chains for hiring. Modern AI hiring systems operate across four layers — data suppliers, model developers, platform providers, and deployer organizations — and this architecture creates accountability gaps that neither the EU AI Act, nor NYC Local Law 144, nor the Colorado AI Act are structured to close.
The mechanics of the problem: a résumé parser may produce no measurable bias in isolation, but contribute to discriminatory outcomes when integrated with specific ranking algorithms and filtering thresholds. Each actor in the chain can demonstrate individual compliance. The integrated system can produce biased results regardless. And this is not theoretical — it is the standard architecture of any modern HR tech stack.
Under the EU AI Act, AI hiring tools sit at the highest risk level — and it is deployers, not upstream suppliers, who bear primary legal responsibility for fundamental rights impact assessments. If bias originates in the supplier's model or training data, the deploying organization remains the regulatory target, with no guaranteed right of technical access to diagnose the root cause. For legal and compliance teams negotiating AI vendor contracts: system-level audit rights, configuration disclosure, and shared liability clauses are not optional additions. They are the only mechanism that closes the gap current law leaves open.
That was this Monday's Wire at ai|expert. Capital, contracts, and cost per token all moved together this week — and the AI stack companies are buying looks materially different from what it did last Friday. In Friday's Edition, we go deep on the sovereignty thread: the DeepMind–South Korea pact, and what Canonical's Ubuntu AI roadmap means for on-prem. Until then.