WIRE Ep. 9 · May 11, 2026 · 7:48

The week compute became rental — and Claude entered regulated banking core

The week the GPU shortage forced rivals to rent silicon from each other, while Claude agents landed in regulated banking core systems that process transactions for thousands of institutions.

Hosts: Alan · Ada

Transcript

JOHN

Two hundred twenty-two thousand NVIDIA GPUs.

MARIA

Elon Musk's SpaceX just rented them to Anthropic — a direct competitor.

JOHN

This is ai|expert Wire. The week compute became rental — and Claude entered regulated banking core.

JOHN

Colossus 1 was built to train xAI's Grok. Three hundred megawatts. H100, H200, and GB200 side by side. This week, those idle cycles began running Anthropic's Claude workloads. The first case of a frontier lab renting its primary training cluster to a direct rival. SpaceX itself was explicit: "the compute necessary to train and operate the next generation of these systems is outpacing what earthbound energy, physical space, and cooling can deliver in relevant timeframes."

MARIA

Elon Musk approved the deal personally after a meeting with Anthropic leadership. He wrote on X: "No one activated my badness detector." — a reversal from early this year, when he called Claude "misanthropic and malicious." The real signal is not the phrase. It is the immediate effect: Claude Code usage limits doubled for all paid tiers — Pro, Max, Team, and Enterprise — the same day. Compute purchased converts to available capacity in hours.

JOHN

The pattern matters more than the isolated deal. Anthropic has simultaneous capacity agreements with Amazon, Google, and Microsoft. No single source. When even rivals share silicon, a single-vendor stack becomes a capital risk, not merely a technical one.

MARIA

And the GPU-as-a-service market is confirming the pressure. CoreWeave published first-quarter 2026 results: revenue of $2.08 billion — more than double the $981.8 million from a year ago — above LSEG consensus of $1.97 billion. CEO Mike Intrator stated the company "reached hyperscaler scale." Ten customers already have individual commitments above one billion dollars.

JOHN

The revenue backlog is $99.4 billion.

MARIA

Capacity effectively pre-committed. CoreWeave revised its 2026 capex guidance to $31 to $35 billion, with plans to bring 1.7 gigawatts of power online by year-end, out of a total 3.5 gigawatts contracted. Anyone not in the queue for multi-year agreements will compete for increasingly scarce spot availability — at rising prices. The window to negotiate competitive hourly GPU rates is closing.

JOHN

On the hardware front, AMD launched the MI350P — a PCIe accelerator with 144 GB of HBM3E and 4 TB/s bandwidth. Theoretical peak metrics: 43% faster in FP16 and 39% faster in FP8 than NVIDIA's H200 NVL.

MARIA

The operational detail is the PCIe form factor. The card fits into existing air-cooled servers — no custom racks, no liquid-cooling contracts, no NVLink fabric. Up to eight cards per system. You scale inference incrementally, rather than buying eight GPUs at once as an indivisible block. For inference workloads where tokens per second per watt dominates the purchasing decision, that is the entry argument.

JOHN

The caveat persists: CUDA still dominates inference frameworks. AMD's ROCm is improving, but compatibility is not parity. Any team evaluating the MI350P needs to budget integration cycles that an NVIDIA deployment normally skips. Superior hardware on the benchmark does not guarantee workload gains if the software ecosystem does not keep pace.

MARIA

Lisa Su went further than the card. At AMD earnings, she revised the server CPU market growth forecast from 18% annually — the November projection — to over 35% annually, with total market of $120 billion by 2030. The argument: workloads are migrating from heavy GPU training to inference and CPU-intensive agentic execution. "Agents are generating tremendous demand across the entire AI adoption cycle," she told CNBC. Goldman Sachs upgraded AMD from hold to buy and raised the target from $240 to $450.

JOHN

The third vertex of the triangle is interconnect. NVIDIA closed a $3.2 billion warrant deal with Corning to build three new U.S. optical-fiber factories in North Carolina and Texas. Corning's optical capacity will increase tenfold. The objective: replace the five thousand copper cables inside Vera Rubin racks with co-packaged optical fiber.

MARIA

The physics: moving photons is between five and twenty times more energy-efficient than moving electrons, according to Corning CEO Wendell Weeks. But the strategic implication is starker: NVIDIA now controls the chip, network switching via Mellanox, and the interconnection medium. Anyone pricing multi-year AI infrastructure needs to put optical cabling as a primary cost line — and accept that price will be set by the same vendor selling the GPUs.

JOHN

Second segment. From what silicon costs to what it is doing. Regulated compliance, headcount, and production latency.

MARIA

Anthropic and FIS announced on May 4th an anti-money-laundering agent built on Claude. FIS processes transactions for thousands of financial institutions worldwide. The agent pulls complete dossiers from the bank's core systems, evaluates transactional activity against established typologies, prioritizes high-risk alerts, and drafts SARs — Suspicious Activity Reports. Investigations that took days now run in minutes.
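The alert-triage flow described above (pull dossier, score against typologies, prioritize, draft the SAR) can be sketched as a pipeline. This is a hypothetical illustration: the typology names, weights, and data shapes are assumptions, not the FIS/Anthropic design.

```python
# Hypothetical AML alert-triage sketch. Typologies, weights, and all
# field names are illustrative assumptions, not the actual FIS agent.
from dataclasses import dataclass, field

@dataclass
class Alert:
    account_id: str
    amount_usd: float
    typology_hits: list[str] = field(default_factory=list)

def risk_score(alert: Alert) -> float:
    """Toy scoring: weight each matched typology; unknown ones get 0.1."""
    weights = {"structuring": 0.5, "rapid_movement": 0.4, "high_risk_geo": 0.3}
    return sum(weights.get(t, 0.1) for t in alert.typology_hits)

def prioritize(alerts: list[Alert]) -> list[Alert]:
    """Highest combined risk surfaces first for agent (or human) review."""
    return sorted(alerts, key=risk_score, reverse=True)

alerts = [
    Alert("A-1", 9_800.0, ["structuring"]),
    Alert("A-2", 120_000.0, ["rapid_movement", "high_risk_geo"]),
    Alert("A-3", 500.0, []),
]
print([a.account_id for a in prioritize(alerts)])  # → ['A-2', 'A-1', 'A-3']
```

In the real deployment the scoring and SAR drafting would be the model's job; the point of the sketch is the ordering step, where the agent narrows the investigator's queue from everything to the highest-risk dossiers first.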

JOHN

Financial institutions in the U.S. spend between $35 and $40 billion per year on AML operations alone. BMO and Amalgamated Bank are in active development. General availability to FIS's customer base is planned for the second half of 2026. For Anthropic, FIS is not a pilot at one bank — it is a wedge of distribution that achieves scale no direct bank-to-bank deal could replicate. Anthropic engineers are embedded within FIS to co-design the agent and transfer knowledge. This is a flagship relationship, not an API license.

MARIA

What distinguishes this architecture from any AI pilot in banking is data control. Customer data stays within FIS-controlled infrastructure. Every agent decision is traceable and auditable. This removes the primary argument that blocked AI adoption in banking over the past two years — that compliance would require separate data-residency agreements with the model vendor.

JOHN

FIS CEO Stephanie Ferris: "Every bank in the world wants AI that acts, not just observes."

MARIA

But there is an unresolved question that will determine adoption velocity at scale: whether regulators will accept "auditable agent decision" as equivalent to "documented human judgment." That signal has not yet come. It is the real risk of this deployment — not the technology itself.

JOHN

That same week, Cloudflare cut 1,100 employees — over 20% of its global workforce. CEO Matthew Prince announced on the earnings call that agentic AI "fundamentally changed" the company's work. Internal AI usage grew over 600% in the previous three months.

MARIA

What distinguishes the Cloudflare case from conventional cost-cutting is that first-quarter revenue was $640 million — up 34% year-over-year, above consensus of $622 million. The annual guidance of $2.805 to $2.813 billion also exceeds consensus. The company did not cut because it is struggling. It cut because agents replaced work that previously required humans, while growth continued.

JOHN

The market responded with a 24% drop the next day, despite the quarter beating consensus. Investors have not yet reached consensus: is this a structural productivity advantage, or execution risk from paring product and support teams too thin? For a CTO, the question is more objective. A cloud infrastructure company with nearly three billion in annual revenue is reporting headcount reduction as a present result, not as a projection of future ROI.

MARIA

The last move of this week is infrastructure. OpenAI launched a WebSocket mode for the Responses API, in alpha. Instead of each tool call and each reasoning step opening a complete new HTTP handshake, a single persistent connection sustains the entire agentic session.

JOHN

Numbers confirmed in production by early adopters: up to 40% latency reduction. Sustained throughput of one thousand transactions per second, with peaks of four thousand TPS. Vercel reported 40% improvement when integrating the mode into their AI SDK. Cline recorded 39% gains on multi-file workflows. Cursor reported gains up to 30%. These are transport-layer gains — independent of any model improvement.
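The transport-layer arithmetic behind those gains can be sketched with a toy model: per-call HTTP pays a connection handshake on every step of an agentic session, while a persistent connection pays it once. The handshake and call costs below are assumed round numbers for illustration, not measurements.

```python
# Toy latency model of per-call HTTP vs a persistent connection.
# HANDSHAKE_MS and CALL_MS are illustrative assumptions, not measurements.
HANDSHAKE_MS = 80.0   # assumed TCP + TLS connection setup cost
CALL_MS = 120.0       # assumed per-call processing and transfer time

def http_session_latency(n_calls: int) -> float:
    """Each call opens a fresh connection: handshake + call, n times."""
    return n_calls * (HANDSHAKE_MS + CALL_MS)

def websocket_session_latency(n_calls: int) -> float:
    """One persistent connection: a single handshake, then n calls."""
    return HANDSHAKE_MS + n_calls * CALL_MS

n = 20  # a multi-step agentic session
saved = 1 - websocket_session_latency(n) / http_session_latency(n)
print(f"transport overhead saved: {saved:.0%}")  # → 38%
```

The saving grows with session length and with handshake cost relative to call cost, which is why multi-file agentic workflows (many small calls) report the largest gains.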

MARIA

The caveat is architectural. WebSocket requires connection lifecycle management as a first-class concern: how long the connection stays open, how backpressure is handled in concurrency peaks, how to ensure resilience in distributed deployments. Teams that built purely stateless pipelines will need to rethink session management. The feature is compatible with Zero Data Retention — which removes the compliance objection for most enterprise cases. But it is still in alpha: the API surface may change before general availability.
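The lifecycle concerns above (idle timeout, bounded backpressure) can be modeled without any real network. The sketch below is a hypothetical policy skeleton using `asyncio` primitives: a bounded queue applies backpressure, and an idle timeout closes the persistent session. It does not call any OpenAI API.

```python
# Hypothetical connection-lifecycle policy sketch: a bounded send queue
# applies backpressure; an idle timeout closes the persistent session.
# No real WebSocket or OpenAI API is used; this models the policy only.
import asyncio

class AgentSession:
    def __init__(self, max_inflight: int = 4, idle_timeout: float = 0.2):
        self.queue = asyncio.Queue(maxsize=max_inflight)  # backpressure
        self.idle_timeout = idle_timeout
        self.closed = False

    async def send(self, msg: str) -> None:
        # Blocks (applies backpressure) once max_inflight messages queue up.
        await self.queue.put(msg)

    async def run(self) -> list[str]:
        handled = []
        while True:
            try:
                msg = await asyncio.wait_for(self.queue.get(), self.idle_timeout)
            except asyncio.TimeoutError:
                self.closed = True   # idle timeout: tear down the session
                return handled
            handled.append(msg)

async def main() -> list[str]:
    session = AgentSession()
    consumer = asyncio.create_task(session.run())
    for step in ["plan", "tool_call", "observe"]:
        await session.send(step)
    return await consumer

print(asyncio.run(main()))  # → ['plan', 'tool_call', 'observe']
```

Teams coming from stateless request/response pipelines end up writing exactly this kind of policy code: how full the in-flight window may get, and how long an idle connection is worth keeping open.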

JOHN

The synthesis: the AI bottleneck has shifted. It is no longer just model access. It is access to silicon, to interconnection, and to the infrastructure that makes the agent operate in real time — in compliance, at scale, with latency the business accepts.

JOHN

Compute became an arbitrage asset. Agents entered banking compliance. And the infrastructure sustaining both is being redesigned at the same time. Friday on Edition: the study showing two thirds of votes on LLM leaderboards cancel out — and what that does to your model-selection matrix. Good week.