SAT, JUN 27, 2026 · Issue Nº 67
aiexpert
Chips

Cerebras and OpenAI sign $20B+ deal for 750MW high-speed AI inference capacity deployment

Cerebras Systems and OpenAI announced a multi-year agreement on June 23 under which OpenAI will deploy 750 megawatts of Cerebras' wafer-scale inference compute. The deal is valued at more than $20 billion, with rollout starting in 2026. It is the largest high-speed AI inference deployment announced to date and reflects a strategic pivot toward dedicated low-latency inference silicon, as distinct from the GPU-centric training infrastructure that has dominated AI capex.

<cite index="42-2">OpenAI states that "Cerebras adds a dedicated low-latency inference solution to our platform. That means faster responses, more natural interactions, and a stronger foundation to scale real-time AI to many more people."</cite> <cite index="44-2">Cerebras simultaneously launched a multi-year partnership with AWS that brings a disaggregated inference strategy: AWS's Trainium 3 chips perform the prefill, and Cerebras CS-3 runs blisteringly fast inference for decode.</cite> In the AWS arrangement, context encoding (prefill) and token generation (decode) are decoupled and run on different hardware, each matched to the phase it handles best.
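To make the prefill/decode split concrete, here is a minimal toy sketch in Python. It is not Cerebras' or AWS's implementation: the names `KVCache`, `prefill`, `decode`, and `serve` are hypothetical, and the "model" is a placeholder that emits counter tokens. It only illustrates the handoff, in which one stage encodes the prompt into a cache and a second stage generates tokens from that cache.

```python
from dataclasses import dataclass, field


@dataclass
class KVCache:
    """Stand-in for the attention key/value state produced by prefill."""
    tokens: list[str] = field(default_factory=list)


def prefill(prompt: str) -> KVCache:
    """Prefill stage: process the whole prompt in a single pass."""
    return KVCache(tokens=prompt.split())


def decode(cache: KVCache, max_new_tokens: int) -> list[str]:
    """Decode stage: emit tokens one at a time, extending the cache each step."""
    generated = []
    for step in range(max_new_tokens):
        # Toy "model": the token records the step and the current context size.
        token = f"tok{step}@{len(cache.tokens)}"
        cache.tokens.append(token)
        generated.append(token)
    return generated


def serve(prompt: str, max_new_tokens: int) -> list[str]:
    """Run prefill, hand the cache over, then run decode."""
    cache = prefill(prompt)  # would run on the prefill hardware
    # In a disaggregated deployment the cache would cross the network here.
    return decode(cache, max_new_tokens)  # would run on the decode hardware


if __name__ == "__main__":
    print(serve("explain wafer scale inference", 3))
```

The point of the split is that prefill is one large parallel pass over the prompt, while decode is a long sequence of small dependent steps, so the two phases can be placed on hardware with different strengths.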

<cite index="44-2">Cerebras co-launched Codex-Spark, a model designed for near-instant coding and optimized for interactive work where latency matters, delivering more than 1,000 tokens per second.</cite> <cite index="44-2">Kimi K2.6, the leading open-weight frontier model and the first trillion-parameter model served on Cerebras, achieved performance approaching 1,000 tokens per second as independently measured by Artificial Analysis.</cite> These throughput figures support the case for wafer-scale silicon in latency-sensitive agentic workloads.
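A rough sense of what 1,000 tokens per second means for interactive use can be had from simple division. The sketch below is illustrative only: the 2,000-token response length and the 100 tokens-per-second comparison point are assumptions, not figures from the article, and the calculation ignores time to first token and network latency.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to stream a response at a steady decode rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    response_tokens = 2_000  # hypothetical length of a code-generation reply
    rates = [
        ("1,000 tok/s (rate cited above)", 1_000.0),
        ("100 tok/s (assumed comparison point)", 100.0),
    ]
    for label, rate in rates:
        seconds = generation_seconds(response_tokens, rate)
        print(f"{label}: {seconds:.1f} s")
```

Under these assumptions the same reply takes about 2 seconds at the cited rate and about 20 seconds at the assumed slower rate, which is the difference between an agent loop that feels interactive and one that does not.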

For practitioners, the deal signals an inversion in AI infrastructure: training compute was the scarce resource in 2023–2024; inference is now the constraint. <cite index="47-2">The 750MW deployment agreement is roughly 23 times the midpoint of Cerebras' full-year 2026 revenue guidance</cite>, giving the company a degree of contracted revenue visibility that is rare among hardware vendors. OpenAI's $20B+ commitment also suggests that frontier-model providers will maintain dedicated inference tiers separate from hyperscaler commodity offerings. Expect additional inference capacity announcements from competitors (Groq, CoreWeave, others) and more hardware-software co-optimization as inference speed becomes a visible product differentiator for real-time AI agents.
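The headline numbers imply two back-of-envelope figures worth having in hand. The sketch below assumes the deal is worth exactly $20 billion (the article says "over", so treat the results as lower bounds) and that the 23x multiple is measured against that same value. Both are assumptions, and the function names are illustrative.

```python
DEAL_VALUE_USD = 20e9   # article: "over $20 billion", used here as a floor
CAPACITY_MW = 750       # article: 750 megawatts
REVENUE_MULTIPLE = 23   # article: roughly 23x the 2026 guidance midpoint


def implied_guidance_midpoint(deal_value: float, multiple: float) -> float:
    """Revenue guidance midpoint implied by the deal-to-guidance multiple."""
    return deal_value / multiple


def value_per_megawatt(deal_value: float, capacity_mw: float) -> float:
    """Contract value per megawatt of deployed capacity."""
    return deal_value / capacity_mw


if __name__ == "__main__":
    midpoint = implied_guidance_midpoint(DEAL_VALUE_USD, REVENUE_MULTIPLE)
    per_mw = value_per_megawatt(DEAL_VALUE_USD, CAPACITY_MW)
    print(f"Implied 2026 guidance midpoint: ${midpoint / 1e9:.2f}B")
    print(f"Contract value per MW: ${per_mw / 1e6:.1f}M")
```

On these assumptions the implied guidance midpoint is about $0.87 billion and the contract works out to roughly $26.7 million per megawatt.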

Sources