AI chip bottleneck has shifted: CoWoS packaging now the binding constraint through 2026
The binding constraint on the AI compute buildout has shifted from wafer supply to advanced packaging. Chip-on-Wafer-on-Substrate (CoWoS), the process that co-packages High-Bandwidth Memory (HBM) with AI accelerators at the density modern workloads require, is now TSMC's primary bottleneck. CEO C.C. Wei stated publicly that CoWoS capacity is 'sold out through 2025 and into 2026,' and TrendForce projects roughly 120,000–130,000 wafers per month by the end of 2026, up from 75,000 in 2025. Analysts note, however, that the expansion is unlikely to close the gap with demand. NVIDIA has reportedly secured over 70% of TSMC's CoWoS-L capacity, leaving the remaining allocation to be split among AMD, Broadcom, Marvell, and others. The result is a structural constraint that compounds with tight HBM3E supply. Without CoWoS capacity, a wafer is not a finished AI accelerator; it is silicon waiting for a process more constrained than the silicon itself.
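To make the scale concrete, the figures quoted above can be worked through in a short Python sketch. The capacity numbers and the 70% share are the ones cited in this article; applying the CoWoS-L share to total CoWoS capacity is a simplifying assumption, and the helper names are hypothetical.

```python
# Illustrative arithmetic only. Capacity figures and the 70% share are the
# ones quoted in the article. Treating the CoWoS-L share as a share of total
# CoWoS capacity is a simplifying assumption, not a reported fact.

WAFERS_PER_MONTH_2025 = 75_000
WAFERS_PER_MONTH_2026_RANGE = (120_000, 130_000)  # projected, end of 2026
NVIDIA_SHARE = 0.70  # reported as "over 70%", so treated here as a lower bound


def growth_pct(old: int, new: int) -> float:
    """Percentage growth from old to new."""
    return (new - old) / old * 100


def remaining_wafers(total: int, dominant_share: float) -> int:
    """Wafers per month left for all other buyers (hypothetical helper)."""
    return round(total * (1 - dominant_share))


low, high = WAFERS_PER_MONTH_2026_RANGE
growth_low = growth_pct(WAFERS_PER_MONTH_2025, low)
growth_high = growth_pct(WAFERS_PER_MONTH_2025, high)
others_low = remaining_wafers(low, NVIDIA_SHARE)
others_high = remaining_wafers(high, NVIDIA_SHARE)

print(f"Capacity growth 2025 to 2026: {growth_low:.0f}% to {growth_high:.0f}%")
print(f"Left for other buyers: at most {others_low:,} to {others_high:,} wafers/month")
```

Under these assumptions, capacity grows by roughly 60% to 73%, while every buyer other than the largest shares at most about 36,000 to 39,000 wafers per month. Swapping in a different share or capacity estimate only requires changing the constants at the top.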
Geopolitical export controls tighten the bottleneck further. In March 2026, regulatory uncertainty over H200 sales to China forced NVIDIA to redirect TSMC capacity from H200 production to next-generation Vera Rubin chips with confirmed US orders from OpenAI, Google, and other American firms. Less-advanced AI chips like the H200 consume the same constrained CoWoS and HBM capacity as frontier chips, so the two compete directly for packaging slots. OpenAI CEO Sam Altman put it plainly: 'The bottleneck goes back and forth. Right now, again, it's chips.' Hyperscalers are responding by developing custom silicon (Google TPUs, AWS Trainium, Microsoft Maia, Meta MTIA), but this accelerates market fragmentation rather than solving the underlying supply shortage.
For architects planning 2026–2027 capacity, CoWoS allocation is now the binding constraint, not compute demand or capital. Long-term buyers (Microsoft, Google, Amazon, Meta) are securing multi-year allocations, leaving smaller players and startups queued in a backlog. Even power and grid capacity, the previous binding constraint, are less scarce than packaging slots. The choice between designing for NVIDIA hardware and designing for custom silicon now carries supply-chain-driven ROI implications that dwarf traditional performance metrics. The structural supply tightness is expected to persist through 2027–2028: 30–50% of planned 2026 data center capacity has slipped to 2028 because of power grid interconnection queues, which extends component demand pressure into subsequent years.