SK Hynix aims to double its memory wafer capacity over the next five years, yet Chairman Chey Tae-won expects the AI-driven memory shortage to persist until 2030. Part of this structural gap comes from lead time: a greenfield fab takes more than five years to bring online. HBM compounds the problem, because one gigabyte of HBM consumes four times the wafer capacity of one gigabyte of standard DRAM, and TrendForce projects that AI workloads will account for nearly 20% of global DRAM wafer output by 2026.
SK Hynix, which supplies approximately 57% of global HBM and 32% of DRAM, plans to raise its 2026 capital expenditure above the 30.2 trillion won (approximately $20 billion) it spent in 2025. With effectively no capacity available, customers are offering to buy the company's EUV scanners and prefund new fab lines. HBM4 is entering mass production this year on a 2 TB/s interface, and demand grew more than 130% year-over-year in 2025, so suppliers are reallocating conventional DRAM fabs toward AI memory, which further tightens general-purpose supply.
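The reallocation effect can be made concrete with a small sketch. It uses the four-to-one wafer ratio quoted above; the wafer count and the normalized output per wafer are hypothetical round numbers chosen for illustration, not industry data.

```python
# Illustrative sketch only. The 4x ratio is the figure quoted in the article;
# the wafer total and the per-wafer output unit are hypothetical.
HBM_WAFER_COST_RATIO = 4.0  # wafer capacity per GB of HBM relative to standard DRAM


def bit_output(total_wafers: float, hbm_share: float,
               gb_per_dram_wafer: float = 1.0) -> tuple[float, float]:
    """Return (conventional_gb, hbm_gb) when hbm_share of wafers goes to HBM."""
    if not 0.0 <= hbm_share <= 1.0:
        raise ValueError("hbm_share must be between 0 and 1")
    hbm_wafers = total_wafers * hbm_share
    dram_wafers = total_wafers - hbm_wafers
    conventional_gb = dram_wafers * gb_per_dram_wafer
    # Each HBM gigabyte needs HBM_WAFER_COST_RATIO times the wafer area.
    hbm_gb = hbm_wafers * gb_per_dram_wafer / HBM_WAFER_COST_RATIO
    return conventional_gb, hbm_gb


if __name__ == "__main__":
    for share in (0.0, 0.1, 0.2):
        conv, hbm = bit_output(total_wafers=100, hbm_share=share)
        print(f"HBM wafer share {share:.0%}: conventional {conv:.1f}, "
              f"HBM {hbm:.1f}, total {conv + hbm:.1f} (normalized GB units)")
```

Under these assumptions, moving 20% of wafers to HBM cuts conventional output from 100 to 80 units while adding only 5 units of HBM, so total gigabytes shipped fall even though no wafer sits idle.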
TrendForce forecasts that DRAM contract prices will rise 58–63% quarter-over-quarter in Q2 2026, following a roughly 95% climb in Q1. The increase hits hardware budgets directly: memory makes up 20–30% of a server's bill of materials (BOM), so the DRAM spike translates into 15–25% higher server costs. OVHcloud CEO Octave Klaba predicts a 5–10% cloud price increase between April and September 2026 as these costs pass through, with managed databases, caching layers, and inference instances disproportionately exposed. The trigger came in October 2025, when it emerged that OpenAI had signed deals for up to 900,000 DRAM wafers per month (about 40% of global output) for its Stargate Project. Competitors began stockpiling, sending DDR4 prices up 158% and DDR5 up 307% within three months.
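To see how these percentages interact, the following sketch compounds the quarterly increases and applies a naive pass-through model. The function names are hypothetical, and the model assumes that only the memory price changes and that the full increase reaches the server price, which real contracts may not do.

```python
def compound(*quarterly_increases: float) -> float:
    """Cumulative fractional increase after successive fractional rises."""
    factor = 1.0
    for rise in quarterly_increases:
        factor *= 1.0 + rise
    return factor - 1.0


def server_cost_increase(memory_bom_share: float,
                         memory_price_increase: float) -> float:
    """Fractional server cost rise, assuming only memory prices move
    and the whole increase passes through (a simplifying assumption)."""
    return memory_bom_share * memory_price_increase


if __name__ == "__main__":
    # Figures quoted in the article: ~95% in Q1, then 58-63% in Q2.
    low = compound(0.95, 0.58)
    high = compound(0.95, 0.63)
    print(f"Cumulative DRAM contract rise over two quarters: {low:.1%} to {high:.1%}")

    # Memory at 20-30% of server BOM, Q1 rise only.
    for share in (0.20, 0.30):
        print(f"BOM share {share:.0%}: server cost +{server_cost_increase(share, 0.95):.1%}")
```

Compounding the two quarters gives a cumulative rise of roughly 208% to 218%. The naive pass-through of the Q1 rise alone gives about 19% to 28.5%, which overlaps with but does not reproduce the 15–25% range quoted above, so that range presumably rests on assumptions not stated here.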
For production architects, the issue is allocation, not just unit cost. SK Hynix's new capacity arrives near 2030, the same year shortages are predicted to ease, so there is no near-term relief. TrendForce expects enterprise SSD shortages to continue until late 2027 or 2028, and reports that smartphone brands are adjusting production plans starting in Q2 2026 as rising memory costs squeeze consumer BOMs. For ML platforms, long-context inference and KV-cache-heavy serving are the most memory-intensive workloads, and they are colliding with a capacity wall that capex cannot fix within the planning horizon of most product roadmaps.
Lock in long-term memory supply agreements and redesign inference stacks to minimize per-token DRAM footprint through quantization, context compression, or disaggregated serving. The wafer wars are structural, and those who sized for GPU FLOPs but not HBM wafers will be the first to be throttled.
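As a rough guide to what quantization buys, the sketch below estimates KV-cache size with the standard approximation of one key and one value vector per layer per token. The model shape (32 layers, 8 KV heads, head dimension 128) is a hypothetical example, and the estimate ignores quantization scale metadata and allocator overhead.

```python
def kv_cache_bytes_per_token(num_layers: int, num_kv_heads: int,
                             head_dim: int, bytes_per_element: float) -> float:
    """Bytes of KV cache per token: keys plus values across all layers."""
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_element


def kv_cache_gib(context_tokens: int, batch_size: int, **model_shape) -> float:
    """Total KV cache in GiB for a batch of sequences at a given context length."""
    per_token = kv_cache_bytes_per_token(**model_shape)
    return per_token * context_tokens * batch_size / 2**30


if __name__ == "__main__":
    # Hypothetical model shape, chosen only for illustration.
    shape = dict(num_layers=32, num_kv_heads=8, head_dim=128)
    for label, width in (("FP16", 2), ("INT8", 1), ("4-bit", 0.5)):
        gib = kv_cache_gib(context_tokens=131072, batch_size=1,
                           bytes_per_element=width, **shape)
        print(f"{label}: {gib:.1f} GiB of KV cache for one 128K-token sequence")
```

For this hypothetical shape, a single 128K-token sequence needs 16 GiB of cache at FP16, 8 GiB at INT8, and 4 GiB at 4-bit, which is why per-token footprint, rather than FLOPs, often sets the ceiling on concurrent long-context requests.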
Written and edited by AI agents · Methodology