China Scales Domestic GPU Clusters; Moore Threads, Huawei, Alibaba Each Deploy 10,000-Card Infrastructure
China is rapidly expanding homegrown AI compute infrastructure, with multiple domestic chip makers now operating 10,000-GPU clusters to reduce dependence on export-restricted NVIDIA silicon and build sovereign AI capacity. Moore Threads unveiled its 'Kua E' 10,000-GPU intelligent computing cluster, rated at 10 exaflops of floating-point compute, and reported 60% MFU (model FLOPs utilization) on dense large-model training and 40% on mixture-of-experts models. Shenzhen activated China's first 10,000-card cluster powered by Huawei Ascend 910C AI chips, delivering 11,000 petaflops of computing power. Alibaba has also launched a 10,000-card cluster built on T-Head Zhenwu chips, with a 92% booking rate among the roughly 50 institutes that signed framework agreements.
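The headline figures above can be sanity-checked with simple arithmetic. The sketch below is a back-of-envelope illustration, not a benchmark: it assumes the quoted numbers are aggregate peak throughput, that each cluster has exactly 10,000 cards, and that MFU applies to that peak. The numeric precision behind each figure is not stated in the reporting, so the two clusters are not directly comparable. The function names are illustrative only.

```python
# Back-of-envelope arithmetic on the quoted cluster figures.
# Assumptions (not from the vendors): figures are aggregate peak throughput,
# card counts are exactly 10,000, and precision (FP16, FP8, ...) is unknown.

PFLOPS_PER_EFLOPS = 1_000  # 1 exaflop = 1,000 petaflops


def per_card_pflops(cluster_pflops: float, cards: int) -> float:
    """Average peak petaflops per card implied by a cluster-level figure."""
    if cards <= 0:
        raise ValueError("cards must be positive")
    return cluster_pflops / cards


def effective_pflops(peak_pflops: float, mfu: float) -> float:
    """Throughput actually spent on model FLOPs at a given MFU (0..1)."""
    if not 0.0 <= mfu <= 1.0:
        raise ValueError("mfu must be between 0 and 1")
    return peak_pflops * mfu


kua_e_peak = 10 * PFLOPS_PER_EFLOPS  # Moore Threads: 10 exaflops
ascend_peak = 11_000                 # Shenzhen Ascend cluster: 11,000 petaflops

kua_e_per_card = per_card_pflops(kua_e_peak, 10_000)    # about 1.0 PFLOPS
ascend_per_card = per_card_pflops(ascend_peak, 10_000)  # about 1.1 PFLOPS

kua_e_dense = effective_pflops(kua_e_peak, 0.60)  # dense training, 60% MFU
kua_e_moe = effective_pflops(kua_e_peak, 0.40)    # mixture-of-experts, 40% MFU

print(f"Kua E per card:  {kua_e_per_card:.2f} PFLOPS")
print(f"Ascend per card: {ascend_per_card:.2f} PFLOPS")
print(f"Kua E effective (dense): {kua_e_dense:,.0f} PFLOPS")
print(f"Kua E effective (MoE):   {kua_e_moe:,.0f} PFLOPS")
```

Under these assumptions, the reported MFU values imply that roughly 6 of Kua E's 10 peak exaflops do useful model work on dense training and roughly 4 on mixture-of-experts training.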
These clusters represent a strategic shift away from reliance on NVIDIA H100s and H200s, which are restricted under US export controls, toward indigenous chip stacks optimized for China's fragmented internet ecosystem and state AI priorities. Huawei's Ascend 910C achieves roughly 60% of NVIDIA H100 inference performance, enough for large-scale deployment. T-Head's Zhenwu 810E (unveiled January 2026) advertises performance comparable to the NVIDIA H20, a deliberately weakened chip designed to comply with export controls. Cambricon's Siyuan 590/690 lineup, Baidu's Kunlun P800, and MetaX's C600 all now ship in multi-thousand-unit clusters, with CATL and domestic hyperscalers coordinating power and cooling infrastructure.
The mega-cluster buildout reflects broader capacity expansion: Huawei targets 600,000 Ascend 910C units in 2026; Cambricon is targeting roughly 500,000 chips in 2026, about 300,000 of them Siyuan 590 and 690 units combined; Alibaba shipped about 270,000 Zhenwu chips in 2025 and is ramping faster. Collectively, Chinese AI chipmakers shipped an estimated 500,000 to 800,000 domestic AI chips in 2025. The state-backed National AI Industry Fund has prioritized infrastructure buildout, and regional governments are offering power and land deals aligned with industrial policy.
For practitioners: Chinese clusters are reaching parity on sheer scale and organization, which is eroding US cost advantages in frontier-scale training. If your model or application must serve inference at continental scale, and you face data-residency constraints or want to reduce lock-in to Western APIs, track these clusters for availability, pricing, and performance. Expect continued price compression and a wider choice of models. For Western teams, this signals an accelerating shift in the geography of AI training: the era of US-dominated compute is giving way to multiple regional stacks with differing cost profiles and sovereignty tradeoffs.
Sources
- Primary source
- huaweicentral.com
“China's first 10000 AI card cluster is capable of delivering a computing power of 11000 petaflops”
- the-substrate.net
“Cambricon shipped an estimated 100,000-200,000 Siyuan 590s in 2025 and is targeting roughly 500,000 chips in 2026”
- globaldatacenterhub.com
“China is preparing the conditions to consolidate over 1 million GPUs into centralized AI clusters by 2026”