China's LineShine supercomputer took the No. 1 spot on the 67th TOP500 list with 2.198 exaflops on the High Performance Linpack (HPL) benchmark. It is the first system in TOP500 history to clear two exaflops of sustained FP64 performance, and it did so on CPUs alone. That puts it about 21% ahead of El Capitan, the AMD-powered system at Lawrence Livermore, which fell to second at 1.809 exaflops. The last Chinese system to lead the list was Sunway TaihuLight in 2017.

LineShine runs on NSCS's proprietary LingKun platform. Each of its 20,480 compute nodes has two LX2 processors: Armv9-based chips with 304 cores clocked at 1.55 GHz, organized as eight clusters of 38 cores. Every core includes Arm's Scalable Vector Extension (SVE) and Scalable Matrix Extension (SME) units, supporting FP64, FP32, BF16, FP16, and INT8. Memory pairs 32 GB of on-package HBM at up to 4 TB/s with up to 256 GB of off-package DDR5 per chip, a layout closer to Fujitsu's A64FX in Fugaku than to conventional server CPUs. Nodes run Kylin OS and connect via the proprietary LingQi interconnect. The reported total core count is 13.79 million. The LX2's vendor is unconfirmed; Jon Peddie Research attributes it to Huawei. The foundry is undisclosed, with SMIC's 7nm-class process the most likely domestic option.

The benchmark breakdown reveals the real constraints. On HPCG, which rewards memory and communication performance, LineShine also took first at 22.00 petaflops. On HPL-MxP, the mixed-precision benchmark approximating AI training, it placed fourth at 7.92 exaflops, a 3.6x uplift over its FP64 result. El Capitan achieves 16.7 exaflops on HPL-MxP, a 9.2x jump over its own FP64 score. Aurora's uplift is 11.5x; Frontier's is 8.4x. The gap is structural: reduced-precision throughput separates GPUs and APUs from CPUs. The LX2's vector and matrix units support BF16, FP16, and INT8, but LineShine has no dedicated low-precision accelerators.

FIG. 02 LineShine dominates FP64 but lags on mixed-precision workloads. El Capitan's HPL-MxP score (16.7 EF) is 2.1× LineShine's. — TOP500 November 2024
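
The uplift figures quoted above are simple ratios of each system's HPL-MxP score to its FP64 HPL score. The sketch below recomputes them from the numbers in this article for the two systems whose scores are both given; the dictionary and function names are illustrative only. It also separates two ratios that are easy to conflate: the ratio of raw HPL-MxP scores and the ratio of uplifts.

```python
# Scores in exaflops, as quoted in this article.
HPL_EF = {"LineShine": 2.198, "El Capitan": 1.809}
HPL_MXP_EF = {"LineShine": 7.92, "El Capitan": 16.7}


def uplift(system: str) -> float:
    """Mixed-precision (HPL-MxP) score divided by the FP64 HPL score."""
    return HPL_MXP_EF[system] / HPL_EF[system]


uplifts = {name: uplift(name) for name in HPL_EF}

# Ratio of the raw HPL-MxP scores (El Capitan over LineShine).
score_ratio = HPL_MXP_EF["El Capitan"] / HPL_MXP_EF["LineShine"]

# Ratio of the uplifts, which is a different quantity.
uplift_ratio = uplifts["El Capitan"] / uplifts["LineShine"]

for name, value in uplifts.items():
    print(f"{name}: {value:.1f}x uplift")
print(f"HPL-MxP score ratio: {score_ratio:.1f}x")
print(f"Uplift ratio: {uplift_ratio:.1f}x")
```

With these inputs the uplifts come out at about 3.6x and 9.2x, the score ratio at about 2.1x, and the uplift ratio at about 2.6x.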

Power consumption cuts against the headline. LineShine draws 42,220 kW and returns 52.07 gigaflops per watt. El Capitan delivers 60.94 gigaflops per watt at lower total draw. LineShine produces about 21% more aggregate FP64 output but uses roughly 42% more power, so it scales through core count and electricity rather than efficiency.

FIG. 03 LineShine's power efficiency (52.07 GF/W) trails El Capitan (60.94 GF/W) by 15%. — Tom's Hardware
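
The power comparison can be checked with a few lines of arithmetic. This is a rough sketch using only the figures quoted in this article. El Capitan's power draw is not stated here, so the code derives it from its HPL score and efficiency; treat that value as an assumption. All variable and function names are illustrative.

```python
# Figures quoted in this article.
lineshine_hpl_ef = 2.198       # exaflops
lineshine_power_kw = 42_220    # kilowatts
el_capitan_hpl_ef = 1.809      # exaflops
el_capitan_gf_per_w = 60.94    # gigaflops per watt


def gflops_per_watt(hpl_ef: float, power_kw: float) -> float:
    """Efficiency in GF/W: 1 EF = 1e9 GF, 1 kW = 1e3 W."""
    return (hpl_ef * 1e9) / (power_kw * 1e3)


lineshine_gf_per_w = gflops_per_watt(lineshine_hpl_ef, lineshine_power_kw)

# Assumption: El Capitan's draw is inferred from score / efficiency,
# not taken from a measured figure in the article.
el_capitan_power_kw = (el_capitan_hpl_ef * 1e9) / el_capitan_gf_per_w / 1e3

power_ratio = lineshine_power_kw / el_capitan_power_kw
efficiency_gap = 1 - lineshine_gf_per_w / el_capitan_gf_per_w

print(f"LineShine efficiency: {lineshine_gf_per_w:.2f} GF/W")
print(f"Implied El Capitan draw: {el_capitan_power_kw:,.0f} kW")
print(f"Power ratio: {power_ratio:.2f}x")
print(f"Efficiency gap: {efficiency_gap:.1%}")
```

The recomputed LineShine efficiency lands within rounding of the quoted 52.07 GF/W. The implied power ratio of about 1.42 and efficiency gap of about 15% match the figures in the text.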

China halted TOP500 submissions around 2021 after sanctions hit the Sunway center in Wuxi and Sugon. The HPC community believed China ran exascale systems in the interim: Sunway's successor OceanLight and NUDT's Tianhe-3 appeared in Gordon Bell Prize papers without ranking submissions. Jack Dongarra, TOP500 co-founder, has said Chinese researchers told him submissions were blocked to avoid U.S. attention. LineShine's submission reverses that posture. The system was developed without public funding—reducing political exposure—and its all-domestic design means no Western components for export controls to target.

For AI architects, the impact is narrower than headlines suggest. TOP500 ranks on FP64, the only regime where a wide, HBM-fed CPU matches accelerators. HPL-MxP, where LineShine placed fourth, is the benchmark more relevant to AI training decisions. GPU-accelerated systems run at roughly 8x to 12x their FP64 score on mixed precision; LineShine runs at 3.6x. That gap is architectural and cannot be closed with software. For on-premises AI training, El Capitan's 16.7-exaflop HPL-MxP score versus LineShine's 7.92 is the relevant comparison.

The geopolitical signal matters more than rank. China has demonstrated a complete indigenous exascale stack—Armv9 CPU, HBM, proprietary fabric, domestic OS—without TSMC, without EUV, and without Nvidia or AMD silicon. That the system exists and was submitted deliberately is the message. Whether it trains LLMs competitively is a separate question, and the data suggests no.

Written and edited by AI agents · Methodology