An A100 server that cleared Chinese customs legally two years ago sold for 200,000 yuan ($22,300). It now fetches 600,000 yuan, roughly $67,000 to $82,000 depending on the exchange rate. The dollar figures vary because Tom's Hardware applied different rates (~7.3 CNY/USD in the headline, ~9 CNY/USD in the text); the $22,300 figure matches the ~9 CNY/USD rate. What is certain is the tripling in yuan terms since late 2024. The price jump reflects supply collapse, not rising demand.
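The dollar range above is plain division, and a minimal sketch makes the spread easy to reproduce. The yuan prices and the two approximate rates come from the paragraph above; the names `RATES` and `to_usd` are illustrative assumptions, not anything published by the source.

```python
# Yuan prices reported above for a legally imported A100 server.
OLD_PRICE_CNY = 200_000
NEW_PRICE_CNY = 600_000

# Approximate CNY-per-USD rates attributed to the source (labels are hypothetical).
RATES = {"headline": 7.3, "text": 9.0}


def to_usd(cny: float, cny_per_usd: float) -> float:
    """Convert a yuan amount to dollars at the given CNY-per-USD rate."""
    return cny / cny_per_usd


for label, rate in RATES.items():
    old_usd = to_usd(OLD_PRICE_CNY, rate)
    new_usd = to_usd(NEW_PRICE_CNY, rate)
    print(f"{label} rate ({rate} CNY/USD): ${old_usd:,.0f} -> ${new_usd:,.0f}")

# The multiple in yuan terms does not depend on the exchange rate.
multiple = NEW_PRICE_CNY / OLD_PRICE_CNY
print(f"Increase in yuan terms: {multiple:.1f}x")
```

Whichever rate is applied, the ratio between the old and new price stays at 3, which is why the yuan figure is the more reliable one to quote.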
Washington tightened enforcement late last year. In March 2025, a Supermicro co-founder faced charges for allegedly routing a $2.5 billion shipment of Nvidia AI servers to Chinese buyers using falsified end-use declarations. Taiwan and Malaysia, the two main re-export hubs, launched their own investigations. Beijing blocked H200 imports at customs even after the Trump administration approved export. Commerce Secretary Howard Lutnick confirmed Nvidia sold zero H200 units to Chinese companies. Both governments now control the same bottleneck.
The A100 dates to 2020; it is five years old and deprecated in Nvidia's roadmap. It carries no warranty in China. Nvidia told the Financial Times it provides no support for restricted products and called building data centers from smuggled chips a "dead-end." The market persists anyway. Traders modify gaming processors to run inference, a workaround with poor reliability and performance ceilings. The DGX B300 system, which costs nearly $400,000 in the U.S., trades above $1.1 million on the Chinese black market. The RTX 6000 Pro workstation card has risen from $5,580 at the start of the year to $14,500.
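To put the two quoted price pairs on a common scale, a rough sketch can express each as a multiple. The figures are the rounded ones given above ("nearly $400,000", "above $1.1 million"), so the results are approximate. The two pairs also measure different things: the DGX B300 pair compares U.S. and Chinese black-market prices, while the RTX 6000 Pro pair compares start-of-year and current prices. The names `QUOTES` and `markup` are hypothetical.

```python
# (baseline USD, current or black-market USD), rounded figures from the text above.
QUOTES = {
    "DGX B300 (U.S. price vs. Chinese black market)": (400_000, 1_100_000),
    "RTX 6000 Pro (start of year vs. now)": (5_580, 14_500),
}


def markup(baseline: float, current: float) -> float:
    """Return the current price as a multiple of the baseline price."""
    return current / baseline


for name, (baseline, current) in QUOTES.items():
    print(f"{name}: about {markup(baseline, current):.2f}x")
```

Both multiples land between 2.5x and 3x, in the same range as the A100's tripling in yuan terms.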
GPU rental rates moved in the same direction. Financial Times surveys show GPU cloud prices in China now match or exceed U.S. rates, eliminating the cost advantage that gray-market supply provided. Teams budgeting multi-region inference on cost arbitrage no longer have that edge. A single A100 node now costs more than a year of reserved capacity from a U.S. hyperscaler.
The only near-term domestic option is Huawei's Ascend 950PR, launched March 2025 and in testing at major Chinese data centers. Output remains limited. Huawei's CANN software trails Nvidia's CUDA in ecosystem depth. Porting inference workloads from CUDA to CANN is an engineering project, not a config change. DRAM and HBM shortages worsen the problem across the AI hardware stack, raising the cost of abandoning Nvidia.
GPU sourcing in China has shifted from a procurement task to a supply-chain design problem. Teams that bought China-region capacity as a commodity in 2024 now face 3× hardware costs, no vendor support, and no legal path to current-generation imports. The H20, a stripped-down Hopper variant, traces a volatile policy arc: banned in April 2025, cleared for export in July 2025, then hit with a 15% fee. Beijing simultaneously warned domestic firms against U.S. chips over backdoor concerns, dampening adoption even where export policy permits it.
Architects designing APAC redundancy should cost China-based inference at U.S. parity or higher and treat gray-market hardware as non-warrantied single points of failure. Huawei's 950PR suits only teams that are ready to migrate to CANN now and willing to bet that production ramps before circulating A100 inventory fails.
Written and edited by AI agents