Intel Clearwater Forest Sacrifices Vector Width for Inference Throughput

Intel's Clearwater Forest Xeon 6+ flagship, designed to relieve CPU bottlenecks in GPU-backed inference pipelines, packs 288 single-threaded Darkmont E-cores and 576 MB of L3 cache into a 450 W TDP envelope. The processor omits simultaneous multithreading and AVX-512 to maximize core density. Built on Intel's 18A process, the top-tier Xeon 6990E+ combines twelve 18A compute tiles, three Intel 3 base tiles, and two Intel 7 I/O tiles carried over from Granite Rapids, interconnected by EMIB 2.5D links. The LGA 7529 socket is compatible with existing Xeon 6900P systems, so an upgrade is typically just a BIOS update with no rack rewiring. Memory capacity reaches 1.5 TB per socket across twelve DDR5-8000 channels, and the processor provides 96 PCIe 5.0 lanes and 64 CXL 2.0 lanes for accelerator connectivity. Intel complements the cores with dedicated hardware for cryptography (QAT), load balancing (DLB), and data movement (DSA, IAA), aimed at the scalar layers that typically occupy GPU resources.

Intel positions the Xeon 6990E+ for throughput-per-watt in AI orchestration rather than vector-heavy training tasks. The company claims a 2.26× performance increase and 1.55× better performance-per-watt over the 144-core Sierra Forest Xeon 6780E, and an average 30% per-thread advantage over AMD's 192-core EPYC 9965 in its own benchmarks. However, ServeTheHome notes that dividing the 2.26× uplift by the 2× increase in core count leaves a per-core generational improvement of roughly 13%, so most of the gain comes from core density rather than faster cores. TDP ranges from 330 W to 450 W for the 288-core SKUs, with an all-core turbo frequency of 2.8 GHz. Notably, the cores top out at AVX2, with no support for AVX-512 or AVX10, so inference kernels built for 512-bit vectors must fall back to narrower 256-bit code paths.
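The per-core estimate can be reproduced with a few lines of arithmetic. This is a minimal sketch that uses only the figures quoted above; the variable names are illustrative, and the simple division assumes throughput scales linearly with core count.

```python
# Figures quoted above: Intel's claimed uplift versus the 144-core Xeon 6780E.
total_uplift = 2.26               # claimed total throughput gain
old_cores, new_cores = 144, 288   # Sierra Forest vs Clearwater Forest flagship

# Assumption: throughput scales linearly with core count, so the
# per-core gain is whatever remains after dividing out the core ratio.
core_ratio = new_cores / old_cores
per_core_gain = total_uplift / core_ratio

print(f"core count ratio: {core_ratio:.2f}x")
print(f"per-core gain:    {(per_core_gain - 1) * 100:.0f}%")
```

Under that linear-scaling assumption, doubling the core count accounts for a factor of 2 of the claimed 2.26×, and the remaining factor of about 1.13 is attributable to the cores themselves.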

Supply and topology present significant challenges. Intel's VP of Data Center Silicon Engineering, Tim Wilson, stated that 18A wafer allocation is managed daily in some cases, advising procurement teams to view volume availability as a capacity reservation rather than a guaranteed catalog item. Kira Boyko, Intel's E-Core Xeon product line director, noted that customers who invested heavily in GPUs are now finding those GPUs idle because they lack the CPU infrastructure to feed them data quickly enough. At maximum capacity the flagship offers 5.33 GB of memory per core (1.5 TB across 288 cores), which may limit large-model inference caching or container density. Since Darkmont cores are strictly single-threaded, schedulers and licensing models that assume two logical processors per physical core must be retuned. SMT will not return until Coral Rapids, the third generation of Xeon 6+, which follows Diamond Rapids (the next generation, whose SMT status is undisclosed). The AVX2 ceiling means that CPU-native inference kernels compiled for AVX-512 must either fall back to narrower code paths or be offloaded to accelerators.
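One way a deployment script might handle the AVX2 ceiling is to check the CPU feature flags before choosing a kernel build. The sketch below is an assumption-laden illustration, not a description of any particular inference runtime: it reads the `flags` line that Linux exposes in `/proc/cpuinfo`, and the variant labels `"avx512"`, `"avx2"`, and `"generic"` are hypothetical names for whatever builds a project actually ships.

```python
from pathlib import Path


def read_cpu_flags(cpuinfo_path="/proc/cpuinfo"):
    """Return the CPU feature flags Linux reports, or an empty set if unavailable."""
    try:
        text = Path(cpuinfo_path).read_text()
    except OSError:
        # Non-Linux host or unreadable file: report no known flags.
        return set()
    for line in text.splitlines():
        if line.startswith("flags"):
            # Line format on x86 Linux: "flags\t\t: fpu vme ... avx2 ..."
            return set(line.split(":", 1)[1].split())
    return set()


def pick_kernel_variant(flags):
    """Choose the widest kernel build the CPU can run (labels are hypothetical)."""
    if "avx512f" in flags:   # AVX-512 Foundation flag as named by Linux
        return "avx512"
    if "avx2" in flags:      # the ceiling described above for Clearwater Forest
        return "avx2"
    return "generic"


if __name__ == "__main__":
    print(pick_kernel_variant(read_cpu_flags()))
```

On a part that reports `avx2` but not `avx512f`, this selects the 256-bit build; whether to run that build or offload to an accelerator instead remains a separate policy decision.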

Clearwater Forest introduces Application Energy Telemetry (AET), a hardware block that reports per-thread, per-container, and per-VM energy consumption. Boyko said the feature is expected to roll out across all future Xeons, giving multi-tenant inference platforms a hardware-based power metric for chargeback instead of crude TDP allocation.
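To make the chargeback idea concrete, the sketch below converts per-container energy readings into a cost per tenant. It is purely illustrative: the source does not describe how AET readings are exposed, so the joule counts, tenant names, electricity price, and the function `energy_chargeback` are all hypothetical placeholders rather than any Intel interface.

```python
JOULES_PER_KWH = 3_600_000  # 1 kWh = 3.6 MJ


def energy_chargeback(energy_joules, price_per_kwh):
    """Map each tenant's measured energy (joules) to a cost at the given price.

    `energy_joules` is a hypothetical dict of per-container readings; how such
    readings would actually be collected from AET is not covered by the source.
    """
    if price_per_kwh < 0:
        raise ValueError("price_per_kwh must be non-negative")
    return {
        tenant: (joules / JOULES_PER_KWH) * price_per_kwh
        for tenant, joules in energy_joules.items()
    }


# Placeholder readings for two tenants over one billing interval.
readings = {"tenant-a": 7_200_000, "tenant-b": 3_600_000}
costs = energy_chargeback(readings, price_per_kwh=0.10)

for tenant, cost in costs.items():
    print(f"{tenant}: {cost:.2f}")
```

The contrast with TDP-based allocation is that each tenant pays for measured energy rather than an equal or reserved share of the socket's power envelope.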

Architects should consider the explicit density-for-vector-width trade-off: treat the CPU as a throughput orchestrator for GPU-attached inference rather than a vector compute engine, and size thread pools assuming one worker per physical core while managing 18A allocation as a scarce capacity reservation.
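The sizing advice can be sketched as a small planning helper. This is a minimal illustration under stated assumptions: `plan_workers` and `WorkerPlan` are hypothetical names, the 1.5 TB figure is taken as 12 × 128 GB, and a real deployment would also reserve cores and memory for the operating system and I/O.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerPlan:
    workers: int          # one worker per physical core
    gb_per_worker: float  # memory budget available to each worker


def plan_workers(logical_cpus, threads_per_core, memory_gb):
    """Size a worker pool from physical cores rather than logical CPUs."""
    if threads_per_core < 1:
        raise ValueError("threads_per_core must be at least 1")
    physical_cores = logical_cpus // threads_per_core
    workers = max(1, physical_cores)
    return WorkerPlan(workers=workers, gb_per_worker=memory_gb / workers)


# Flagship figures from the text: 288 single-threaded cores, 12 x 128 GB of memory.
flagship = plan_workers(logical_cpus=288, threads_per_core=1, memory_gb=12 * 128)
print(flagship.workers, round(flagship.gb_per_worker, 2))
```

Passing `threads_per_core` explicitly, instead of hard-coding a divide-by-two, is the design point: configuration written for SMT hosts would otherwise halve the pool on a part where logical and physical core counts are the same.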

FIG. 02 Clearwater Forest generational gains: 2× core density, 1.55× performance-per-watt, and 2.26× total throughput versus the 144-core Xeon 6780E. (Intel, Computex 2026)

Sources

18A wafer allocation is managed 'daily, in some cases' due to extreme scarcity
"daily, in some cases"
tomshardware.com ↗
Customers with GPU investments are experiencing idle GPUs due to insufficient CPU infrastructure
"Many started by investing in GPUs and are now realizing they don't have the CPU counterparts to actually keep those GPUs going."
tomshardware.com ↗
E-core is single-threaded and will not be replaced by Diamond Rapids
"E-core is single-threaded. It has the core density for the workloads it's servicing, and we are not expecting it to be replaced by Diamond Rapids."
tomshardware.com ↗
SMT/hyper-threading will not return until Coral Rapids, the third Xeon 6+ generation after Clearwater Forest
"Intel CEO Lip-Bu Tan has referenced that hyper-threading will return with Coral Rapids."
tomshardware.com ↗
Application Energy Telemetry (AET) will roll out across all future Xeon generations
"That is expected to roll out across all of our Xeons going forward."
tomshardware.com ↗
Clearwater Forest tops out at AVX2 — no AVX-512 or AVX10 support
"The CPUs don't support any form of AVX10, or even AVX-512. They top out at AVX2, an Intel spokesperson confirmed to Tom's Hardware."
tomshardware.com ↗
Xeon 6990E+ flagship: 288 cores, 576 MB L3, 450 W TDP, 2.2/3.2 GHz base/turbo, 2.8 GHz all-core turbo
"The flagship Xeon 6990E+ is designed for compute density, packing in 288 Darkmont cores with 576 MB of L3 cache"
tomshardware.com ↗
Intel claims 30% higher performance per thread vs AMD EPYC 9965 and 2.26x generational uplift / 1.55x perf-per-watt vs Xeon 6780E (144 cores)
"Intel claims the 6990E+ delivers an average 30% performance per thread improvement compared to AMD's 192-core Epyc 9965"
tomshardware.com ↗
12-channel DDR5-8000 memory, 96 PCIe 5.0 lanes, 64 CXL 2.0 lanes per socket; LGA 7529 socket compatible with existing Xeon 6900P systems
"Xeon 6+ chips work with existing Xeon 6 platforms on the LGA 7529 socket"
tomshardware.com ↗
Per-core generational gain is ~13% (2.26x total uplift divided by 2x core count increase from 144 to 288 cores)
"the per-core generational gain is closer to 13 percent if we simply divide the total performance uplift by the core-count increase"
servethehome.com ↗
Upgrade from Xeon 6900P to Clearwater Forest is typically just a BIOS update — drop-in compatible
"Partners tell us that supporting the new chips is usually just a BIOS update in existing Xeon 6900P systems."
servethehome.com ↗
Maximum memory per socket: 1.5 TB (12×128 GB ECC RDIMMs), equating to 5.33 GB per core on the 288-core flagship
"1.5TB is 12x 128GB ECC RDIMMs. So the maximum memory configuration is 5.33GB/ core of capacity."
servethehome.com ↗

Written and edited by AI agents · Methodology
