Qualcomm reveals HBC near-memory AI architecture with 6x bandwidth per watt vs HBM
Qualcomm has introduced HBC (high-bandwidth compute), a near-memory compute architecture designed to address the memory wall in AI workloads. The company disaggregates the AI accelerator from the system-on-chip (SoC) and places it directly beneath an LPDDR DRAM stack, connected by through-silicon vias. Qualcomm claims HBC delivers 6x higher bandwidth per watt than HBM (high-bandwidth memory) and more than 200x the capacity of on-chip SRAM, without requiring expensive advanced packaging or HBM stacks.
According to Qualcomm, the architecture avoids the congestion and cost penalties of high-bandwidth memory by pairing DRAM density with the latency characteristics of SRAM in standard packaging. Multiple HBC stacks can be deployed within a single compute device, which the company says yields significant performance-per-dollar advantages. HBC will debut in the AI250 accelerator, which uses 1st Gen HBC and offers an 18x bandwidth increase over the prior AI200. The AI300 will follow with 2nd Gen HBC, providing 54x bandwidth scaling.
Qualcomm's approach differs from conventional DRAM-on-logic designs by placing specialized compute dies directly beneath stacked LPDDR, avoiding the exotic materials and packaging of HBM solutions. The company's AI200 accelerator is due later this year with 43 TB of RAM per rack using LPDDR5X. The roadmap reflects Qualcomm's effort to diversify beyond mobile processors into the data center.
For infrastructure teams, HBC targets a real constraint: memory bandwidth growth has fallen behind growth in compute capability. If the bandwidth-per-watt and cost claims hold, HBC would be competitive for inference servers where memory is the bottleneck. The multi-generation roadmap signals Qualcomm's commitment to scaling the design while NVIDIA, AMD, and custom-silicon vendors also target memory efficiency in the next wave of accelerators.
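To make the bandwidth bottleneck concrete, here is a minimal back-of-envelope sketch in Python. It assumes a purely memory-bound decode step in which every generated token streams all model weights from memory once, so throughput is capped at bandwidth divided by weight size. The baseline bandwidth and model size are hypothetical placeholders, not Qualcomm figures, and the sketch assumes the 54x figure uses the same AI200 baseline as the 18x figure, which the report does not state.

```python
"""Back-of-envelope model of bandwidth-bound inference throughput."""


def decode_tokens_per_second(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on single-stream decode rate.

    Assumes each generated token requires reading all model weights
    from memory exactly once and that compute is never the limit.
    """
    if bandwidth_gb_s <= 0 or weights_gb <= 0:
        raise ValueError("bandwidth and weight size must be positive")
    return bandwidth_gb_s / weights_gb


def scaled_rates(baseline_gb_s: float, weights_gb: float,
                 multipliers: dict[str, float]) -> dict[str, float]:
    """Apply each bandwidth multiplier to the baseline and return the bound."""
    return {
        label: decode_tokens_per_second(baseline_gb_s * factor, weights_gb)
        for label, factor in multipliers.items()
    }


if __name__ == "__main__":
    # Hypothetical placeholders chosen only for illustration.
    BASELINE_GB_S = 500.0   # assumed baseline memory bandwidth, GB/s
    WEIGHTS_GB = 70.0       # assumed model size, e.g. 70B parameters at 8 bits

    # Multipliers as reported; a shared AI200 baseline is an assumption.
    MULTIPLIERS = {
        "baseline": 1.0,
        "1st Gen HBC (18x)": 18.0,
        "2nd Gen HBC (54x)": 54.0,
    }

    for label, rate in scaled_rates(BASELINE_GB_S, WEIGHTS_GB, MULTIPLIERS).items():
        print(f"{label}: about {rate:,.0f} tokens/s upper bound")
```

In this simplified model the throughput ceiling scales linearly with bandwidth, which is why bandwidth multipliers matter more than peak compute for memory-bound inference. Real systems batch requests and cache attention state, so actual gains would differ.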
Sources
- Primary source: eetimes.com, “Qualcomm data center business expanding with HBC technology roadmap”