OpenAI unveils Jalapeño inference chip with Broadcom, targets late-2026 deployment
OpenAI and Broadcom on Wednesday unveiled Jalapeño, OpenAI's first custom AI accelerator chip designed specifically for large language model inference. The companies claim early internal testing shows substantially better performance per watt than current state-of-the-art systems, though final benchmarks haven't been released. The chip was developed from design to tape-out in nine months, an unusually fast turnaround that OpenAI attributes to using its own models to accelerate parts of the hardware design process.
Jalapeño is a purpose-built ASIC: a large, reticle-sized compute chiplet (~840 mm²) surrounded by six HBM modules and optimized for low-latency, high-throughput inference. Unlike general-purpose GPUs, the architecture is tuned around LLM serving patterns, memory movement, and networking efficiency, balancing compute, memory, and I/O so the chip can operate closer to its theoretical peak utilization. Broadcom handles silicon manufacturing and contributes its Tomahawk networking silicon; Celestica provides board and rack integration.
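Why balancing compute against memory bandwidth matters for utilization can be illustrated with a simple roofline estimate. The sketch below is a generic textbook model, not a description of Jalapeño: no specifications have been published, so the peak-compute and bandwidth figures are hypothetical placeholders, and the function names are illustrative only. It assumes a memory-bound decode step in which each weight is read once and used for one multiply-add per sequence in the batch.

```python
# Roofline sketch: attainable throughput is capped by either peak compute
# or by memory bandwidth times arithmetic intensity (FLOPs per byte moved).
# All hardware numbers below are HYPOTHETICAL, chosen only for illustration.


def attainable_flops(peak_flops: float, mem_bandwidth: float, intensity: float) -> float:
    """Return the roofline bound in FLOP/s for a given arithmetic intensity."""
    return min(peak_flops, mem_bandwidth * intensity)


def decode_intensity(batch_size: int, bytes_per_param: float) -> float:
    """Approximate FLOPs per byte for one decode step.

    Assumes each weight is read once (bytes_per_param bytes) and contributes
    one multiply-add (2 FLOPs) per sequence in the batch.
    """
    return 2.0 * batch_size / bytes_per_param


def utilization(peak_flops: float, mem_bandwidth: float,
                batch_size: int, bytes_per_param: float) -> float:
    """Fraction of peak compute that the roofline bound allows."""
    intensity = decode_intensity(batch_size, bytes_per_param)
    return attainable_flops(peak_flops, mem_bandwidth, intensity) / peak_flops


if __name__ == "__main__":
    PEAK_FLOPS = 1.0e15      # hypothetical: 1 PFLOP/s peak compute
    MEM_BANDWIDTH = 4.0e12   # hypothetical: 4 TB/s aggregate HBM bandwidth
    BYTES_PER_PARAM = 2.0    # 16-bit weights

    for batch in (1, 16, 128, 512):
        u = utilization(PEAK_FLOPS, MEM_BANDWIDTH, batch, BYTES_PER_PARAM)
        print(f"batch={batch:4d}  utilization bound={u:.1%}")
```

Under these placeholder numbers, small batches are limited by memory bandwidth and reach only a small fraction of peak compute, which is the kind of imbalance an inference-specific design would try to reduce.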
Gigawatt-scale deployment is slated to begin in late 2026 through Microsoft and other partners, starting with initial prototype production and ramping up over the following years. OpenAI President Greg Brockman told CNBC that OpenAI "cannot get compute fast enough," underscoring the infrastructure pressure driving the partnership. Broadcom CEO Hock Tan said compute demand from the company's six hyperscaler customers is "simply insatiable" and is expected to remain elevated through 2028.
For AI architects, Jalapeño signals OpenAI's move to own the full stack, from models to inference hardware, to reduce serving costs and latency. This matters because OpenAI controls both the workload and the silicon, enabling tighter hardware-software co-optimization than off-the-shelf GPUs can deliver. The nine-month design cycle and gigawatt-scale plans suggest a credible challenge to NVIDIA's dominance in inference infrastructure, though hard performance numbers are still pending.
Sources
- openai.com (primary source)
“Jalapeño was delivered to OpenAI CEO Sam Altman and President Greg Brockman by Broadcom President and CEO Hock Tan and President Charlie Kawwas, marking an important step in OpenAI's strategy to build the full stack behind its models and products.”
- cnbc.com
“OpenAI President Greg Brockman told CNBC's David Faber on Wednesday that the chips were designed from end to end in nine months with help from the company's own models. Brockman told CNBC that OpenAI 'cannot get compute fast enough,' and Broadcom CEO Hock Tan backed up that take, saying compute demand from the company's six customers is 'simply insatiable.'”
- venturebeat.com
“Jalapeño's engineering timeline set a blistering pace for the semiconductor industry, moving from early schematics to fabrication readiness within a brief nine-month window, when new processor development cycles are typically measured in years.”