AI · Jun 24, 2026 · 6 min read

OpenAI's First Custom Chip: 'Jalapeño,' Built With Broadcom for LLM Inference

OpenAI and Broadcom unveiled Jalapeño on June 24, 2026 — OpenAI's first custom AI accelerator, designed in-house for LLM inference and manufactured by Broadcom. Co-developed from design to tape-out in nine months, it reportedly cuts inference costs ~50% versus typical AI GPUs, with first deployment targeted by end of 2026. Here's what it means for OpenAI's Nvidia reliance.

DevCraftly Team

DevCraftly


OpenAI just moved into silicon. On June 24, 2026, OpenAI and Broadcom unveiled “Jalapeño,” OpenAI’s first custom AI accelerator — an “Intelligence Processor” architected in-house specifically for LLM inference and built by Broadcom. The chip reportedly delivers around 50% cost savings versus typical AI GPUs, and it’s the first of a multi-generation compute platform the two companies are building together.

Fast-moving story. Performance, cost and timeline figures below reflect the June 24, 2026 announcement and company statements. Real-world results depend on deployment at scale — treat the numbers as point-in-time vendor claims until independently benchmarked.

At a glance

Name: Jalapeño, OpenAI’s first “Intelligence Processor”
Partner / manufacturer: Broadcom
Purpose: LLM inference (serving ChatGPT and other models)
Cost claim: ~50% cheaper than typical AI GPUs (per Broadcom CEO Hock Tan)
Design cycle: Concept to tape-out in ~9 months
First deployment: Targeted by end of 2026, expanding in following years
Strategic goal: Reduce reliance on external GPU suppliers (e.g., Nvidia)

What happened

OpenAI and Broadcom jointly announced Jalapeño, described as an accelerator “architected around OpenAI’s vision for the future of LLM inference” — and the first AI accelerator in a multi-generation platform the companies plan to build together. The chips will be manufactured by Broadcom and used by OpenAI for inference: the compute-intensive job of serving models to users in ChatGPT and other applications.

According to Broadcom CEO Hock Tan, the accelerator is showing cost savings of roughly 50% compared with typical AI GPUs. The companies are aiming for initial deployment by the end of 2026, “expanding in the years ahead,” and they tie the rollout to bringing gigawatt-scale data centers online with Microsoft and other partners.

The nine-month sprint

One of the most striking claims is speed. Jalapeño went from initial design to manufacturing tape-out in just nine months — what the companies call one of the fastest ASIC development cycles ever for high-performance advanced semiconductors.

They credit deep software-hardware co-development: OpenAI’s engineering teams shaping the architecture around its own models, Broadcom’s silicon-implementation expertise, and — notably — the use of OpenAI’s own models to accelerate parts of the design and optimization process. In other words, AI helping design the chip that will run AI.

Why inference is the target

OpenAI didn’t aim its first chip at training; it aimed at inference, and that’s the tell. Training is bursty and research-driven, while inference is the relentless, every-request cost of running a product used by hundreds of millions of people. Shaving ~50% off the cost of serving each token adds up enormously at OpenAI’s scale, because the saving applies to every token served.
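To make the scale argument concrete, here is a minimal back-of-envelope sketch in Python. Only the ~50% figure comes from the announcement, and it is a vendor claim. The per-million-token price and the daily token volumes are hypothetical placeholders chosen for illustration, not OpenAI or Broadcom numbers, and the function names are ours.

```python
# Back-of-envelope sketch: how a fractional cut in per-token serving cost
# scales with volume. All inputs except the ~50% claim are hypothetical.

def annual_serving_cost(tokens_per_day: float, cost_per_million_tokens: float) -> float:
    """Yearly cost in dollars of serving a fixed daily token volume."""
    return tokens_per_day / 1_000_000 * cost_per_million_tokens * 365


def annual_savings(tokens_per_day: float,
                   baseline_cost_per_million: float,
                   savings_fraction: float) -> float:
    """Yearly dollars saved if per-token cost drops by savings_fraction."""
    if not 0.0 <= savings_fraction <= 1.0:
        raise ValueError("savings_fraction must be between 0 and 1")
    baseline = annual_serving_cost(tokens_per_day, baseline_cost_per_million)
    reduced = annual_serving_cost(
        tokens_per_day, baseline_cost_per_million * (1.0 - savings_fraction)
    )
    return baseline - reduced


CLAIMED_SAVINGS = 0.5            # the ~50% vendor claim reported above
HYPOTHETICAL_COST_PER_M = 2.0    # placeholder dollars per million tokens, not a real price

for tokens_per_day in (1e9, 1e11, 1e13):   # placeholder volumes, not OpenAI figures
    saved = annual_savings(tokens_per_day, HYPOTHETICAL_COST_PER_M, CLAIMED_SAVINGS)
    print(f"{tokens_per_day:.0e} tokens/day -> ${saved:,.0f} saved per year")
```

In this simple model the saving grows in direct proportion to token volume: the same percentage cut that is negligible for a small service becomes a very large absolute number for a product serving hundreds of millions of people. It ignores everything a real comparison would need, such as chip development cost, utilization, and power.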

By designing the accelerator narrowly around how its own models actually run, OpenAI can strip out general-purpose overhead that a do-everything GPU has to carry — trading flexibility for efficiency on the one workload that dominates its bill.

Why it matters

1. Less dependence on a single supplier. Custom silicon is OpenAI’s clearest move yet to reduce reliance on external hardware providers like Nvidia for its core infrastructure — diversifying supply and gaining leverage on price and roadmap.

2. Pressure on GPU pricing power. A credible, ~50%-cheaper inference chip from one of the largest AI buyers puts the incumbent’s pricing power on notice. Even if OpenAI keeps buying GPUs, having its own option changes the negotiation.

3. A bet on owning the stack. Pairing custom chips with gigawatt-scale data centers signals OpenAI wants to control more of its compute destiny end-to-end — the same vertical-integration playbook the largest hyperscalers have run with their own in-house accelerators.

Bottom line

Jalapeño is OpenAI’s first real step into hardware: an in-house-designed, Broadcom-built inference chip, taped out in a remarkable nine months, claiming ~50% cost savings over typical AI GPUs, with first deployment targeted for end of 2026. The strategic message is louder than the spec sheet — OpenAI wants to own more of its compute and lean less on outside suppliers. The real test comes when these chips run production traffic at scale; until then, the cost and performance claims are the company’s to prove.


Sources: OpenAI and Broadcom announcements and press coverage from June 24, 2026, including CNBC, Bloomberg and The Decoder. Performance and cost figures are company-stated and point-in-time; await independent benchmarks before relying on them.

#openai #broadcom #ai-chips #inference #semiconductors #nvidia #hardware #industry