OpenAI Secures $10B Wafer‑Scale AI Deal with Cerebras

OpenAI has signed a multi‑year agreement with Cerebras Systems to supply more than 750 MW of wafer‑scale compute capacity for its AI models through 2028. The $10 billion contract grants OpenAI access to Cerebras's CS‑3 and upcoming CS‑4 platforms, promising a step change in inference speed and energy efficiency for OpenAI's next‑generation services.

Deal Overview

The partnership provides OpenAI with dedicated access to Cerebras's CS‑3 system, built on the Wafer‑Scale Engine 3 (WSE‑3), and the future CS‑4 platform. Because each processor occupies an entire silicon wafer, the architecture can hold full model weights in on‑chip SRAM, achieving a reported memory bandwidth of 21 petabytes per second and sidestepping the traditional "memory wall" that limits GPU inference.
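To see why that bandwidth figure matters, a back‑of‑envelope calculation helps: in the memory‑bound decode regime, throughput is capped by how fast the weights can be streamed per generated token. The sketch below is illustrative only; the model size (120B parameters), 1 byte per parameter, one full weight read per token, and the ~8 TB/s figure for a typical HBM‑based GPU are assumptions, not numbers from the deal.

```python
# Back-of-envelope ceiling on decode throughput when inference is
# memory-bandwidth-bound (one full weight read per generated token).
# All model/hardware figures below are illustrative assumptions.

PARAMS = 120e9          # assumed parameter count (120B-class model)
BYTES_PER_PARAM = 1.0   # assumed 8-bit weights
WEIGHT_BYTES = PARAMS * BYTES_PER_PARAM

def max_tokens_per_second(bandwidth_bytes_per_s: float) -> float:
    """Upper bound on tokens/s if every token requires streaming
    the full model weights once from memory."""
    return bandwidth_bytes_per_s / WEIGHT_BYTES

wafer_scale = max_tokens_per_second(21e15)  # 21 PB/s on-chip SRAM (as reported)
hbm_gpu = max_tokens_per_second(8e12)       # ~8 TB/s HBM, assumed typical GPU

print(f"wafer-scale ceiling: {wafer_scale:,.0f} tokens/s")
print(f"HBM ceiling:         {hbm_gpu:,.0f} tokens/s")
```

Under these assumptions the on‑chip bandwidth ceiling is orders of magnitude above the HBM ceiling, which is the gap the "memory wall" framing refers to; real systems land well below either bound once compute, batching, and interconnect overheads enter the picture.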

Performance Advantages

Benchmark tests show OpenAI’s latest reasoning model, GPT‑OSS‑120B, reaching 3,045 tokens per second on Cerebras hardware—approximately five times the throughput of leading GPU clusters. The time‑to‑first‑token for complex queries drops to under 300 milliseconds, enabling real‑time, chain‑of‑thought reasoning without multi‑second pauses.
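Putting those two quoted figures together gives a feel for end‑user latency. The sketch below simply combines the reported throughput and time‑to‑first‑token; the 500‑token answer length is an assumption for illustration.

```python
# Rough wall-clock latency from the quoted figures (a sketch, not a benchmark).
TOKENS_PER_SECOND = 3045  # reported decode throughput
TTFT_SECONDS = 0.300      # reported time-to-first-token (upper bound)

def response_time(output_tokens: int) -> float:
    """Approximate total response time: time-to-first-token plus
    per-token decode time at the quoted throughput."""
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND

# An assumed 500-token answer completes in just under half a second.
print(f"{response_time(500):.2f} s")
```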

Why OpenAI Needed New Compute

OpenAI has identified compute shortages as a bottleneck for product rollouts. The company plans to invest heavily in data‑center capacity, and the wafer‑scale solution offers a faster, more energy‑efficient alternative to GPU‑centric designs. With inference roughly five times faster than current GPU offerings, the deal helps OpenAI sustain the rapid rollout of sophisticated, reasoning‑heavy models.

Implications for the AI Ecosystem

  • Diversified hardware supply chain – Reduces reliance on a single type of accelerator.
  • Shift toward specialized inference hardware – Highlights the growing importance of latency and energy efficiency as models scale.
  • Benchmark for future commitments – The 750 MW of compute sets a new standard for multi‑year hardware contracts.
  • Enabler of real‑time AI agents – Sub‑300 ms response times allow developers to embed advanced decision‑making capabilities without sacrificing user experience.

Future Outlook

The OpenAI‑Cerebras agreement runs through 2028, a period during which both companies expect rapid advances in wafer‑scale engineering. Cerebras’s upcoming CS‑4 system is expected to further increase on‑chip memory and bandwidth, while OpenAI plans to leverage the secured compute to power the next generation of models, including the anticipated GPT‑5. As the AI compute race intensifies, wafer‑scale chips may become a cornerstone of high‑performance, cost‑effective inference.
