Microsoft’s second‑generation Maia 200 AI accelerator is now in production, delivering up to three times the inference performance of competing chips. Built on a 3‑nm process with native FP8/FP4 tensor cores, the chip targets large‑scale language models and aims to lower per‑token costs for Azure AI services.
Key Specifications of Maia 200
Maia 200 packs more than 140 billion transistors on a 3‑nm node. It features native FP8 and FP4 tensor cores, a redesigned memory subsystem with 216 GB of HBM3e delivering 7 TB/s bandwidth, and 272 MB on‑chip SRAM. The accelerator sustains over 10 petaFLOPS in 4‑bit precision and more than 5 petaFLOPS in 8‑bit precision while operating within a 750 W TDP.
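These headline figures can be sanity-checked against each other. The ratio of peak compute to memory bandwidth gives the "ridge point" of a simple roofline model: how many operations the chip must perform per byte fetched from HBM before the tensor cores, rather than memory, become the limit. A minimal back-of-the-envelope sketch using only the numbers above:

```python
# Back-of-the-envelope roofline figures from the stated Maia 200 specs.
FP4_PEAK_FLOPS = 10e15   # >10 petaFLOPS at 4-bit precision (stated floor)
FP8_PEAK_FLOPS = 5e15    # >5 petaFLOPS at 8-bit precision (stated floor)
HBM_BANDWIDTH = 7e12     # 7 TB/s of HBM3e bandwidth

# Arithmetic intensity needed to saturate the tensor cores (FLOPs per byte
# moved from HBM); workloads below this ridge point are memory-bound.
fp4_ridge = FP4_PEAK_FLOPS / HBM_BANDWIDTH
fp8_ridge = FP8_PEAK_FLOPS / HBM_BANDWIDTH

print(f"FP4 ridge point: {fp4_ridge:.0f} FLOPs/byte")  # ≈ 1429
print(f"FP8 ridge point: {fp8_ridge:.0f} FLOPs/byte")  # ≈ 714
```

A ridge point above a thousand FLOPs per byte is steep, which is one reason the 272 MB of on-chip SRAM matters: data reused from SRAM never counts against the HBM bandwidth budget.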
Performance Advantage Over Competitors
Microsoft claims Maia 200 provides three times the FP4 performance of Amazon’s comparable Trainium generation and surpasses Google’s comparable TPU generation in FP8 benchmarks. The chip also delivers roughly 30% better performance per dollar than the newest hardware deployed in Microsoft’s own fleet.
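The performance-per-dollar claim maps directly onto per-token pricing: 1.3x the throughput for the same spend means each token costs 1/1.3 of the baseline. A minimal sketch of that arithmetic (the 30% figure is the only input taken from the article):

```python
# Relative per-token cost implied by a 30% performance-per-dollar gain.
perf_per_dollar_gain = 1.30  # "roughly 30% better" (from the article)

# Same spend buys 1.3x the tokens, so each token costs 1/1.3 of baseline.
relative_token_cost = 1 / perf_per_dollar_gain
cost_reduction = 1 - relative_token_cost

print(f"Relative cost per token: {relative_token_cost:.3f}")  # ≈ 0.769
print(f"Per-token cost reduction: {cost_reduction:.1%}")      # ≈ 23.1%
```

Note the asymmetry: a 30% performance-per-dollar gain yields roughly a 23% cost reduction per token, not 30%, because the gain divides the cost rather than subtracting from it.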
Integration Within Microsoft’s AI Stack
Maia 200 serves as the primary inference engine for Microsoft’s AI services, powering models such as OpenAI’s GPT‑5.2, Microsoft 365 Copilot, and internal Superintelligence projects. The accelerator is supported by the Maia SDK, currently in preview, which includes PyTorch integration, a Triton compiler, an optimized kernel library, and a low‑level programming language for fine‑grained control.
Developer Enablement with Maia SDK
The Maia SDK preview offers early access to tools that simplify model porting to the accelerator. By providing native support for popular frameworks, Microsoft aims to attract AI startups and researchers to build Azure‑optimized applications.
Strategic Impact on Azure AI Services
With its high‑bandwidth memory and extensive on‑chip SRAM, Maia 200 addresses data‑movement bottlenecks that limit GPU‑centric systems. The chip’s performance and cost efficiencies could reshape pricing for AI inference on Azure, offering lower per‑token costs and a stronger performance‑per‑dollar proposition for enterprise customers.
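The data-movement bottleneck is easiest to see in autoregressive decoding, where each generated token must stream the model’s weights from memory, so HBM bandwidth caps single-stream tokens per second. A hedged sketch of that ceiling, assuming a hypothetical 200 GB of resident weights (the model size is illustrative and not from the article; only the 7 TB/s figure is):

```python
# Bandwidth-bound ceiling on single-stream decode throughput.
HBM_BANDWIDTH = 7e12   # 7 TB/s (stated Maia 200 spec)
MODEL_BYTES = 200e9    # hypothetical 200 GB of resident weights (assumed)

# In the memory-bound regime each decoded token reads every weight once,
# so the minimum time per token is weights / bandwidth.
seconds_per_token = MODEL_BYTES / HBM_BANDWIDTH
tokens_per_second = 1 / seconds_per_token

print(f"Lower bound per token: {seconds_per_token * 1e3:.1f} ms")  # ≈ 28.6 ms
print(f"Upper bound throughput: {tokens_per_second:.0f} tokens/s") # ≈ 35
```

Batching raises effective throughput by amortizing each weight read across many requests, and on-chip SRAM reuse further loosens the bound, which is exactly the lever the article credits for Maia 200’s cost efficiency.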
Future Rollout and Adoption Outlook
Microsoft plans additional deployments beyond the initial Central US and West US 3 regions, expanding the accelerator’s reach across Azure. Success will depend on software compatibility, developer adoption of the Maia SDK, and the ability to deliver the advertised gains without extensive code rewrites.
