
Meta Launches 4 New AI Chips: MTIA

🔗Read original on Wired AI

💡 Meta's custom AI chips challenge Nvidia's dominance, which matters for infrastructure builders.

⚡ 30-Second TL;DR

What Changed

Meta developed 4 new MTIA processors

Why It Matters

Meta's custom chips could reduce dependency on Nvidia, cutting costs and securing AI supply chains for hyperscalers. This accelerates the trend of big tech designing specialized AI silicon. Practitioners gain potential new hardware options beyond GPUs.

What To Do Next

Monitor Meta's AI blog for MTIA benchmark releases to assess inference performance gains.

Who should care: Developers & AI Engineers

🧠 Deep Insight

Web-grounded analysis with 9 cited sources.

🔑 Enhanced Key Takeaways

  • Next-gen MTIA (v2) uses TSMC's 5nm process, runs at 1.35GHz, and delivers 354 TOPS of dense INT8 GEMM performance (708 TOPS with sparsity) at a 90W TDP[1][2].
  • MTIA v2 features an 8x8 PE grid with 3.5x dense and 7x sparse compute gains over v1, plus tripled local PE storage and doubled on-chip SRAM[1][2].
  • MTIA v2 supports inference only, not training, and platform-level tests show 6x serving throughput improvement over v1 systems[1][2].
  • MTIA-2 enters production on TSMC 3nm with CoWoS-S packaging and Broadcom support, debuting H1 2026; MTIA-3 follows in H2 2026[5].

🛠️ Technical Deep Dive

  • Architecture: 8x8 grid of 64 processing elements (PEs) connected via mesh network, supporting thread/data/instruction/memory-level parallelism[1][4][7].
  • Memory: 384KB local SRAM per PE (3x v1), 256MB on-chip SRAM (2x v1), up to 128GB off-chip LPDDR5. Bandwidth: 1TB/s local per PE, 2.7TB/s on-chip, 204.8GB/s off-chip[1][2].
  • Performance: 354 TOPS INT8 GEMM (708 TOPS with sparsity), 177 TFLOPS FP16/BF16 (354 TFLOPS with sparsity); vector core 11.06 TOPS INT8[1][2].
  • Connectivity: 8x PCIe Gen5 (32GB/s to host); TDP 90W (3.6x v1); die size 421mm² (25.6mm x 16.4mm); TSMC 5nm[1][2].
  • Deployment (MTIA v1): 12 accelerators per Yosemite V3 server, with PCIe switches for inter-accelerator communication[4].
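
A few derived quantities follow directly from the figures in the bullets above. The sketch below is a back-of-the-envelope check, not Meta's methodology: the constant and function names are illustrative, the sparse INT8 figure is treated as a theoretical peak, and the roofline bound assumes a workload limited only by on-chip SRAM bandwidth.

```python
# Figures copied from the bullets above; all names here are illustrative.
PE_GRID = (8, 8)              # processing-element grid
LOCAL_SRAM_KB_PER_PE = 384    # local SRAM per PE
LOCAL_BW_TBPS_PER_PE = 1.0    # local bandwidth per PE, TB/s
ONCHIP_BW_TBPS = 2.7          # on-chip SRAM bandwidth, TB/s
SPARSE_INT8_TOPS = 708        # peak INT8 throughput with sparsity
TDP_W = 90                    # thermal design power, watts
CLOCK_GHZ = 1.35              # clock frequency
DIE_MM = (25.6, 16.4)         # die dimensions, mm

num_pes = PE_GRID[0] * PE_GRID[1]
total_local_sram_mb = num_pes * LOCAL_SRAM_KB_PER_PE / 1024
total_local_bw_tbps = num_pes * LOCAL_BW_TBPS_PER_PE
die_area_mm2 = DIE_MM[0] * DIE_MM[1]
sparse_tops_per_watt = SPARSE_INT8_TOPS / TDP_W
# Peak sparse operations each PE must retire per clock cycle.
ops_per_pe_per_cycle = SPARSE_INT8_TOPS * 1e12 / (num_pes * CLOCK_GHZ * 1e9)


def attainable_tops(ops_per_byte, peak_tops=SPARSE_INT8_TOPS,
                    bw_tbps=ONCHIP_BW_TBPS):
    """Roofline bound: throughput is capped by compute or by bandwidth."""
    return min(peak_tops, ops_per_byte * bw_tbps)


# Arithmetic intensity at which the bound switches from bandwidth to compute.
ridge_ops_per_byte = SPARSE_INT8_TOPS / ONCHIP_BW_TBPS

if __name__ == "__main__":
    print(f"PEs: {num_pes}")
    print(f"Aggregate local SRAM: {total_local_sram_mb:.1f} MB")
    print(f"Aggregate local bandwidth: {total_local_bw_tbps:.1f} TB/s")
    print(f"Die area from dimensions: {die_area_mm2:.2f} mm^2")
    print(f"Sparse INT8 efficiency: {sparse_tops_per_watt:.2f} TOPS/W")
    print(f"Sparse ops per PE per cycle: {ops_per_pe_per_cycle:.0f}")
    print(f"Roofline ridge point: {ridge_ops_per_byte:.1f} ops/byte")
```

The product of the listed die dimensions comes out slightly below the quoted 421mm², which suggests the dimensions are rounded. The ridge point indicates roughly how many operations per byte of on-chip traffic a kernel needs before it stops being bandwidth-bound under these assumptions.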

🔮 Future Implications

AI analysis grounded in cited sources

Meta's MTIA deployment will reach thousands of units globally by end-2026
Reports describe server chassis such as Grand Teton for MTIA v2 (Artemis) and Santa Barbara for v3 (Iris), with deployment breakdowns by U.S. state and country[6].
Nvidia reliance persists despite MTIA advances
Meta continues to invest billions in Nvidia hardware while scaling its in-house inference chips, because MTIA remains inference-only and cannot train models[1][2].
MTIA-2 and MTIA-3 on TSMC 3nm will expand supply-chain roles for Broadcom and GUC
MTIA-2 uses Broadcom's compute and I/O design services with CoWoS-S packaging; MTIA-3 adds GUC for its more complex packaging, since its larger reticle size limits wafer yield[5].

Timeline

2020-01
Designed first-generation MTIA v1 ASIC for internal inference workloads
2023-01
Deployed MTIA v1 in Yosemite V3 servers with 12 accelerators each
2024-04
Announced MTIA v2 with TSMC 5nm, 1.35GHz clock, and sparsity support
2025-03
MTIA-3 taped out on TSMC 3nm with additional SoC and complex I/O
2026-01
MTIA-2 entered production on TSMC 3nm for H1 2026 debut


AI-curated news aggregator. All content rights belong to original publishers.
Original source: Wired AI