
Meta Plans Four New MTIA Generations

Read original on Meta Newsroom

💡 Meta plans four new AI chips in two years, accelerating its custom silicon for AI infrastructure

⚡ 30-Second TL;DR

What Changed

Meta plans four new MTIA generations within two years, keeping its custom silicon central to its AI infrastructure.

Why It Matters

Meta's push strengthens its in-house AI hardware, potentially pressuring Nvidia's dominance and spurring efficiency gains. AI practitioners gain insight into custom silicon trends for large-scale deployments.

What To Do Next

Monitor the Meta engineering blog for MTIA benchmark releases to evaluate the chips against Nvidia GPUs.

Who should care: Developers & AI Engineers

🧠 Deep Insight

Web-grounded analysis with 6 cited sources.

🔑 Enhanced Key Takeaways

  • The first-generation MTIA, announced May 18, 2023, is fabricated on TSMC's 7nm process and runs at 800 MHz, delivering 102.4 TOPS INT8 at a 25W TDP; it targets recommendation-system inference[1][2][3].
  • The next-generation MTIA uses TSMC's 5nm process, clocks at 1.35 GHz with a 90W TDP, and is deployed in rack systems holding up to 72 accelerators, achieving a 3x performance improvement over v1[4].
  • MTIA development began in 2020, and Meta received chips as early as 2021; the first-generation chip features 64 PEs in an 8x8 grid, 128 MB of on-chip SRAM at 800 GB/s, and up to 128 GB of off-chip LPDDR5[1][3].

๐Ÿ› ๏ธ Technical Deep Dive

  • โ€ขMTIA v1: TSMC 7nm, 800 MHz, 102.4 TOPS INT8 / 51.2 TFLOPS FP16, 25W TDP, 128 MB SRAM (800 GB/s), up to 128 GB LPDDR5 (176 GB/s), 8 PCIe 4.0 lanes, 64 PEs in 8x8 mesh[1][2][3].
  • โ€ขArchitecture: 64 Processing Elements (PEs) each with 128 KB local SRAM, supports TLP/DLP/ILP/MLP, mesh network for inter-PE and memory connectivity[3].
  • โ€ขNext-gen (v2): TSMC 5nm, 1.35 GHz, 90W TDP, 1.12B gates, 373 mmยฒ die area, on-chip 128 MB (800 GB/s), off-chip 64 GB LPDDR5 (176 GB/s), deployed in 72-accelerator racks with 6x throughput gain[4].
  • โ€ขDeployment: Yosemite V3 servers with 12 accelerators per server using PCIe switches for inter-accelerator communication bypassing host CPU[3].
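
The headline figures above are easier to compare once turned into ratios. The following is a minimal Python sketch, not anything published by Meta: the `ChipSpec` class and `derived_metrics` function are hypothetical names introduced here, and every result is derived from the peak (not measured) numbers quoted in the list, so they are upper bounds rather than benchmark results. The server and rack power totals count accelerator TDP only and ignore host CPUs, memory, and networking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChipSpec:
    """Peak figures for one accelerator, copied from the list above."""
    name: str
    int8_tops: float        # peak INT8 throughput in tera-ops per second
    tdp_watts: float        # thermal design power
    clock_ghz: float
    num_pes: int            # processing elements
    pe_local_sram_kb: int   # local SRAM per PE
    sram_gb_per_s: float    # on-chip SRAM bandwidth
    dram_gb_per_s: float    # off-chip LPDDR5 bandwidth


def derived_metrics(chip: ChipSpec) -> dict:
    """Back-of-envelope ratios from peak figures (illustrative only)."""
    ops_per_s = chip.int8_tops * 1e12
    return {
        "tops_per_watt": chip.int8_tops / chip.tdp_watts,
        "ops_per_pe_per_cycle": ops_per_s / (chip.num_pes * chip.clock_ghz * 1e9),
        "total_pe_local_sram_mb": chip.num_pes * chip.pe_local_sram_kb / 1024,
        # Ops available per byte moved: higher means a workload needs more
        # data reuse to keep the compute units busy.
        "ops_per_sram_byte": ops_per_s / (chip.sram_gb_per_s * 1e9),
        "ops_per_dram_byte": ops_per_s / (chip.dram_gb_per_s * 1e9),
    }


# MTIA v1 figures as quoted in the list above.
MTIA_V1 = ChipSpec(
    name="MTIA v1",
    int8_tops=102.4,
    tdp_watts=25,
    clock_ghz=0.8,
    num_pes=64,
    pe_local_sram_kb=128,
    sram_gb_per_s=800,
    dram_gb_per_s=176,
)

# Accelerator-only power totals (assumption: every device runs at full TDP).
V1_SERVER_ACCEL_WATTS = 12 * MTIA_V1.tdp_watts   # 12 per Yosemite V3 server
NEXT_GEN_RACK_ACCEL_WATTS = 72 * 90              # 72 per rack at 90W TDP

if __name__ == "__main__":
    for key, value in derived_metrics(MTIA_V1).items():
        print(f"{key}: {value:.1f}")
    print(f"v1 server accelerator TDP: {V1_SERVER_ACCEL_WATTS} W")
    print(f"next-gen rack accelerator TDP: {NEXT_GEN_RACK_ACCEL_WATTS} W")
```

On the quoted v1 numbers this works out to about 4.1 TOPS per watt, and to roughly 128 peak INT8 operations per byte of SRAM bandwidth against about 582 per byte of LPDDR5 bandwidth, which suggests why keeping data on-chip matters for workloads with little data reuse.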

🔮 Future Implications
AI analysis grounded in cited sources

Meta will reduce dependence on Nvidia GPUs for inference by 2028
MTIA is deployed at scale for ads/ranking workloads with efficiency gains over vendor silicon, alongside a training chip ramping up and multiple chips in development[6].
Four new MTIA generations will enable denser AI clusters by 2028
The next-generation chip already supports 72 accelerators per rack at a higher clock speed and power budget to serve a broader range of model sizes, in line with Meta's plans for multiple chips and its 5 GW Hyperion cluster[4][6].

โณ Timeline

2020-01
MTIA development initiated for internal AI workloads
2021-08
First MTIA chips fabricated on TSMC 7nm and received by Meta
2023-05
MTIA v1 publicly announced at AI Infra @ Scale event
2023-06
MTIA v1 paper presented at ISCA conference
2024-01
Next-generation MTIA v2 goes from first silicon to production in under 9 months
2025-09
MTIA confirmed deployed at scale for ads workloads with training chip ramping

