
Nvidia Custom Chip for OpenAI Inference

#inference #chip #gtc #nvidia-ai-processor

💡 Nvidia's Groq-integrated inference chip for OpenAI; GTC launch soon.

⚡ 30-Second TL;DR

What Changed

Nvidia is building a custom chip to serve OpenAI inference requests, based on LPU technology licensed from Groq.

Why It Matters

Improves inference efficiency and deepens the Nvidia-OpenAI partnership; may reshape competition in inference hardware.

What To Do Next

Register now for Nvidia GTC to demo the inference platform.

Who should care: Developers & AI Engineers

🧠 Deep Insight

Web-grounded analysis with 3 cited sources.

🔑 Enhanced Key Takeaways

  • Nvidia's agreement with Groq is valued at $20 billion and includes an acqui-hire of founder Jonathan Ross, formerly of Google's TPU team, along with other key engineers.[1][3]
  • The deal is a non-exclusive license to Groq's LPU inference technology, and Groq continues to operate independently under new CEO Simon Edwards.[2]
  • Groq's LPUs feature a 144-way VLIW tensor-streaming processor with only on-chip SRAM (230 MB per chip), optimized for low-latency single-user inference at batch size one (a capacity sketch follows this list).[1][3]
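
To make the 230 MB figure concrete, here is a minimal back-of-envelope sketch of how many SRAM-only chips it takes just to hold a model's weights. The per-chip capacity comes from the article; the bytes-per-parameter choices are our assumptions.

```python
import math

# 230 MB of on-chip SRAM per LPU, as cited above; no external DDR/HBM to spill to.
SRAM_PER_CHIP_GB = 0.23

def chips_for_weights(params_billions: float, bytes_per_param: float = 1.0) -> int:
    """Minimum chip count so all weights stay resident in on-chip SRAM.

    Ignores KV cache, activations, and scratch space, so real deployments
    need more chips than this floor suggests.
    """
    weights_gb = params_billions * bytes_per_param  # 1e9 params * B/param / 1e9 B/GB
    return math.ceil(weights_gb / SRAM_PER_CHIP_GB)

print(chips_for_weights(70))       # Llama 70B at 8-bit weights: ~305 chips minimum
print(chips_for_weights(70, 2.0))  # at FP16/BF16: ~609 chips
```

A floor of roughly 300 chips at 8-bit weights is consistent with the multi-rack Llama 70B deployment described in the deep dive below.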

🛠️ Technical Deep Dive

  • The Groq LPU uses a 144-way VLIW design fabricated at GlobalFoundries, in contrast to the 8-way VLIW systolic arrays in Google's TPUs; this allows inexpensive scaling but caps each chip at 230 MB of SRAM with no external DDR or HBM.[3]
  • Static scheduling and tensor streaming avoid external-memory bottlenecks by keeping all weights in ultra-fast on-chip SRAM, which excels at sequential, low-latency workloads such as real-time chatbots.[1]
  • Running Llama 70B takes 10 racks and over 100 kW because of the SRAM constraint; a second-generation chip on Samsung 4nm was planned to drive 2025 revenue but has not yet appeared in the market (a back-of-envelope latency comparison follows this list).[3]
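
A rough sketch of why an SRAM-only design wins at batch-size-one decode: each generated token must stream every (sharded) weight byte past the compute units, so memory bandwidth sets a hard floor on per-token latency. The bandwidth figures below are illustrative assumptions, not numbers from the article: roughly 80 TB/s of on-chip SRAM bandwidth per LPU (a vendor-claimed ballpark) versus about 3.35 TB/s of HBM3 on a single high-end GPU.

```python
WEIGHTS_GB = 70.0  # Llama 70B at 1 byte/param (8-bit), as in the sketch above

def floor_ms_per_token(weights_gb: float, bw_tb_per_s: float, shards: int = 1) -> float:
    """Memory-bandwidth floor on decode latency, weights split evenly across shards.

    Ignores compute, interconnect hops, and scheduling overhead, so this is
    a lower bound, not a predicted latency.
    """
    shard_gb = weights_gb / shards
    return shard_gb / (bw_tb_per_s * 1e3) * 1e3  # GB / (GB/s) -> seconds -> ms

print(floor_ms_per_token(WEIGHTS_GB, 3.35))              # one HBM3 GPU: ~20.9 ms/token
print(floor_ms_per_token(WEIGHTS_GB, 80.0, shards=305))  # 305-LPU cluster: ~0.003 ms/token
```

The LPU figure is so low precisely because it is only the memory floor; in practice, interconnect hops across a 10-rack deployment dominate. The comparison nonetheless shows why the architecture targets single-user latency rather than throughput per watt.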

🔮 Future Implications

AI analysis grounded in cited sources.

  • Nvidia will launch a dedicated inference accelerator card by Q3 2026. Podcast analysis details Nvidia's roadmap to release this hardware integrating Groq LPU tech post-deal.[1]
  • Integrating Groq's LPU into the Nvidia Blackwell architecture will enable hybrid GPU-LPU systems. The strategic agreement aims to blend high-throughput GPUs with low-latency LPUs into unified AI compute platforms (a toy routing sketch follows this list).[1]
  • Antitrust scrutiny of Nvidia's AI dominance will intensify. The $20B deal, absorbing Groq talent and technology, raises concerns as Nvidia consolidates inference leadership over rivals like AMD.[1]
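
The "hybrid GPU-LPU" prediction is easiest to picture as a routing decision at the serving layer. The sketch below is entirely hypothetical: the class and pool names are invented for illustration and describe no real Nvidia API, only the trade-off the prediction implies (GPUs amortize weight reads over large batches; LPUs minimize single-request latency).

```python
from dataclasses import dataclass

@dataclass
class InferenceRequest:
    prompt: str
    interactive: bool  # e.g. a live chatbot turn vs. an offline batch job
    max_wait_ms: int   # batching delay the caller will tolerate

def pick_backend(req: InferenceRequest) -> str:
    """Toy routing policy for a hypothetical hybrid GPU-LPU fleet."""
    if req.interactive and req.max_wait_ms < 50:
        return "lpu-pool"  # SRAM-resident, statically scheduled, batch-size-one latency
    return "gpu-pool"      # HBM-backed, batched for throughput

print(pick_backend(InferenceRequest("hi", interactive=True, max_wait_ms=10)))           # lpu-pool
print(pick_backend(InferenceRequest("summarize these docs", False, max_wait_ms=5000)))  # gpu-pool
```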

Timeline

2019
Groq releases first LPU chip with 144-way VLIW architecture.
2024
Groq announces second-gen chip plans on Samsung 4nm for 2025 revenue.
2025-02
Groq secures $1.5B commitment from Saudi Arabia for LPU expansion.
2025-12
Nvidia and Groq announce $20B non-exclusive licensing deal and acqui-hire of key team.


AI-curated news aggregator. All content rights belong to original publishers.
Original source: 36氪