🐯Freshcollected in 25m

Google Cloud AI Surge Crushes Nvidia, Goldman Pivot

Google Cloud AI Surge Crushes Nvidia, Goldman Pivot
PostLinkedIn
🐯Read original on 虎嗅

💡Goldman: Overweight Google Cloud AI vs Nvidia chips—valuation reset underway

⚡ 30-Second TL;DR

What Changed

Alphabet added $421B market cap in one day, second largest US corporate gain.

Why It Matters

Highlights shift from AI chip hype to cloud platforms' real ROI, urging investors to rebalance portfolios amid hyperscaler capex scrutiny.

What To Do Next

Test Google Cloud's Gemini API for AI workloads to leverage its 63% growth and margin expansion.

Who should care:Enterprise & Security Teams

🧠 Deep Insight

AI-generated analysis for this event.

🔑 Enhanced Key Takeaways

  • Google's proprietary TPU v6 'Trillium' chips have achieved a 4.7x improvement in performance-per-watt compared to v5e, significantly lowering the inference cost for Gemini models and driving the reported margin expansion.
  • The surge in cloud backlog is largely attributed to the adoption of 'Vertex AI Agent Builder,' which allows enterprise clients to deploy custom AI agents without managing underlying infrastructure, locking in long-term service contracts.
  • Goldman Sachs' pivot reflects a broader shift in institutional sentiment where 'AI infrastructure' spending is transitioning from pure hardware procurement (GPUs) to software-defined AI platforms (Cloud AI services) that demonstrate immediate operational ROI.
📊 Competitor Analysis▸ Show
FeatureGoogle Cloud (Vertex AI)AWS (Bedrock)Microsoft Azure (OpenAI Service)
Primary ModelGemini 1.5 Pro/FlashClaude 3.5 / TitanGPT-4o
Custom HardwareTPU v6 (Trillium)Trainium2 / Inferentia2Maia 100
Pricing ModelToken-based / ReservedToken-based / ProvisionedToken-based / Reserved
Key AdvantageDeep integration with Google ecosystemLargest breadth of model choiceSeamless M365/Copilot integration

🛠️ Technical Deep Dive

  • Gemini 1.5 Pro Architecture: Utilizes a Mixture-of-Experts (MoE) architecture that dynamically activates only relevant parameters per token, enabling the 160B tokens/min throughput while maintaining a 2-million token context window.
  • TPU v6 (Trillium) Specs: Features 3rd-generation SparseCore technology specifically optimized for embedding-heavy workloads common in large-scale recommendation systems and LLM inference.
  • Infrastructure Optimization: Google implemented 'Multi-Slice' training, allowing the orchestration of thousands of TPUs across different data centers to act as a single unified supercomputer, reducing latency for massive model training.

🔮 Future ImplicationsAI analysis grounded in cited sources

Nvidia's data center revenue growth will decelerate below 20% YoY by Q4 2026.
As hyperscalers shift capital expenditure toward internal custom silicon (TPUs, Maia, Trainium), reliance on merchant silicon for inference workloads is structurally declining.
Google Cloud will achieve a 40% operating margin by the end of 2027.
The combination of proprietary hardware efficiency and high-margin software-as-a-service (SaaS) AI agent tools creates a compounding effect on profitability.

Timeline

2023-12
Google announces Gemini 1.0, marking the start of the unified AI model strategy.
2024-05
Google I/O 2024 introduces the Gemini 1.5 Pro model with a 1-million token context window.
2024-08
Google announces the general availability of TPU v6 'Trillium' for cloud customers.
2025-02
Google Cloud reports first-ever full-year profitability for the cloud division.
2026-04
Google Cloud reports 63% revenue growth, triggering the record-breaking stock rally.
📰

Weekly AI Recap

Read this week's curated digest of top AI events →

👉Related Updates

AI-curated news aggregator. All content rights belong to original publishers.
Original source: 虎嗅