🐯虎嗅•Freshcollected in 25m
Google Cloud AI Surge Crushes Nvidia, Goldman Pivot

💡Goldman: Overweight Google Cloud AI vs Nvidia chips—valuation reset underway
⚡ 30-Second TL;DR
What Changed
Alphabet added $421B market cap in one day, second largest US corporate gain.
Why It Matters
Highlights shift from AI chip hype to cloud platforms' real ROI, urging investors to rebalance portfolios amid hyperscaler capex scrutiny.
What To Do Next
Test Google Cloud's Gemini API for AI workloads to leverage its 63% growth and margin expansion.
Who should care:Enterprise & Security Teams
🧠 Deep Insight
AI-generated analysis for this event.
🔑 Enhanced Key Takeaways
- •Google's proprietary TPU v6 'Trillium' chips have achieved a 4.7x improvement in performance-per-watt compared to v5e, significantly lowering the inference cost for Gemini models and driving the reported margin expansion.
- •The surge in cloud backlog is largely attributed to the adoption of 'Vertex AI Agent Builder,' which allows enterprise clients to deploy custom AI agents without managing underlying infrastructure, locking in long-term service contracts.
- •Goldman Sachs' pivot reflects a broader shift in institutional sentiment where 'AI infrastructure' spending is transitioning from pure hardware procurement (GPUs) to software-defined AI platforms (Cloud AI services) that demonstrate immediate operational ROI.
📊 Competitor Analysis▸ Show
| Feature | Google Cloud (Vertex AI) | AWS (Bedrock) | Microsoft Azure (OpenAI Service) |
|---|---|---|---|
| Primary Model | Gemini 1.5 Pro/Flash | Claude 3.5 / Titan | GPT-4o |
| Custom Hardware | TPU v6 (Trillium) | Trainium2 / Inferentia2 | Maia 100 |
| Pricing Model | Token-based / Reserved | Token-based / Provisioned | Token-based / Reserved |
| Key Advantage | Deep integration with Google ecosystem | Largest breadth of model choice | Seamless M365/Copilot integration |
🛠️ Technical Deep Dive
- Gemini 1.5 Pro Architecture: Utilizes a Mixture-of-Experts (MoE) architecture that dynamically activates only relevant parameters per token, enabling the 160B tokens/min throughput while maintaining a 2-million token context window.
- TPU v6 (Trillium) Specs: Features 3rd-generation SparseCore technology specifically optimized for embedding-heavy workloads common in large-scale recommendation systems and LLM inference.
- Infrastructure Optimization: Google implemented 'Multi-Slice' training, allowing the orchestration of thousands of TPUs across different data centers to act as a single unified supercomputer, reducing latency for massive model training.
🔮 Future ImplicationsAI analysis grounded in cited sources
Nvidia's data center revenue growth will decelerate below 20% YoY by Q4 2026.
As hyperscalers shift capital expenditure toward internal custom silicon (TPUs, Maia, Trainium), reliance on merchant silicon for inference workloads is structurally declining.
Google Cloud will achieve a 40% operating margin by the end of 2027.
The combination of proprietary hardware efficiency and high-margin software-as-a-service (SaaS) AI agent tools creates a compounding effect on profitability.
⏳ Timeline
2023-12
Google announces Gemini 1.0, marking the start of the unified AI model strategy.
2024-05
Google I/O 2024 introduces the Gemini 1.5 Pro model with a 1-million token context window.
2024-08
Google announces the general availability of TPU v6 'Trillium' for cloud customers.
2025-02
Google Cloud reports first-ever full-year profitability for the cloud division.
2026-04
Google Cloud reports 63% revenue growth, triggering the record-breaking stock rally.
📰
Weekly AI Recap
Read this week's curated digest of top AI events →
👉Related Updates
AI-curated news aggregator. All content rights belong to original publishers.
Original source: 虎嗅 ↗



