AI Updates Aggregator

🏠 IT之家 • Feb 21, 2026

Zhipu Apologizes for GLM-5 Rollout Woes


🏠Read original on IT之家

#token-pricing #grayscale-rollout #user-compensation glm-5

💡Zhipu GLM-5 pricing fixes + refunds: optimize your China LLM costs now

⚡ 30-Second TL;DR

What Changed

GLM-5 token costs are 2x (off-peak) to 3x (peak) those of GLM-4.7, owing to the larger model scale, which targets Claude Opus-level performance.
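
To see what these multipliers mean for a usage budget, here is a minimal sketch. It assumes the 2x/3x figures apply as flat multipliers on GLM-4.7-equivalent consumption; the function name, the `is_peak` flag, and the sample workload are hypothetical, since the article does not define peak hours or exact billing rules.

```python
# Multipliers reported in the article, relative to GLM-4.7.
OFF_PEAK_MULTIPLIER = 2
PEAK_MULTIPLIER = 3


def glm5_equivalent_usage(glm47_units: float, is_peak: bool) -> float:
    """Scale a GLM-4.7 usage figure to its GLM-5 equivalent.

    Assumption: the multiplier is applied flatly to the whole request.
    """
    multiplier = PEAK_MULTIPLIER if is_peak else OFF_PEAK_MULTIPLIER
    return glm47_units * multiplier


# Hypothetical workload: 1,000 units off-peak and 500 units at peak.
total = glm5_equivalent_usage(1_000, is_peak=False) + glm5_equivalent_usage(500, is_peak=True)
print(total)  # 3500
```

Under these assumptions, a workload that consumed 1,500 units on GLM-4.7 would consume 3,500 on GLM-5, so shifting work to off-peak hours is the main lever a user controls.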

Why It Matters

The apology addresses user backlash over pricing for one of China's leading LLMs and aims to stabilize trust amid competition. Compensation may help retain users, but the episode highlights the scaling pains of frontier models.

What To Do Next

Check the Zhipu dashboard and apply for the GLM-5 Pro/Lite refund if your usage surged unexpectedly.

Who should care: Developers & AI Engineers

🧠 Deep Insight

Web-grounded analysis with 7 cited sources.

🔑 Enhanced Key Takeaways

• Zhipu AI apologized on February 21, 2026, for problems with the GLM Coding Plan: a lack of transparency, a slow GLM-5 rollout caused by a traffic surge, and a flawed upgrade mechanism for existing users[1][2].
• The GLM-5 rollout is phased: the Max tier is fully open, the Pro tier has peak-hour limits because of high cluster load, and the Lite tier enters grayscale rollout after the holiday. Refunds are offered to affected Lite/Pro users since Jan 1[1][2].
• GLM-5 is roughly 2x the size of GLM-4.7, with 744B total parameters (40B active) in an MoE architecture that uses DeepSeek Sparse Attention. It was trained on 28.5T tokens and targets Claude Opus-level coding and agentic performance[3][5][6].
• Token costs rose 2-3x because of model scale. Dashboard refresh was cut from 1 hour to 10 minutes, usage rules now appear on the purchase page, and a one-click rollback is available for mistaken upgrades made Feb 12-16[1].
• GLM-5 is optimized for domestic chips such as Huawei Ascend and Moore Threads; compute constraints caused serving delays and price hikes amid a 10x traffic increase[2][4][6].
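
The scale figures in these takeaways can be checked with quick arithmetic. This sketch uses only numbers quoted on this page (744B total, 40B active, and the 355B GLM-4.7 figure given in the technical section); the variable names are illustrative.

```python
GLM5_TOTAL_B = 744    # GLM-5 total parameters, in billions
GLM5_ACTIVE_B = 40    # GLM-5 active parameters, in billions
GLM47_TOTAL_B = 355   # GLM-4.7 total parameters, in billions

scale_ratio = GLM5_TOTAL_B / GLM47_TOTAL_B
active_fraction = GLM5_ACTIVE_B / GLM5_TOTAL_B

print(f"GLM-5 / GLM-4.7 size ratio: {scale_ratio:.2f}x")        # 2.10x
print(f"Share of parameters that are active: {active_fraction:.1%}")  # 5.4%
```

So "2x larger" is a slight rounding down, and only about one parameter in nineteen is active at a time, which is the usual trade-off of an MoE design: the full weight set still has to be held in memory even though compute tracks the active subset.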

📊 Competitor Analysis

Model	Parameters	Key Benchmarks	Pricing Notes
Zhipu GLM-5	744B total (40B active, MoE)	Leads open models in coding/agentic tasks; surpasses Gemini 3 Pro, lags Claude Opus	2-3x GLM-4.7 token cost; 30% coding plan price hike [3][5]
DeepSeek (recent)	N/A	Sparse Attention pioneer; 10x context expansion	Efficiency-focused [4]
Anthropic Claude Opus	Proprietary	Top coding benchmark	N/A [3]
Kimi K2.5	N/A	Below GLM-5 on GDPVal-AA	Cheap metering [2][5]

🛠️ Technical Deep Dive

• GLM-5 has 744 billion total parameters, of which 40 billion are active, in a Mixture-of-Experts (MoE) architecture; this is roughly double GLM-4.7's 355B[3][5][6].
• It was trained on 28.5 trillion tokens and adopts DeepSeek Sparse Attention for computational efficiency[3][4].
• It supports deployment on non-NVIDIA chips (Huawei Ascend, Moore Threads, Cambricon, Kunlunxin, MetaX) via kernel optimization and quantization[4][6].
• Serving challenges: MLA (Multi-head Latent Attention) models have a single KV head, so tensor parallelism duplicates the KV cache across ranks and wastes memory; mitigations such as SGLang's DP Attention (DPA) remove the KV redundancy and are reported to raise throughput by 92%[2].
• Positioning shifts from 'vibe coding' to 'agentic engineering' for scaled, AI-automated coding[3].
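
The KV cache point can be illustrated with a back-of-the-envelope model. It assumes, following the serving note above, that a cache with a single KV head cannot be sharded by head under tensor parallelism (TP), so every rank keeps a full copy, whereas under DP attention each rank caches only its own share of requests. The function name and all sizes are hypothetical placeholders, not GLM-5 or SGLang measurements.

```python
def kv_cache_per_gpu_gib(total_cache_gib: float, num_gpus: int, dp_attention: bool) -> float:
    """Estimate the KV cache memory held by each GPU.

    total_cache_gib: size of one non-duplicated copy of the cache for all
    in-flight requests (a hypothetical figure).
    """
    if dp_attention:
        # Each rank owns a disjoint slice of the requests: no duplication.
        # Assumes requests are spread evenly across ranks.
        return total_cache_gib / num_gpus
    # A single KV head cannot be split across ranks: every rank stores it all.
    return total_cache_gib


cache_gib, gpus = 64.0, 8  # hypothetical workload and node size
tp = kv_cache_per_gpu_gib(cache_gib, gpus, dp_attention=False)
dpa = kv_cache_per_gpu_gib(cache_gib, gpus, dp_attention=True)
print(f"TP: {tp} GiB per GPU, {tp * gpus} GiB in total")
print(f"DP attention: {dpa} GiB per GPU, {dpa * gpus} GiB in total")
```

This simplified model ignores load imbalance between ranks and does not derive the +92% throughput figure, which comes from the cited source; it only shows why the duplicated cache grows with the number of TP ranks.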

🔮 Future Implications

AI analysis grounded in cited sources

Zhipu's GLM-5 launch and apology highlight compute bottlenecks in China's AI race and signal a shift toward agentic and coding models amid GPU shortages. The price hikes run counter to the prevailing price wars, while optimization for domestic chips reduces reliance on NVIDIA and could accelerate state-of-the-art open-weight competition with global leaders such as Claude Opus[2][3][4][5].

⏳ Timeline

2025-12

Zhipu releases GLM-4.7, marketed as a coding partner, with subscriptions pivoting to coding plans

2026-02-12

GLM-5 launched alongside 30% coding plan price hike; rollout begins with upgrade issues

2026-02-21

Zhipu issues a public apology for the GLM Coding Plan's lack of transparency, rollout delays, and flawed upgrades; announces refunds and improvements

📎 Sources (7)

Factual claims are grounded in the sources below. Forward-looking analysis is AI-generated interpretation.

🏠Read original article on IT之家


AI-curated news aggregator. All content rights belong to original publishers.
Original source: IT之家 ↗
