🔥 36氪
China's AI Models Hit 4.69T Weekly Tokens
💡 China tops global AI usage for a second consecutive week; a 370x growth forecast signals a market shift.
⚡ 30-Second TL;DR
What Changed
Chinese AI models hit 4.69T weekly tokens on OpenRouter as of Mar 15
Why It Matters
Highlights China's surging AI infrastructure lead, pressuring Western dominance. AI practitioners may need to integrate Chinese models for scale and cost advantages amid explosive growth.
What To Do Next
Check OpenRouter rankings to compare your model's token usage against Chinese leaders.
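To make that comparison programmatic, here is a minimal sketch that ranks models by blended price. It assumes an OpenRouter-style `GET /api/v1/models` response shape with per-token prompt/completion prices as strings; the sample payload and its prices below are illustrative, not quoted from OpenRouter.

```python
# Sketch: rank models cheapest-first by blended cost per 1M tokens from an
# OpenRouter-style /api/v1/models response. Payload shape is an assumption;
# the sample prices are illustrative only.
SAMPLE_RESPONSE = {
    "data": [
        {"id": "deepseek/deepseek-chat",
         "pricing": {"prompt": "0.00000014", "completion": "0.00000028"}},
        {"id": "openai/gpt-4o",
         "pricing": {"prompt": "0.0000025", "completion": "0.00001"}},
        {"id": "qwen/qwen-2.5-72b-instruct",
         "pricing": {"prompt": "0.0000004", "completion": "0.0000004"}},
    ]
}

def cost_per_million(model: dict, prompt_share: float = 0.7) -> float:
    """Blended USD cost per 1M tokens, weighting prompt vs. completion tokens."""
    p = float(model["pricing"]["prompt"])
    c = float(model["pricing"]["completion"])
    return (prompt_share * p + (1 - prompt_share) * c) * 1_000_000

def rank_by_cost(response: dict) -> list[tuple[str, float]]:
    """Cheapest-first list of (model id, blended $/1M tokens)."""
    return sorted(
        ((m["id"], cost_per_million(m)) for m in response["data"]),
        key=lambda pair: pair[1],
    )

if __name__ == "__main__":
    for model_id, usd in rank_by_cost(SAMPLE_RESPONSE):
        print(f"{model_id}: ${usd:.2f}/1M tokens")
```

Swapping `SAMPLE_RESPONSE` for a live fetch of the endpoint (plus your own usage figures) turns this into a quick cost dashboard.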
Who should care: Founders & Product Leaders
🧠 Deep Insight
AI-generated analysis for this event.
🔑 Enhanced Key Takeaways
- The surge in token usage is largely attributed to the aggressive pricing strategies of Chinese AI labs, which have triggered a "price war" by offering significantly lower costs per million tokens than US-based frontier models.
- OpenRouter's data reflects a shift in developer preference toward Chinese models for high-volume, cost-sensitive inference tasks, particularly among developers using the platform's API aggregation services.
- The 4.69 trillion token figure highlights rapid adoption of Chinese models such as DeepSeek and Qwen, which have gained traction in international developer communities thanks to their competitive performance-to-price ratios.
📊 Competitor Analysis
| Feature | Chinese Frontier Models (e.g., DeepSeek/Qwen) | US Frontier Models (e.g., GPT-4o/Claude 3.5) |
|---|---|---|
| Pricing | Highly aggressive; often <$0.50/1M tokens | Premium; typically $5-$15/1M tokens |
| Primary Advantage | Cost-efficiency and high throughput | Reasoning capabilities and ecosystem integration |
| Benchmarks | Competitive on coding/math; catching up on general reasoning | Industry standard for complex reasoning and multimodal tasks |
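At volume, the table's rough price points compound quickly. A back-of-the-envelope sketch (the 1B-tokens/month workload is hypothetical; prices are the table's upper bound for Chinese models and the midpoint of the US $5-$15 range):

```python
# Illustrative monthly cost gap using the table's rough price points.
monthly_tokens = 1_000_000_000  # hypothetical 1B-tokens/month workload
cn_price = 0.50    # $/1M tokens: table's upper bound for Chinese frontier models
us_price = 10.00   # $/1M tokens: midpoint of the table's $5-$15 US range

cn_cost = monthly_tokens / 1_000_000 * cn_price
us_cost = monthly_tokens / 1_000_000 * us_price
print(f"Chinese model: ${cn_cost:,.0f}/mo | US model: ${us_cost:,.0f}/mo "
      f"| ratio: {us_cost / cn_cost:.0f}x")
```

Under these assumptions the gap is roughly 20x per month, which is why cost-sensitive, high-volume inference is where the shift shows up first.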
🛠️ Technical Deep Dive
- Chinese models driving this volume often use Mixture-of-Experts (MoE) architectures to optimize inference costs while maintaining high parameter counts.
- Increased token throughput is supported by advances in hardware utilization, specifically optimized inference kernels for domestic and imported GPU clusters handling high concurrent request volumes.
- The high token count is partly driven by the integration of these models into automated agentic workflows, which generate significantly more tokens per user request than traditional chat interfaces.
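The MoE cost advantage comes from activating only a few experts per token. A minimal top-k routing sketch (illustrative, not any specific model's architecture; dimensions and the router are toy values):

```python
import numpy as np

# Toy top-k MoE routing: a router scores experts per token and only the
# top-k experts run, so active parameters per token stay far below the total.
rng = np.random.default_rng(0)

n_experts, top_k, d_model = 8, 2, 16
router_w = rng.standard_normal((d_model, n_experts))
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route one token vector through its top-k experts, gate-weighted."""
    logits = x @ router_w
    top = np.argsort(logits)[-top_k:]            # indices of the k best experts
    weights = np.exp(logits[top])
    gates = weights / weights.sum()              # softmax over the selected experts
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

token = rng.standard_normal(d_model)
out = moe_forward(token)
print(out.shape, f"active experts: {top_k}/{n_experts}")
```

Here only 2 of 8 expert matrices multiply the token, so compute per token scales with active rather than total parameters, which is the lever behind the aggressive per-token pricing.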
🔮 Future Implications
AI analysis grounded in cited sources
Global AI inference pricing will experience sustained downward pressure.
The success of low-cost Chinese models on platforms like OpenRouter forces US providers to either lower prices or differentiate through proprietary features to maintain market share.
Chinese AI labs will increase focus on international API infrastructure.
To sustain the observed token growth, these companies must invest in global edge computing and latency reduction to serve non-domestic developers effectively.
⏳ Timeline
2024-01
DeepSeek releases DeepSeek-V2, marking a shift toward high-efficiency MoE models.
2025-02
Chinese AI labs initiate significant price cuts on API services to capture developer market share.
2026-03
OpenRouter reports Chinese models surpassing US models in weekly token volume for two consecutive weeks.
AI-curated news aggregator. All content rights belong to original publishers.
Original source: 36氪