📰 SCMP Technology • Fresh • collected 3 minutes ago
China Races to Build 10K-Card AI Clusters

💡 China's 10,000-card clusters slash AI training times, a capability vital for scaling large models.
⚡ 30-Second TL;DR
What Changed
China is building 10,000+ AI accelerator chip clusters as national computing infrastructure.
Why It Matters
This infrastructure push bolsters China's AI competitiveness, potentially offering scalable, cost-effective training resources globally. AI practitioners gain access to massive compute at competitive prices from Chinese providers.
What To Do Next
Assess Huawei Cloud or Alibaba Cloud for 10K-scale AI training availability.
Who should care: Enterprise & Security Teams
🧠 Deep Insight
AI-generated analysis for this event.
📌 Enhanced Key Takeaways
- The push for 10,000-card clusters is largely driven by US export controls on high-end GPUs, which force Chinese firms to optimize interconnect technologies, such as the proprietary high-speed fabrics paired with Huawei's Ascend series, to compensate for lower individual chip performance.
- Local governments in regions such as Beijing, Shanghai, and Shenzhen are providing heavy subsidies and land grants for 'Intelligent Computing Centers' (ICCs), positioning these clusters as public utility infrastructure rather than private assets.
- The primary bottleneck for these massive clusters is not chip count alone but the 'interconnect wall': the latency and bandwidth limitations of domestic networking hardware compared with NVIDIA's NVLink/InfiniBand ecosystem.
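The 'interconnect wall' can be made concrete with a toy calculation: in synchronous data-parallel training, each step ends with an all-reduce of gradients, whose duration is roughly set by payload size and per-link bandwidth. The sketch below uses the standard ring all-reduce traffic formula; the payload size and bandwidth figures are illustrative assumptions, not vendor specifications.

```python
import math

# Toy model of the "interconnect wall": time for one ring all-reduce of
# gradients, which often dominates synchronous data-parallel training steps.
# All numeric figures below are illustrative assumptions, not vendor specs.

def ring_allreduce_seconds(payload_gb: float, n_devices: int, link_gbps: float) -> float:
    """A ring all-reduce moves ~2*(N-1)/N of the payload over each link."""
    traffic_gb = 2 * (n_devices - 1) / n_devices * payload_gb
    return traffic_gb * 8 / link_gbps  # GB -> gigabits, divided by link rate

grads_gb = 140.0   # assumed fp16 gradient payload for a ~70B-parameter model
n = 10_000         # cluster size from the article

for label, gbps in [("assumed domestic fabric, 200 Gb/s", 200.0),
                    ("assumed NVLink-class link, 900 Gb/s", 900.0)]:
    print(f"{label}: {ring_allreduce_seconds(grads_gb, n, gbps):.1f} s per all-reduce")
```

Even with identical chips, the slower fabric stretches every synchronization step by the bandwidth ratio, which is why the takeaways above treat networking, not chip count, as the binding constraint.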
📊 Competitor Analysis
| Feature | Huawei Ascend Cluster | Alibaba PAI Cluster | NVIDIA DGX SuperPOD (US) |
|---|---|---|---|
| Interconnect | Ascend Fabric (Proprietary) | RoCE v2 / Custom | NVLink / InfiniBand |
| Primary Chip | Ascend 910B/C | H800/A800 (Legacy) | H100/B200 |
| Software Stack | CANN / MindSpore | PAI / MaxCompute | CUDA / NCCL |
🛠️ Technical Deep Dive
- Cluster Architecture: A hierarchical leaf-spine topology manages traffic between thousands of nodes, often employing RDMA over Converged Ethernet (RoCE v2) to minimize CPU overhead.
- Interconnect Fabric: Huawei's proprietary HCCS (Huawei Cluster Communication System) provides high-bandwidth, low-latency communication between Ascend chips, aiming to match the performance characteristics of NVLink.
- Memory Management: Distributed memory architectures handle the massive parameter counts of large language models (LLMs), using model parallelism (tensor and pipeline) to spread workloads across the 10,000+ card array.
- Cooling Infrastructure: Operators are transitioning to liquid cooling (cold-plate technology) to manage the thermal density of the high-density racks required for 10,000-card deployments.
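The memory-management point above can be sketched numerically: tensor parallelism splits each layer's weights, pipeline parallelism splits layers across stages, and data parallelism replicates the resulting shard. The parallelism degrees and model size below are illustrative assumptions chosen to multiply out to 10,000 devices, not figures from the article.

```python
# Minimal sketch of spreading an LLM across a 10,000-card cluster via
# tensor (tp) x pipeline (pp) x data (dp) parallelism. All sizes and
# degrees are illustrative assumptions, not reported configurations.

def per_device_params(total_params: float, tp: int, pp: int) -> float:
    """Tensor parallelism splits each layer; pipeline parallelism splits layers.
    Data parallelism replicates the shard, so it does not shrink it further."""
    return total_params / (tp * pp)

total = 175e9            # assumed GPT-3-scale parameter count
tp, pp, dp = 8, 25, 50   # assumed degrees: 8 * 25 * 50 = 10,000 devices
assert tp * pp * dp == 10_000

shard = per_device_params(total, tp, pp)
mem_gb = shard * 2 / 1e9  # fp16: 2 bytes per parameter, weights only
print(f"{shard / 1e6:.0f}M params/device, ~{mem_gb:.2f} GB of fp16 weights")
```

Weights are only part of the footprint; optimizer state and activations typically multiply per-device memory several times over, which is why such clusters combine all three parallelism axes rather than relying on any one.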
🔮 Future Implications (AI analysis grounded in cited sources)
Domestic AI training costs in China will remain 20-30% higher than global averages by 2027.
The inefficiency of domestic interconnect fabrics compared to NVIDIA's ecosystem requires more hardware resources to achieve the same effective training throughput.
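The cost claim above follows from simple arithmetic: if scaling efficiency is lower, more cards are needed to hit the same effective throughput. The target, per-card throughput, and efficiency figures below are illustrative assumptions for the shape of the argument, not measured numbers.

```python
import math

# Back-of-envelope on why weaker interconnects inflate hardware needs:
# matching a target effective throughput at lower scaling efficiency
# requires proportionally more cards. All numbers are assumptions.

def cards_needed(target_pflops: float, per_card_pflops: float, efficiency: float) -> int:
    return math.ceil(target_pflops / (per_card_pflops * efficiency))

target = 2000.0    # assumed effective PFLOPS goal for a training run
per_card = 0.3     # assumed dense fp16 PFLOPS per domestic accelerator

print(cards_needed(target, per_card, 0.50))  # weaker fabric, 50% scaling efficiency
print(cards_needed(target, per_card, 0.75))  # stronger fabric, 75% scaling efficiency
```

Under these assumptions the weaker fabric needs roughly 50% more cards for the same effective throughput, and that hardware overhead flows directly into training cost.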
Huawei will capture over 50% of the domestic AI accelerator market share by end of 2026.
As the primary alternative to restricted Western chips, Huawei's vertical integration of hardware and software is becoming the de facto standard for Chinese state-backed AI infrastructure.
โณ Timeline
2023-08
Huawei releases the Ascend 910B, signaling a viable domestic alternative for large-scale training.
2024-03
Chinese government officially promotes 'Intelligent Computing Centers' as a key pillar of the 'New Quality Productive Forces' policy.
2025-06
Major Chinese tech firms begin mass-scale deployment of 10,000-card clusters to bypass ongoing US chip restrictions.
AI-curated news aggregator. All content rights belong to original publishers.
Original source: SCMP Technology →
