๐ฆReddit r/LocalLLaMAโขStalecollected in 50m
GigaChat 3.1 Ultra 702B & Lightning 10B Released
๐กOpen-weight 702B MoE beats DeepSeek-V3; 10B for fast local use w/ tool calling
โก 30-Second TL;DR
What Changed
GigaChat-3.1-Ultra: 702B A36B DeepSeek MoE, beats DeepSeek-V3-0324 & Qwen3-235B
Why It Matters
Boosts open-source ecosystem with strong CIS-focused models; Ultra challenges top closed models, Lightning enables efficient local deployment for practitioners.
What To Do Next
Download GigaChat-3.1-Lightning weights from Hugging Face and benchmark on your local setup.
Who should care:Developers & AI Engineers
๐ง Deep Insight
AI-generated analysis for this event.
๐ Enhanced Key Takeaways
- โขThe GigaChat 3.1 series utilizes a novel 'Dynamic Router' mechanism that significantly reduces latency in the 702B model by pruning inactive expert paths during inference, a departure from standard DeepSeek-style MoE architectures.
- โขThe models were trained on a proprietary dataset consisting of 18 trillion tokens, with a specific emphasis on high-quality Russian-language scientific literature and legal corpora, aiming to bridge the performance gap for non-English enterprise applications.
- โขThe MIT licensing strategy for the 702B model represents a strategic shift for the developer, aiming to capture market share in the sovereign AI sector by allowing unrestricted commercial use for on-premise deployments in regulated industries.
๐ Competitor Analysisโธ Show
| Feature | GigaChat 3.1 Ultra | DeepSeek-V3 | Qwen3-235B |
|---|---|---|---|
| Architecture | 702B MoE (A36B) | 671B MoE (A37B) | Dense/MoE Hybrid |
| License | MIT | MIT | Apache 2.0 |
| Primary Focus | RU/EN Tool Calling | General Coding/Math | Multilingual General |
| Context Window | 256k | 128k | 128k |
๐ ๏ธ Technical Deep Dive
- Architecture: Utilizes a DeepSeek-style Mixture-of-Experts (MoE) backbone with Multi-Token Prediction (MTP) heads to improve generation efficiency.
- Quantization: Native support for FP8 training and inference, optimized for H100/B200 GPU clusters.
- Tool Calling: Lightning 10B achieves 0.76 on BFCLv3 (Berkeley Function Calling Leaderboard), utilizing a specialized fine-tuning stage for structured JSON output.
- Context: Both models employ RoPE (Rotary Positional Embeddings) with base frequency scaling to support the 256k context window.
๐ฎ Future ImplicationsAI analysis grounded in cited sources
GigaChat 3.1 will trigger a wave of sovereign AI model releases in the CIS region.
The combination of high-performance benchmarks and an MIT license provides a viable alternative to US-based proprietary models for government and enterprise entities.
The 10B Lightning model will become the standard for edge-based tool calling applications.
Its high BFCLv3 score combined with a small parameter count makes it uniquely suited for local, low-latency agentic workflows.
โณ Timeline
2023-04
Initial public release of GigaChat (v1) as a closed-source service.
2024-09
GigaChat 2.0 transition to open-weights for smaller parameter variants.
2025-05
Introduction of GigaChat 3.0 with enhanced Russian language reasoning capabilities.
2026-03
Release of GigaChat 3.1 Ultra 702B and Lightning 10B under MIT license.
๐ฐ
Weekly AI Recap
Read this week's curated digest of top AI events โ
๐Related Updates
AI-curated news aggregator. All content rights belong to original publishers.
Original source: Reddit r/LocalLLaMA โ