MiniMax M2.7 Model Leaked Online

Leak reveals a potential new MiniMax model; grab previews before the official drop (r/LocalLLaMA)
30-Second TL;DR
What Changed
A model labeled MiniMax M2.7 was leaked on the DesignArena platform ahead of any official announcement.
Why It Matters
The leak could preview upcoming MiniMax capabilities, generating excitement among local LLM enthusiasts, and early access might spur community fine-tunes before the official release.
What To Do Next
Check DesignArena for MiniMax M2.7 previews and monitor r/LocalLLaMA for downloads.
Deep Insight
Web-grounded analysis with 8 cited sources.
Enhanced Key Takeaways
- MiniMax M2 is an open-source MoE model with 230B total parameters and 10B active at inference, optimized for coding and agentic workflows.[1][3][6]
- It excels at elite coding, debugging multi-file repositories, agentic toolchains, and handwritten OCR, outperforming many models in community tests; a tool-calling sketch follows this list.[3]
- M2 powers MiniMax Agent, with Lightning Mode for fast tasks and Pro Mode for complex workflows such as research and development, currently offered free.[2]
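To make the agentic-toolchain claim concrete, here is a minimal tool-calling sketch against an OpenAI-compatible endpoint serving M2. The base URL, API key, model ID, and the `run_tests` tool are all illustrative assumptions, not details from the cited sources.

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed: a local OpenAI-compatible server (e.g. vLLM)
    api_key="EMPTY",                      # local servers typically ignore the key
)

# Hypothetical tool for an agentic coding loop; name and schema are illustrative.
tools = [{
    "type": "function",
    "function": {
        "name": "run_tests",
        "description": "Run the project's test suite and return the results.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string", "description": "Directory to test."}},
            "required": ["path"],
        },
    },
}]

response = client.chat.completions.create(
    model="MiniMaxAI/MiniMax-M2",  # assumed model ID; match whatever your server loads
    messages=[{"role": "user", "content": "Fix the failing test in src/utils.py, running tests as needed."}],
    tools=tools,
)
print(response.choices[0].message)
```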
Competitor Analysis
| Feature | MiniMax M2 | Claude Sonnet 4.5 |
|---|---|---|
| Active Parameters | 10B (230B total, MoE) | Not specified[1] |
| Inference Speed | ~100 tok/s claimed; 48.2 tok/s measured[1][5] | ~50 tok/s (about half of M2)[1] |
| Pricing | $0.255 per 1M input tokens, $1.00 per 1M output tokens[6] | Not directly compared[1] |
| Benchmarks | Strong on SWE-Bench, Multi-SWE-Bench, Terminal-Bench, GAIA[6] | Competitive in programming and tool use[2] |
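The pricing row translates into per-request costs with simple arithmetic. A minimal sketch using the rates cited in [6] and a hypothetical request size:

```python
# Worked cost example at the M2 rates cited in the table ([6]):
# $0.255 per 1M input tokens, $1.00 per 1M output tokens.
INPUT_USD_PER_M = 0.255
OUTPUT_USD_PER_M = 1.00

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request at the cited rates."""
    return (input_tokens / 1e6) * INPUT_USD_PER_M + (output_tokens / 1e6) * OUTPUT_USD_PER_M

# Hypothetical agentic coding turn: 50k tokens of repo context in, 4k tokens out.
print(f"${request_cost(50_000, 4_000):.4f}")  # about $0.017 per turn
```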
Technical Deep Dive
- Mixture-of-Experts (MoE) architecture: 230 billion total parameters, 10 billion active per inference pass for efficiency.[1][3][4][6]
- Context length: 200k-205k tokens; maximum output: 128k tokens including chain-of-thought.[4][5][6]
- Inference speed: ~100 tokens/second claimed, 48.2 tok/s measured; supports vLLM/SGLang deployment on consumer hardware (see the serving sketch after this list).[1][5][8]
- Capabilities: polyglot code mastery, function calling, advanced reasoning, and multimodal agent support (text/video/audio/image).[1][2][4]
- Deployment: runs on 4x 96 GB GPUs (~400K-token KV cache) and scales to 8x 144 GB GPUs (~3M-token KV cache).[8]
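As a concrete illustration of the deployment bullet, here is a minimal vLLM offline-inference sketch. The Hugging Face model ID and the parallelism settings are assumptions based on the multi-GPU figures above, not confirmed details; SGLang would work analogously.

```python
# Minimal vLLM offline-inference sketch. Assumptions: the Hugging Face ID
# "MiniMaxAI/MiniMax-M2", a vLLM build that supports the architecture, and a
# 4-GPU tensor-parallel layout mirroring the 4x 96 GB figure cited above.
from vllm import LLM, SamplingParams

llm = LLM(
    model="MiniMaxAI/MiniMax-M2",  # assumed repo name
    tensor_parallel_size=4,        # shard the 230B-total MoE across 4 GPUs
    max_model_len=200_000,         # cap at the cited ~200k context window
    trust_remote_code=True,        # custom architectures often require this
)

params = SamplingParams(temperature=0.7, max_tokens=512)
outputs = llm.generate(
    ["Write a Python function that merges two sorted lists."], params
)
print(outputs[0].outputs[0].text)
```

Served the same way behind an HTTP endpoint, this model would also answer the OpenAI-compatible tool-calling request sketched earlier.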
Sources (8)
Factual claims are grounded in the sources below. Forward-looking analysis is AI-generated interpretation.
AI-curated news aggregator. All content rights belong to original publishers.
Original source: Reddit r/LocalLLaMA