Step 3.5 Flash is a 196B-parameter mixture-of-experts (MoE) model with 11B active parameters, built for agentic tasks. It is optimized with sliding-window attention and multi-token prediction (MTP-3) for low-latency inference, and it matches frontier models on math, code, and agent benchmarks.
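To give a rough sense of the sparsity implied by the headline figures, the arithmetic below is illustrative only (the parameter counts come from the summary above; the framing of per-token cost is an assumption about how MoE inference works generally, not an official Step 3.5 Flash figure):

```python
# Illustrative MoE sparsity arithmetic for the figures quoted above.
# In an MoE model, only the routed experts' weights are activated per token,
# which is what keeps per-token inference cost far below the total size.

total_params = 196e9   # total parameters across all experts
active_params = 11e9   # parameters activated per token

active_fraction = active_params / total_params
print(f"active fraction: {active_fraction:.1%}")  # roughly 5.6% of weights per token
```

So each token touches only about one-eighteenth of the model's weights, which is the main lever behind the low-latency claim.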
Key Points
- Scalable RL with verifiable signals and preferences
- State-of-the-art results on IMO, LiveCodeBench, and tau2-Bench
- Ideal for industrial agent deployment
Impact Analysis
Step 3.5 Flash redefines the efficiency frontier for deploying advanced agents, showing that high-performance open models can rival GPT-5.2 and Gemini 3.0.
Technical Details
The architecture interleaves sliding-window and full attention layers at a 3:1 ratio and adds multi-token prediction. Training uses a stable off-policy RL recipe.
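The 3:1 interleaving above can be sketched as a per-layer schedule. This is a minimal illustration, assuming (as the headline suggests) that the cheaper sliding-window layers make up the 3 in the ratio; the function name and layer count are hypothetical, not from any published spec:

```python
# Hypothetical sketch of a 3:1 interleaved attention schedule: three
# sliding-window layers for every full-attention layer. The layer count
# below is illustrative, not the model's actual depth.

def attention_schedule(num_layers: int, ratio: int = 3) -> list[str]:
    """Return per-layer attention types: `ratio` sliding-window layers per full layer."""
    pattern = []
    for i in range(num_layers):
        # every (ratio + 1)-th layer uses full attention; the rest use a sliding window
        pattern.append("full" if i % (ratio + 1) == ratio else "sliding_window")
    return pattern

schedule = attention_schedule(8)
print(schedule)
# Sliding-window layers bound KV-cache growth and attention cost per token,
# while the periodic full-attention layers preserve long-range information flow.
```

The design trade-off: pushing the ratio higher cuts memory and latency further, but leaves fewer layers able to attend across the whole context.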