🧠Stalecollected in 17m

Alibaba Open-Sources 3 Qwen3.5 Models for Consumer GPUs

Alibaba Open-Sources 3 Qwen3.5 Models for Consumer GPUs
PostLinkedIn
🧠Read original on 机器之心

💡New open-weight Qwen3.5 beats GPT-5 mini, runs on consumer GPUs—deploy locally now

⚡ 30-Second TL;DR

What Changed

Open-sourced Qwen3.5-35B-A3B, Qwen3.5-122B-A10B, Qwen3.5-27B models

Why It Matters

Enables affordable local deployment of top-tier LLMs, lowering barriers for developers and boosting open-source competition against closed models. Alibaba Cloud's cheap API accelerates enterprise adoption.

What To Do Next

Download Qwen3.5-27B from Hugging Face and test on your consumer GPU for agent tasks.

Who should care:Developers & AI Engineers

🧠 Deep Insight

Web-grounded analysis with 9 cited sources.

🔑 Enhanced Key Takeaways

  • Qwen3.5-397B-A17B achieves 19x faster decoding on long-context tasks (256k tokens) compared to Qwen3-Max while matching its reasoning and coding performance[7].
  • Qwen3.5 excels in agentic terminal coding with 52.5 on Terminal-Bench 2.0, surpassing Qwen3-Max (22.5) and approaching Gemini 3 Pro (54.2)[7].
  • In document recognition, Qwen3.5 scores 90.8 on OmniDocBench v1.5, outperforming GPT-5.2 (85.7) and Claude Opus 4.5 (87.7)[7].
📊 Competitor Analysis▸ Show
FeatureQwen3.5-Flash/Qwen3 VLGPT-5 Mini
Context Window262k tokens400k tokens[1][2][3]
Input Cost0.2 yuan (~$0.028)/M tokens (Flash); cheaper overall~$0.25/M tokens[2][5][6]
Output CostNot specified; generally lower~$2/M tokens[2][5][6]
Coding Benchmarks16.5 (Qwen3 VL); strong in Terminal-Bench35.3; leads in LiveCodeBench (83.8)[1][7]
Speed (tok/s)44.4 (Qwen3 VL)130.7[1]
Intelligence Index20.6 (Qwen3 VL)41.0[1]

🔮 Future ImplicationsAI analysis grounded in cited sources

Qwen3.5 series will capture >20% more open-source agentic AI deployments by mid-2026
Its superior Terminal-Bench performance and consumer GPU compatibility lower barriers for developers building autonomous agents compared to proprietary rivals[7].
Alibaba Cloud API pricing undercuts OpenAI by 80%+ on input tokens
At 0.2 yuan per million (~$0.028 USD), Qwen3.5-Flash offers massive cost savings over GPT-5 Mini's $0.25, driving high-volume enterprise adoption[1][2].

Timeline

2025-07
Qwen3 Coder 480B A35B released as specialized coding model
2025-08
OpenAI launches GPT-5 Mini with 400k context window
2025-12
Qwen3.5 series announced with major speed and benchmark gains
2026-02
Alibaba open-sources Qwen3.5-35B-A3B, 122B-A10B, and 27B for consumer GPUs
📰

Weekly AI Recap

Read this week's curated digest of top AI events →

👉Related Updates

AI-curated news aggregator. All content rights belong to original publishers.
Original source: 机器之心