
Open-Source 30B Model Rivals Gemini & Claude

⚛️Read original on 量子位

💡 An open 30B model is reported to rival Gemini and Claude using a hypothesis-verification reasoning loop, and it is free to download and benchmark.

⚡ 30-Second TL;DR

What Changed

An open-source 30B model from Research AI has been launched.

Why It Matters

Democratizes high-performance reasoning models, enabling researchers to rival proprietary giants without high costs.

What To Do Next

Download the 30B model and test its hypothesis-verification loop on arXiv reasoning benchmarks.

Who should care: Researchers & Academics

🧠 Deep Insight

Web-grounded analysis with 10 cited sources.

🔑 Enhanced Key Takeaways

  • Qwen3 30B A3B, a likely candidate for the model, was released on April 28, 2025, by the Qwen team with a 41K token context window and very low pricing of $0.08/M input and $0.28/M output tokens.[5]
  • DeepSeek V3 is highlighted as a leading open-source model in 2026 comparisons for its performance-to-price ratio, free/low-cost API access, and strengths in code and reasoning tasks.[1][6]
  • No specific 30B model from 'Research AI' with a hypothesis-evidence-verification loop appears in major 2026 AI model leaderboards or comparisons, which suggests limited adoption or recognition beyond the initial announcement.[1][2][4]

📊 Competitor Analysis

| Model | Parameters | Context Window | Pricing (input / output, USD per M tokens) | Key Benchmarks |
| Qwen3 30B A3B | 30B | 41K tokens | $0.08 / $0.28 | Advanced reasoning, structured data generation[5] |
| Claude Sonnet 4.6 | Undisclosed | 1M tokens | $3.00 / $15.00 | Leads GDPval-AA (1,606 Elo), ARC-AGI-2 68.8%[4][5] |
| Gemini 3.1 Pro | Undisclosed | Undisclosed | $2.00 / $12.00 | 77.1% ARC-AGI-2, 94.3% GPQA Diamond[4] |
| DeepSeek V3 | Undisclosed | 128K tokens | Free/low-cost API | Strong in code, reasoning[1][6] |

🔮 Future Implications

AI analysis grounded in cited sources

Open-source 30B models like Qwen3 will capture >20% more developer market share by end-2026
Their drastically lower costs (37.5x cheaper input than Claude Sonnet 4.6) enable widespread custom deployments in coding and agentic tasks.[5]
Hypothesis-evidence-verification loops in open models will standardize in scientific benchmarks by Q1 2027
Emerging open-source reasoning models already outperform closed ones like Gemini 2.5 Flash on math tasks such as AIME 2025.[9]
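
The 37.5x input-price figure cited above can be checked directly from the per-million-token prices in the competitor comparison. The following is a minimal Python sketch, not part of the original analysis. The helper names `price_ratio` and `workload_cost` and the example workload of 2M input and 0.5M output tokens are illustrative assumptions, and the prices are copied from the comparison as listed and may have changed since.

```python
# Prices in USD per million tokens (input, output), copied from the
# competitor comparison. Treat them as a snapshot, not as live pricing.
PRICES = {
    "Qwen3 30B A3B": (0.08, 0.28),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Gemini 3.1 Pro": (2.00, 12.00),
}


def price_ratio(expensive: str, cheap: str) -> tuple[float, float]:
    """Return (input_ratio, output_ratio): how many times pricier `expensive` is."""
    exp_in, exp_out = PRICES[expensive]
    cheap_in, cheap_out = PRICES[cheap]
    return exp_in / cheap_in, exp_out / cheap_out


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload at the listed per-million-token prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    in_ratio, out_ratio = price_ratio("Claude Sonnet 4.6", "Qwen3 30B A3B")
    print(f"input ratio: {in_ratio:.1f}x, output ratio: {out_ratio:.1f}x")

    # Hypothetical workload: 2M input tokens and 0.5M output tokens.
    for model in PRICES:
        print(f"{model}: ${workload_cost(model, 2_000_000, 500_000):.2f}")
```

The input ratio works out to 3.00 / 0.08 = 37.5, matching the figure in the text. The output ratio is larger (15.00 / 0.28, roughly 53.6), so the gap widens for output-heavy workloads.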

Timeline

2025-04
Qwen releases Qwen3 30B A3B, an open-source 30B model with advanced reasoning capabilities.[5]
2026-02
Anthropic launches Claude Sonnet 4.6 and Opus 4.6, setting new benchmarks in reasoning and code.[4][5]
2026-02
Google releases Gemini 3.1 Pro, leading 13/16 benchmarks including ARC-AGI-2 at 77.1%.[4]

Original source: 量子位