๐Ÿฆ™Stalecollected in 42m

Unsloth Fixes Qwen3.5-35B Tool Calling

PostLinkedIn
๐Ÿฆ™Read original on Reddit r/LocalLLaMA

๐Ÿ’กFixed Qwen3.5-35B crushes research tasks locally, beats cloud giants

โšก 30-Second TL;DR

What Changed

Unsloth releases fixed GGUF quants resolving tool calling bugs

Why It Matters

Empowers local AI practitioners with a high-parameter model for advanced research without cloud dependency. Boosts open-source LLM viability for complex tool-using workflows.

What To Do Next

Download unsloth/Qwen3.5-35B-A3B-GGUF from Hugging Face and test tool calling via llama.cpp.

Who should care:Researchers & Academics

๐Ÿง  Deep Insight

Web-grounded analysis with 6 cited sources.

๐Ÿ”‘ Enhanced Key Takeaways

  • โ€ขUnsloth's Dynamic quants for Qwen3.5-35B achieve state-of-the-art (SOTA) performance on nearly all bit widths, validated by over 150 KL Divergence benchmarks across 9TB of GGUFs[1][2].
  • โ€ขThe tool-calling fix addresses a chat template bug universally, impacting all Qwen3.5 formats and uploaders, not limited to Unsloth's GGUF quants[1][2].
  • โ€ขUnsloth is retiring MXFP4 layers from specific Qwen3.5 GGUFs (Q2_K_XL, Q3_K_XL, Q4_K_XL) due to benchmark findings[1].

๐Ÿ› ๏ธ Technical Deep Dive

  • โ€ขUnsloth Dynamic 2.0 GGUFs for Qwen3.5-35B show 99.9% KL Divergence on the Pareto Frontier, outperforming alternatives in accuracy and size (e.g., dynamic 4bit version is 2GB smaller with +1% accuracy vs QAT)[2][3].
  • โ€ขOptimal quantization targets ffn_up_exps, ffn_gate_exps at 3bit (e.g., iq3_xxs) for best disk space and KLD balance; avoid heavy quantization on ssm_out due to high KLD increase[2].
  • โ€ขQwen3.5-35B-A3B recommended over 27B for faster inference when hardware fits, while 27B prioritizes slight accuracy gains[1].

๐Ÿ”ฎ Future ImplicationsAI analysis grounded in cited sources

Unsloth Dynamic quants will dominate low-bit local inference for Qwen3.5 models.
Benchmarks demonstrate SOTA KL Divergence across bits with uploaded 9TB artifacts confirming superior accuracy-efficiency tradeoffs[2].
Tool-calling reliability will improve across all Qwen3.5 deployments.
Universal chat template fix applies to any format or uploader, resolving a widespread bug reported in community tooling[1][2].

โณ Timeline

2026-02
Qwen3.5 model series released by Alibaba
2026-02-27
Unsloth Dynamic 2.0 GGUFs updated with Qwen3.5 support and initial tool-calling fixes
2026-03-03
Unsloth releases updated Qwen3.5-35B-A3B GGUFs with finalized tool-calling fixes and SOTA benchmarks
๐Ÿ“ฐ

Weekly AI Recap

Read this week's curated digest of top AI events โ†’

๐Ÿ‘‰Related Updates

AI-curated news aggregator. All content rights belong to original publishers.
Original source: Reddit r/LocalLLaMA โ†—