New Bartowski Quants for Qwen3.5 27B

💡 A reported 450 t/s of prompt processing on an RTX 5060 Ti makes Qwen3.5 27B viable for local coding and debugging

⚡ 30-Second TL;DR

What Changed

Bartowski released new Imatrix quants for Qwen3.5 27B

Why It Matters

Enhances local deployment of large models on consumer NVIDIA GPUs, enabling faster inference for coding tasks without cloud dependency.

What To Do Next

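Download Bartowski's IQ2_M quant from Hugging Face and benchmark it on your own NVIDIA GPU. Note that the 450 t/s figure was reported on an RTX 5060 Ti, so results on other cards will differ.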
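One way to run the benchmark step is llama.cpp's `llama-bench` tool. The sketch below only assembles the command line. It assumes `llama-bench` is on your PATH and that the quant has already been downloaded. The file name used in the usage note is hypothetical, so substitute the actual name from the Hugging Face repository.

```python
import shlex


def build_llama_bench_cmd(model_path, n_prompt=512, n_gen=128, n_gpu_layers=99):
    """Assemble a llama-bench invocation as an argument list.

    -m   path to the GGUF file
    -p   prompt length in tokens (the "pp" test)
    -n   number of generated tokens (the "tg" test)
    -ngl number of layers to offload to the GPU
    """
    return [
        "llama-bench",
        "-m", str(model_path),
        "-p", str(n_prompt),
        "-n", str(n_gen),
        "-ngl", str(n_gpu_layers),
    ]


def as_shell(cmd):
    """Render the argument list as a copy-pasteable shell command."""
    return shlex.join(cmd)
```

Calling `as_shell(build_llama_bench_cmd("Qwen3.5-27B-IQ2_M.gguf"))` (hypothetical file name) yields a command you can paste into a terminal, or you can pass the list directly to `subprocess.run`.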

Who should care: Developers & AI Engineers

Web-grounded analysis with 7 cited sources.

• Qwen3.5-27B is a native vision-language model supporting text, image, and video inputs with a 262,144-token context length[1][2][3].
• Alibaba's Qwen team released the model on February 24, 2026, alongside the Qwen3.5-122B-A10B and Qwen3.5-35B-A3B variants[5].
• Hosted API pricing is $0.30 per 1M input tokens and $2.40 per 1M output tokens[2][3].
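To weigh local inference against the hosted API, the listed prices can be turned into a per-session cost. This is a minimal sketch: the function name is illustrative, and it assumes simple per-token billing with no caching discounts or minimum charges.

```python
from decimal import Decimal

# Prices quoted in the list above, in USD per 1M tokens.
INPUT_USD_PER_MILLION = Decimal("0.30")
OUTPUT_USD_PER_MILLION = Decimal("2.40")


def hosted_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Cost of one workload at the listed hosted rates (assumes flat per-token billing)."""
    million = Decimal(1_000_000)
    return (
        Decimal(input_tokens) * INPUT_USD_PER_MILLION
        + Decimal(output_tokens) * OUTPUT_USD_PER_MILLION
    ) / million
```

For example, a coding session with 200,000 input tokens and 20,000 output tokens comes to $0.06 + $0.048 = $0.108 at these rates.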

• It is a dense, native vision-language model whose linear attention mechanism targets fast response times and a balance of inference speed and performance[1][2][3].
• Its overall capabilities are comparable to those of the larger Qwen3.5-122B-A10B model[1][2][3].
• It uses the Qwen3 tokenizer, has 27 billion parameters, and accepts text, image, and video input with text output[2].
• Benchmark scores include GPQA 85.8%, HLE 22.2%, SciCode 39.5%, LCR 67.3%, IFBench 75.6%, Tau2 93.9%, TerminalBench Hard 32.6%[3].
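The 27-billion-parameter count listed above gives a rough idea of how large a quantized weight file will be. The sketch below is a back-of-the-envelope estimate only. The bits-per-weight value is an input you must supply, and the 2.7 figure in the usage note is an assumption based on llama.cpp's nominal description of IQ2_M. Real GGUF files differ because some tensors are stored at higher precision, and the estimate ignores the KV cache and any separate vision components.

```python
def quantized_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB.

    n_params        -- parameter count (27e9 for this model)
    bits_per_weight -- average bits per weight of the chosen quant (an assumption)
    """
    if n_params < 0 or bits_per_weight <= 0:
        raise ValueError("n_params must be >= 0 and bits_per_weight must be > 0")
    total_bytes = n_params * bits_per_weight / 8  # bits -> bytes
    return total_bytes / 2**30                    # bytes -> GiB
```

With `quantized_weight_gib(27e9, 2.7)` the estimate is about 8.5 GiB, which you can compare against your GPU's VRAM while leaving headroom for context.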

Quantized Qwen3.5-27B will expand local deployment on consumer GPUs

Bartowski's Imatrix quants enable high-speed inference (a reported 450 t/s of prompt processing, or "pp") on an RTX 5060 Ti, making advanced vision-language capabilities accessible offline.
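To put the prompt-processing figure in practical terms, it can be converted into the wait before the first generated token. This sketch assumes the rate stays constant across the whole prompt, which is optimistic because throughput usually drops as the context grows. The function name is illustrative.

```python
def prefill_seconds(prompt_tokens: int, pp_tokens_per_second: float = 450.0) -> float:
    """Seconds spent processing the prompt at a constant pp rate (simplifying assumption)."""
    if pp_tokens_per_second <= 0:
        raise ValueError("pp_tokens_per_second must be positive")
    return prompt_tokens / pp_tokens_per_second
```

At 450 t/s, a 4,500-token prompt takes 10 seconds, while filling the full 262,144-token context would take roughly 583 seconds, which is close to 10 minutes.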

Qwen3.5 series will gain traction in agentic and reasoning tasks

Strong benchmarks in GPQA, IFBench, and tool-use support position it as a competitive open-weight model for practical applications.

2026-02

Qwen3.5-27B released by Alibaba Qwen team

2026-02-24

Official release of Qwen3.5-27B alongside 122B-A10B and 35B-A3B

2026-02-25

Qwen3.5-27B added to model directories like Writingmate

2026-02-27

Performance reviews and benchmarks published

2026-03-02

Bartowski releases Imatrix quantized versions for Qwen3.5-27B

Factual claims are grounded in the sources below. Forward-looking analysis is AI-generated interpretation.
