Strix Halo Benchmarks New LLMs

🦙Read original on Reddit r/LocalLLaMA

#llama-cpp #rocm-benchmarks #edge-inference #strix-halo

💡Strix Halo speeds for MiniMax M2.5 and Qwen3-Coder-Next: pick the top quants for edge AI

⚡ 30-Second TL;DR

What Changed

New llama.cpp benchmarks on Strix Halo cover quantized builds of recent models, including MiniMax M2.5 and Step 3.5 Flash.

Why It Matters

Helps AI builders select optimal low-memory models for Strix Halo edge inference. Demonstrates recent model improvements on AMD hardware.

What To Do Next

Check the llama.cpp benchmarks for Strix Halo on GitHub to pick the best Qwen3-Coder-Next quant.
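One possible way to turn a benchmark table into a quant choice is sketched below. It assumes a simple rule that is not taken from the original post: prefer the largest quant (as a rough proxy for quality) that fits a memory budget and meets a minimum generation speed. The `QuantResult` type, the `pick_quant` helper, and every number in `example` are hypothetical placeholders, not measured results.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QuantResult:
    name: str         # quant label, e.g. "Q4_K_M"
    size_gb: float    # size of the quantized model file in GB
    tg_tok_s: float   # token-generation speed in tokens per second


def pick_quant(
    results: List[QuantResult],
    memory_budget_gb: float,
    min_tok_s: float = 0.0,
) -> Optional[QuantResult]:
    """Return the largest quant that fits the budget and meets the speed floor."""
    candidates = [
        r for r in results
        if r.size_gb <= memory_budget_gb and r.tg_tok_s >= min_tok_s
    ]
    if not candidates:
        return None
    # Larger file size is used here as a stand-in for higher fidelity.
    return max(candidates, key=lambda r: r.size_gb)


# Placeholder values for illustration only; substitute real benchmark rows.
example = [
    QuantResult("Q8_0", 85.0, 20.0),
    QuantResult("Q6_K", 66.0, 25.0),
    QuantResult("Q4_K_M", 48.0, 32.0),
]

choice = pick_quant(example, memory_budget_gb=70.0, min_tok_s=22.0)
print(choice.name if choice else "no quant fits")
```

Swapping the `max` key for `tg_tok_s` would favor speed over fidelity instead; which trade-off is right depends on the workload.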

Who should care: Developers & AI Engineers

🧠 Deep Insight

Web-grounded analysis with 7 cited sources.

🔑 Enhanced Key Takeaways

•Strix Halo, or Ryzen AI Max+ 395, features 16 Zen 5 CPU cores, 32 threads, Radeon 8060S iGPU with 40 RDNA 3+ Compute Units, and up to 120 TOPS total AI performance, enabling strong local LLM inference with 128GB RAM support[1][3][4].
•Benchmarks posted on Reddit show llama.cpp running quantized LLMs such as MiniMax M2.5, Step 3.5 Flash, Qwen3-Coder-Next, GLM 4.6V/4.7 Flash, and GPT-OSS-120B at 30k context on a Ryzen AI Max+ 395 at 70W using ROCm 7.2[article].
•The processor excels in multi-threaded tasks, outperforming NVIDIA DGX Spark's Arm CPU by 11% in Geekbench 6 multi-threaded scores, suitable for AI and general computing[1].
•Configurable TDPs from 45W to 120W (e.g., Balanced ~85W, Max ~120W, Quiet ~55W) make it versatile for compact workstations like Corsair AI Workstation 300 and mini PCs[1][4].
•Strix Halo debuted in early 2025 and is popular in mini PCs, high-end laptops, and gaming handhelds because its iGPU rivals the RTX 4060 Laptop GPU in performance[1][2][3][4].
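The combination of 128GB RAM, quantized weights, and a 30k context mentioned above comes down to a memory budget. The following is a rough back-of-the-envelope sketch, not a method from the cited sources: it estimates weight size as parameters times bits per weight, and KV-cache size with the standard formula for full attention at 16-bit precision. The model dimensions and the 96 GB budget in the example are hypothetical. The estimate ignores compute buffers and runtime overhead, and models with sliding-window or hybrid attention need less cache than this suggests.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    # params_billion * 1e9 params * bits / 8 bytes, divided by 1e9 bytes per GB
    return params_billion * bits_per_weight / 8.0


def kv_cache_gb(
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    context_tokens: int,
    bytes_per_elem: int = 2,  # assumes a 16-bit KV cache
) -> float:
    """Approximate KV-cache footprint in GB for full attention."""
    # Factor 2 accounts for storing both keys and values.
    total_bytes = (
        2 * n_layers * n_kv_heads * head_dim * context_tokens * bytes_per_elem
    )
    return total_bytes / 1e9


def fits(total_gb: float, budget_gb: float) -> bool:
    """True when the estimated footprint is within the memory budget."""
    return total_gb <= budget_gb


# Hypothetical model: 120B parameters at ~4.5 bits per weight,
# 48 layers, 8 KV heads, head dimension 128, 30k-token context.
w = weights_gb(120, 4.5)
kv = kv_cache_gb(n_layers=48, n_kv_heads=8, head_dim=128, context_tokens=30_000)
total = w + kv
print(f"weights {w:.1f} GB + KV cache {kv:.2f} GB = {total:.1f} GB")
print("fits in an assumed 96 GB budget:", fits(total, 96.0))
```

How much of the 128GB pool the iGPU can actually address depends on firmware and OS settings, so the budget is best treated as a parameter to measure rather than a constant.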

📊 Competitor Analysis

Feature	Ryzen AI Max+ 395 (Strix Halo)	Intel Core Ultra 5 358H (Panther Lake)	Ryzen AI 9 HX 370 (Strix Point)	NVIDIA DGX Spark
Cores/Threads	16 Zen 5 / 32	Not specified (Xe3 iGPU, next-gen NPU)	12 Zen 5 / Not specified	10P+10E Arm cores
iGPU	Radeon 8060S (40 CU RDNA 3+) @ up to 2900 MHz	Xe3 Battlemage	Radeon 890M	Not specified
TDP (tested)	30W-120W (70W in LLM benchmarks)	30W (gaming test)	Not specified	Not specified
Benchmarks	11% faster multi-threaded Geekbench vs DGX Spark; strong LLM inference[1][article]	Competitive 1080p gaming at 30W[2]	~10% lower CPU perf than Max+[3]	Trails in multi-threaded[1]
Pricing	Compact systems pricey vs performance[1]	Not specified	Not specified	Not specified

🛠️ Technical Deep Dive

•16 full Zen 5 cores (no Zen 5c), up to 5.0 GHz boost, 16% IPC uplift over Zen 4 via branch prediction and refinements; supports AVX-512 for AI/scientific workloads[1][3][4].
•Radeon 8060S iGPU: 40 Compute Units RDNA 3+, clocked at 2900 MHz (future Gorgon Halo refresh to 3+ GHz), rivals discrete RTX 4060 Laptop GPU in gaming/AI[2][3][5].
•XDNA 2 NPU at 50 TOPS, total system AI up to 120 TOPS; LPDDR5x-8000 RAM support, PCIe 4, USB4; large L3 cache[3][4].
•ROCm 7.2 enables llama.cpp LLM inference with 30k context on 128GB RAM pools; configurable power profiles via firmware (45-120W TDP)[1][article].
•Tested in Ubuntu 24.04 for Geekbench; strong in parallel tasks like code compilation, content creation[1].
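For readers who want to run a comparable measurement, here is a minimal sketch that assembles a `llama-bench` command line (llama.cpp's bundled benchmarking tool). The exact settings used in the original Reddit post are not given here, so every value is an assumption: the model path is hypothetical, a 30,000-token prompt is only one way to approximate "30k context", and the binary is assumed to be a ROCm-enabled build available on `PATH`. The block only builds and prints the command; it does not execute it.

```python
import shlex
from typing import List


def build_llama_bench_cmd(
    model_path: str,
    n_prompt: int = 30_000,   # assumed stand-in for the "30k context" runs
    n_gen: int = 128,         # number of tokens to generate per run
    n_gpu_layers: int = 99,   # offload all layers to the iGPU
    repetitions: int = 3,
    binary: str = "llama-bench",
) -> List[str]:
    """Build an argument list for llama-bench with JSON output."""
    return [
        binary,
        "-m", model_path,
        "-p", str(n_prompt),
        "-n", str(n_gen),
        "-ngl", str(n_gpu_layers),
        "-r", str(repetitions),
        "-o", "json",
    ]


# Hypothetical model file name, for illustration only.
cmd = build_llama_bench_cmd("models/example-q4_k_m.gguf")
print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run(cmd, capture_output=True, text=True, check=True)` would execute it on a machine where the binary and model exist. Building the arguments as a list rather than a shell string avoids quoting problems with model paths.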

🔮 Future Implications

AI analysis grounded in cited sources

The Strix Halo benchmarks point to steadily improving local LLM capability on APUs. Compact, power-efficient AI workstations built on such chips could reduce the need for discrete GPUs and compete with Arm-based systems in multi-threaded AI and general computing, which may drive mini PC and handheld adoption as demand for local AI grows.

⏳ Timeline

2025-01

Ryzen AI Max+ 395 (Strix Halo flagship) introduced as high-end APU for mini PCs and laptops

2026-01

AMD Ryzen AI Max+ 392 (Strix Halo family) debuts with 12 Zen 5 cores, 40 CU Radeon 8060S iGPU

2026-02

Community benchmarks on Reddit showcase Strix Halo LLM performance via llama.cpp on ROCm 7.2

📎 Sources (7)

Factual claims are grounded in the sources below. Forward-looking analysis is AI-generated interpretation.