AdaReasoner, a 7B-parameter model accepted to ICLR 2026, beats GPT-5 on visual puzzles via dynamic tool orchestration in Agentic Vision. It learns what visual tools to use, when to use them, and how to use them, applying them iteratively in a way that mimics human investigation.
Key Points
- 1. Dynamic visual tool use
- 2. 7B model tops larger models
- 3. Agentic vision loop
Technical Details
AdaReasoner integrates thinking, action, and observation into a single loop. The code is open source, and the models are available on Hugging Face.
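A thinking-action-observation loop of this kind can be sketched roughly as follows. This is a minimal illustration, not AdaReasoner's actual code: the tool names (`crop`, `count`), the `Step` record, and the scripted `toy_policy` that stands in for the model are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical visual tools: each takes a string argument and returns a
# string observation. Real tools would operate on image data.
Tool = Callable[[str], str]


def crop_tool(arg: str) -> str:
    return f"cropped region {arg}"


def count_tool(arg: str) -> str:
    return f"counted 3 objects matching {arg}"


TOOLS: Dict[str, Tool] = {"crop": crop_tool, "count": count_tool}


@dataclass
class Step:
    thought: str                        # what the model reasons at this step
    action: Optional[Tuple[str, str]]   # (tool name, argument), or None to stop
    observation: Optional[str] = None   # tool output, filled in by the loop


Policy = Callable[[str, List[Step]], Step]


def toy_policy(question: str, history: List[Step]) -> Step:
    """Stand-in for the model: a fixed script keyed on the number of past steps."""
    if len(history) == 0:
        return Step("Look closer at the top-left corner.", ("crop", "top-left"))
    if len(history) == 1:
        return Step("Count the red squares there.", ("count", "red squares"))
    return Step("There is enough evidence to answer.", None)


def run_loop(question: str, policy: Policy, tools: Dict[str, Tool],
             max_steps: int = 5) -> List[Step]:
    """Alternate thinking, acting, and observing until the policy stops."""
    history: List[Step] = []
    for _ in range(max_steps):
        step = policy(question, history)           # think: pick next action
        if step.action is not None:
            name, arg = step.action
            step.observation = tools[name](arg)    # act, then observe
        history.append(step)
        if step.action is None:                    # policy chose to stop
            break
    return history


if __name__ == "__main__":
    for s in run_loop("How many red squares are there?", toy_policy, TOOLS):
        print(s.thought, "|", s.action, "|", s.observation)
```

The point of the sketch is that the choice of tool, its argument, and the stopping decision all come from the policy at each step, rather than from a fixed pipeline. In a real system the policy would be the trained model conditioning on the accumulated observations.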
