All Updates
April 3, 2026
Predicting Agent Coding Task Performance
Introduces a framework that uses augmented Item Response Theory (IRT) to predict success or failure on individual agentic coding tasks. It decomposes agent ability into LLM and scaffold components so that results can be aggregated across leaderboards. This enables predictions for unseen benchmarks and agent combinations, and aids benchmark calibration.
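As a rough illustration of the idea, here is a minimal sketch of an IRT-style predictor with ability split into an LLM term and a scaffold term. The additive form, the two-parameter logistic curve, and every name and number below are assumptions made for illustration; the summary does not give the framework's actual model.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_success(theta_llm: float, theta_scaffold: float,
                    difficulty: float, discrimination: float = 1.0) -> float:
    """Probability that an agent solves a task under a 2PL-style IRT model.

    Assumption: agent ability is the sum of an LLM component and a
    scaffold component. The real framework may combine them differently.
    """
    ability = theta_llm + theta_scaffold
    return sigmoid(discrimination * (ability - difficulty))


# Hypothetical fitted parameters (not taken from any leaderboard).
llm_ability = {"llm_a": 1.2, "llm_b": 0.3}
scaffold_ability = {"scaffold_x": 0.5, "scaffold_y": -0.2}
task_difficulty = {"task_easy": -1.0, "task_hard": 2.0}


def predict(llm: str, scaffold: str, task: str) -> float:
    """Look up the component abilities and predict success on one task."""
    return predict_success(llm_ability[llm], scaffold_ability[scaffold],
                           task_difficulty[task])


if __name__ == "__main__":
    # A pairing never observed together can still be scored, because each
    # component has its own parameter.
    print(round(predict("llm_b", "scaffold_x", "task_hard"), 3))
```

Because each LLM and each scaffold carries its own parameter, a pairing that never appeared on any leaderboard can still receive a prediction, which is the property the summary highlights.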
PG-IPRO: Interactive Accessible Route Optimizer
PG-IPRO introduces preference-guided iterative optimization for urban route planning tailored to accessibility needs. Users interact by providing feedback on routes, specifying objectives to minimize or relax. It enables intuitive early iterations and computational efficiency by avoiding full Pareto front computation.
Neurosymbolic Ontology Grounds Enterprise AI Agents
Introduces a neurosymbolic architecture in FAOS using three-layer ontologies to constrain LLMs, reducing hallucinations and ensuring compliance in enterprise settings. Evaluated over 600 runs in five industries, it outperforms baselines on accuracy, compliance, and consistency. Contributions include ontology models, coupling taxonomy, and production deployment for 650+ agents.
Google Vids Adds AI Video & Music Tools
Google has upgraded Google Vids in Workspace with significant new AI tools for video and music creation. This integrates AI video generation into everyday productivity workflows. It contrasts with OpenAI pulling back its Sora model.
Daina Tech Raises $14M for AI Labs
Daina Technology secured nearly 100M RMB in B+ round funding for its AI-driven Black Lamp Labs. The funds target technology upgrades, expansion, and the first global new materials CRO platform, developed with Beijing Chemical University. The company reports 100% delivery of fully unmanned labs for AI for Science (AI4S).
CircuitProbe Predicts Transformer Circuits in Minutes
CircuitProbe predicts reasoning circuits in Transformers from activation statistics in under 5 minutes on CPU, a speedup of 3-4 orders of magnitude over brute-force methods. It detects stability circuits in early layers via representation change derivatives and magnitude circuits in late layers via anomaly scoring. Validated across 9 models and multiple languages, it aids small LLMs under 3B parameters via layer duplication.
BloClaw: Omniscient Agentic Workspace for AI Science
BloClaw is a unified multi-modal OS for AI4Science, overcoming bottlenecks in agent frameworks like fragile JSON tool-calling and poor visualization handling. It introduces XML-Regex routing (0.2% error rate), runtime state interception for dynamic plots, and adaptive UI. Benchmarked on RDKit, ESMFold, docking, and RAG; open-source on GitHub.
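To make the routing idea concrete, here is a minimal sketch of regex-based extraction of XML-tagged tool calls from model output. The `<tool name="...">` tag shape, the `route` function, and the handler table are hypothetical; BloClaw's actual tag schema is not described in this summary.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical tag format: <tool name="some_tool">free-form payload</tool>
TOOL_CALL = re.compile(
    r'<tool\s+name="(?P<name>[\w.-]+)"\s*>(?P<body>.*?)</tool>',
    re.DOTALL,
)


def route(text: str,
          handlers: Dict[str, Callable[[str], str]]
          ) -> List[Tuple[str, Optional[str]]]:
    """Find every tool call in `text` and dispatch it to a handler.

    Returns (tool_name, result) pairs; the result is None when no handler
    is registered under that name.
    """
    results: List[Tuple[str, Optional[str]]] = []
    for match in TOOL_CALL.finditer(text):
        name = match.group("name")
        body = match.group("body").strip()
        handler = handlers.get(name)
        results.append((name, handler(body) if handler else None))
    return results


if __name__ == "__main__":
    output = 'Thinking... <tool name="upper">c1ccccc1 "benzene"</tool> done.'
    print(route(output, {"upper": str.upper}))
```

One plausible reason for this design is that the payload between the tags is taken verbatim, so unescaped quotes or braces in it do not break parsing the way they would inside a JSON string.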
Agent Judges Match Humans, Reveal Scaling Laws
LLM-based agent judges match human raters in Turing-style validation across 960 sessions on 15 tasks. Quality scores improve logarithmically with panel size, while unique issue discoveries follow a sublinear power law. Big Five personality conditioning enables diverse probing for better coverage.
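The two scaling shapes mentioned above can be sketched as simple functions. The coefficients below are hypothetical placeholders chosen only to show the curve shapes; they are not the fitted values from the study.

```python
import math


def quality_score(panel_size: int, base: float = 3.0, gain: float = 0.5) -> float:
    """Logarithmic growth: each doubling of the panel adds the same increment."""
    return base + gain * math.log(panel_size)


def unique_issues(panel_size: int, scale: float = 4.0, exponent: float = 0.6) -> float:
    """Sublinear power law (exponent < 1): doubling the panel less than
    doubles the number of distinct issues found."""
    return scale * panel_size ** exponent


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        print(n, round(quality_score(n), 2), round(unique_issues(n), 2))
```

Under both shapes, adding judges keeps helping, but with diminishing returns per added judge.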
Adaptive MCTS Cuts LLM Test-time Latency
Monte Carlo Tree Search (MCTS) boosts LLM reasoning but causes long-tail latency. A new negative early exit mechanism prunes unproductive paths, while adaptive boosting reallocates compute. Integrated into vLLM, the approach cuts p99 latency and raises throughput while preserving accuracy.
Qwen3.6 Medium Sizes Open Soon
The Qwen team announced plans to open-source medium-sized Qwen3.6 models for local deployment and customization. Developers can vote on their most anticipated model size via a Twitter poll. Shared on r/LocalLLaMA with a link to ChujieZheng's post.
MiMo LLM Exceeds 1T Token Calls
Xiaomi CEO Lei Jun announced that the MiMo large model surpassed 1 trillion token calls yesterday. This milestone underscores rapid growth in usage of Xiaomi's AI model. Reported by 36Kr.
Japanese Startup Demos Dancing Humanoid Robot
Tokyo Robotics, a Waseda University spin-off startup, released a demo video of its 'Torobo Humanoid' prototype. The robot showcases bipedal walking and full-body movements including dance, all via remote control. This marks a milestone for domestic Japanese humanoid robotics.
Zhiyuan Starts AI Release Week
Zhiyuan (AGIBOT) announced 'Zhiyuan AI Release Week', starting April 7. It will reveal one core technology breakthrough per day over six days, unveiling the company's full physical AI capability landscape.
Buffett Avoids AI, Restarts Charity Lunch
Warren Buffett restarted his annual charity lunch auction after a two-year pause, planning to match winning bids to the Glide Foundation. He remains bearish on current stock markets and is waiting for deeper discounts. He admits he won't invest in AI because he does not understand it, preferring familiar businesses such as Apple, which he views as a consumer brand. He also distanced himself from Bill Gates amid Epstein scandal concerns.
SJTU AI College Teams with Ant Health on Med AI Lab
Shanghai Jiao Tong University AI College and Ant Health signed a cooperation agreement to launch the AI4HealthCare Joint Lab. Focus areas include R&D of medical specialist AI agents and clinical adaptation applications. Initial results will be deployed in the Ant Afu app.
Gemma 4 26B Hits 81 Tok/Sec on M5 Max MacBook
The Gemma 4 26B A4B model runs at an average of 81 tokens per second on a MacBook Pro with the M5 Max. Peak power usage is 114 watts, reached only in short bursts because responses finish quickly. This demonstrates high-speed local inference on Apple silicon.
Gemma-4-E2B-IT beats Qwen3.5-4B in speed
Gemma-4-E2B-IT matches or exceeds Qwen3.5-4B performance while offering much shorter average reasoning times. Posted on r/LocalLLaMA with a link to the discussion.
OPPO K15 Pro Series Launches at 2899 Yuan
The OPPO K15 Pro series officially goes on sale today, with the Dimensity 8500 SUPER in the Pro and the Dimensity 9500s in the Pro+, starting at 2899 yuan, or 2634 yuan with subsidies. Key features include 144Hz/165Hz high-refresh displays, IP69 dust and water resistance, and built-in cooling fans. Design touches such as forged carbon and customizable breathing lights add to the appeal.
Qwen 3.6 Tops Chinese Programming Benchmarks
A global authoritative large-model blind-test leaderboard has been released. Alibaba's Qwen 3.6 ranks first as China's strongest programming model. Additional models in the Qwen series were also launched recently.
Microsoft's $10B AI Investment in Japan
Microsoft announced a four-year, $10 billion investment in Japan to bolster AI services. This initiative forms part of the company's broader Asia expansion strategy. The move targets a region with strong demand for AI technologies.