All Updates
April 2, 2026
OpenTools: Community Framework for Reliable AI Agents
OpenTools is a community-driven toolbox that standardizes tool schemas, offers plug-and-play wrappers, and evaluates tools via automated tests and monitoring. It tackles both tool-use accuracy and intrinsic tool reliability for LLMs. Experiments demonstrate 6%-22% performance gains over existing toolboxes across agent architectures.
Independent GPT-OSS-20B Benchmark Reproduction
Researchers reverse-engineered gpt-oss-20b's in-distribution tools by prompting the model without tool definitions, confirming that it makes high-confidence tool calls. They built a native harmony agent harness, available on GitHub, that avoids the losses introduced by the Chat Completions API. The result is the first independent reproduction of OpenAI's scores: 60.4% SWE HIGH, 53.3% MEDIUM, 91.7% AIME25.
HITL Curbs LLM Objective Drift in CS Education
LLM use in CS education causes objective drift, in which plausible outputs diverge from the specification. The paper proposes a human-in-the-loop (HITL) pedagogy that uses control theory to keep AI use stable. A pilot curriculum teaches planning and setting criteria before code generation, and an accompanying study compares methods.
EVOM: Execution-Verified RL for Optimization
EVOM is a new framework that uses execution-verified reinforcement learning to automate optimization modeling with LLMs. It treats solvers such as Gurobi as interactive verifiers: the model generates code, the code runs in a sandbox, and the outcome becomes a scalar reward optimized via GRPO/DAPO. It outperforms process-supervised SFT and supports zero-shot solver transfer across Gurobi, OR-Tools, and COPT.
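The reward loop described above can be sketched roughly as follows. This is a minimal illustration, not EVOM's implementation: the reward values (1.0, 0.1, 0.0), the tolerance, and the function names `execution_reward` and `group_relative_advantages` are assumptions, and no real solver is called. The second function shows the GRPO-style step of normalizing rewards within a group of samples for the same problem.

```python
from statistics import mean, pstdev
from typing import List, Optional


def execution_reward(ran_ok: bool, objective: Optional[float],
                     reference: float, rel_tol: float = 1e-6) -> float:
    """Hypothetical scalar reward derived from a sandboxed solver run.

    0.0 -> generated code crashed or returned no objective value
    0.1 -> code ran, but the objective does not match the reference
    1.0 -> objective matches the reference within a relative tolerance
    """
    if not ran_ok or objective is None:
        return 0.0
    scale = max(abs(reference), 1.0)
    return 1.0 if abs(objective - reference) <= rel_tol * scale else 0.1


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """GRPO-style advantages: center and scale rewards within one sample group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled programs for the same problem (made-up outcomes).
rewards = [
    execution_reward(True, 42.0, 42.0),   # correct
    execution_reward(True, 40.0, 42.0),   # ran, wrong objective
    execution_reward(False, None, 42.0),  # crashed
    execution_reward(True, 42.0, 42.0),   # correct
]
advantages = group_relative_advantages(rewards)
```

Samples that execute and match the reference receive positive advantages, while crashes are pushed below the group mean, so no step-level supervision is needed.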
E-STEER: Emotion Steering in LLMs
Researchers introduce E-STEER, an interpretable framework embedding emotions as controllable variables in LLM hidden states. It examines emotions' mechanistic impact on reasoning, generation, safety, and agent behaviors. Findings reveal non-monotonic effects aligning with psychology, with emotions boosting capabilities and safety.
Decision-Centric Design for LLM Systems
This work proposes a decision-centric framework that separates decision signals from action policies in LLM systems, making control explicit and inspectable. The separation lets failures be attributed to specific components and lets each component be improved independently. Experiments demonstrate fewer futile actions, higher task success, and interpretable failure modes.
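One way to picture the separation of decision signals from action policies is the sketch below. Everything in it is hypothetical and chosen only for illustration: the `DecisionSignal` type, the "needs_retrieval" signal, the 0.5 threshold, and the action names do not come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union


@dataclass(frozen=True)
class DecisionSignal:
    """An explicit, loggable estimate that a policy can act on."""
    name: str
    value: float  # hypothetical: estimated probability that retrieval helps


def estimate_signal(query: str) -> DecisionSignal:
    # Stand-in estimator; a real system might use a classifier or the LLM itself.
    score = 0.9 if "latest" in query.lower() else 0.2
    return DecisionSignal(name="needs_retrieval", value=score)


def threshold_policy(signal: DecisionSignal, threshold: float = 0.5) -> str:
    # The policy sees only the signal, so it can be swapped or tuned alone.
    return "retrieve" if signal.value >= threshold else "answer_directly"


def run_step(query: str,
             estimator: Callable[[str], DecisionSignal],
             policy: Callable[[DecisionSignal], str]) -> Dict[str, Union[str, float]]:
    signal = estimator(query)
    action = policy(signal)
    # Recording both lets a failure be traced to the estimator or the policy.
    return {"query": query, "signal": signal.value, "action": action}


trace = run_step("What is the latest release?", estimate_signal, threshold_policy)
```

Because the trace records the signal and the chosen action separately, a bad outcome can be traced either to a miscalibrated signal or to a poorly chosen threshold.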
Connections: AI Social Intelligence Benchmark
Researchers introduce Connections, an improvisational wordplay game, as a benchmark for evaluating AI agents' social intelligence. It tests skills in knowledge retrieval, summarization, and awareness of other agents' cognitive states beyond solo reasoning. The game emphasizes collaboration and social awareness in constrained multi-agent environments.
Collaborative AI Agents for Network Fault Detection
Researchers introduce algorithms for collaborative AI agents and critics in federated multi-agent systems using ML or foundation models. They tackle multimodal tasks like network telemetry fault detection via critic feedback without direct inter-agent communication, minimizing system costs. Convergence guarantees are provided using multi-time scale stochastic approximation, with low O(m) communication overhead.
CAMP: Adaptive Multi-Agent Clinical Prediction
CAMP dynamically assembles specialist panels tailored to each case's diagnostic uncertainty in clinical prediction. Specialists use three-valued voting (KEEP/REFUSE/NEUTRAL) for principled abstention, with a hybrid router managing consensus, fallback, or argument-based arbitration. It outperforms baselines on MIMIC-IV across four LLMs while using fewer tokens.
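The three-valued vote and the router's three outcomes might be sketched as follows. The routing rule here (ignore NEUTRAL votes, require a 75% share of decisive votes for consensus) is an assumption made for illustration and is not CAMP's actual router.

```python
from collections import Counter
from enum import Enum
from typing import List


class Vote(Enum):
    KEEP = "KEEP"
    REFUSE = "REFUSE"
    NEUTRAL = "NEUTRAL"  # principled abstention


def route(votes: List[Vote], consensus_ratio: float = 0.75) -> str:
    """Hypothetical router: consensus, fallback, or arbitration."""
    decisive = [v for v in votes if v is not Vote.NEUTRAL]
    if not decisive:
        # Every specialist abstained, so defer to a fallback predictor.
        return "fallback"
    top_vote, top_count = Counter(decisive).most_common(1)[0]
    if top_count / len(decisive) >= consensus_ratio:
        return f"consensus:{top_vote.value}"
    # Specialists disagree, so escalate to argument-based arbitration.
    return "arbitration"


agree = route([Vote.KEEP, Vote.KEEP, Vote.NEUTRAL])
abstain = route([Vote.NEUTRAL, Vote.NEUTRAL])
split = route([Vote.KEEP, Vote.REFUSE, Vote.NEUTRAL])
```

Treating NEUTRAL as an abstention rather than a weak vote keeps uncertain specialists from diluting a clear majority.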
Fliggy-Qianwen Add 30+ Travel AI Partners
Fliggy and Qianwen announced AI partnerships with over 30 more travel brands, bringing the total to over 80 global partners. Fliggy data shows Qingming holiday ticket bookings surged 70% YoY.
Nvidia AI Factories Deployed in China
Nvidia CEO Jensen Huang's frequently mentioned 'AI factories' have now launched in China. This introduces a new paradigm in compute competition. It may enable scalable deployment of AI agents.
Doubao Hits 120T Daily Tokens; Seedance API Beta
Volcano Engine launched the Seedance 2.0 API in public beta for enterprises. The Doubao model's daily token usage surpassed 120 trillion by March, doubling in 3 months and up 1000x since May 2024. The number of enterprises using more than 1 trillion tokens on the platform grew from 100 to 140.
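The growth figures above imply some earlier values that are not stated outright. The short calculation below derives them. The results are approximations that follow from the reported multipliers, not separately reported numbers.

```python
# Reported: daily token usage surpassed 120 trillion by March.
current_daily_tokens = 120e12

# "Doubling in 3 months" implies roughly half that volume three months earlier.
three_months_earlier = current_daily_tokens / 2

# "Up 1000x since May 2024" implies roughly one thousandth of it in May 2024.
may_2024 = current_daily_tokens / 1000

# Enterprises above 1 trillion tokens grew from 100 to 140.
enterprise_growth_pct = (140 - 100) / 100 * 100

print(f"Three months earlier: ~{three_months_earlier / 1e12:.0f} trillion tokens/day")
print(f"May 2024: ~{may_2024 / 1e9:.0f} billion tokens/day")
print(f"Enterprise growth: {enterprise_growth_pct:.0f}%")
```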
Qwen Code v0.14.0-preview.5: New Channels & Cron Features
Qwen Code releases v0.14.0-preview.5, featuring an extensible Channels platform with Telegram, WeChat, and DingTalk support, plus in-session cron scheduling for loops. It includes cross-provider model selection for subagents, npm registry for extensions, and fixes for stability issues like PTY leaks and orphan processes. Numerous UI, CLI, and VSCode enhancements improve reliability.
Longxia Slashes AI Scheduling Costs 58%
Tsinghua University, Renmin University, and Mianbi released Longxia, an open-source intelligent scheduler reducing costs by 58%. It prioritizes data privacy, ensuring sensitive data stays in-house. This tool targets efficient AI workload management.
Breakout AI-Essential Indie Game Success Story
A studio that secured millions in funding created the first breakout AI-native standalone game. Unlike games with AI features, this title cannot exist without AI. The producer stresses that 'fun' remains the eternal core proposition.
Wanxing Juchang Debuts Full Seedance 2.0 Powers
Wanxing Tech's Wanxing Juchang platform relaunched on April 2 with full Seedance 2.0 model capabilities. It enables industrial workflows for AI real-person dramas, 2D/3D animations, and image-to-video, using multimodal inputs for director-level control. Features include auto plot completion, lip-sync, consistent characters/scenes, and batch 2K HD outputs.
AVIC Jonhon Restructures Liquid Cooling for AI
AVIC Jonhon will restructure its liquid cooling division in early 2026 to focus on civilian business amid surging data center and AI computing demand. The new unit handles the full value chain from R&D to marketing. The defense liquid cooling business will be spun off into a new environmental control department.
DigClaw Raises Angel Funding for Talent Hunt
DigClaw, which targets top talent before they have even registered a company, secured angel funding from Zhongke Chuangxing and Zhongguancun Capital. The firm maps the decision chains behind enterprises, not just names. The funding supports early-stage AI talent scouting.
DeskClaw Integrates Seedance 2.0 for Videos
NoDesk AI's DeskClaw releases a new version integrating Seedance 2.0, becoming the first Claw product for e-commerce workflows. Users upload product images and a description to auto-generate short videos with storyboards, voiceovers, and scene transitions.
OpenClaw Launches ByteDance-Backed China Mirror
OpenClaw announced an official China mirror for ClawHub on April 1, enabling faster and more stable skill searches for Chinese users. The localized site is built atop ClawHub with infrastructure support from ByteDance.