All Updates
March 25, 2026
Survey: Static to Dynamic LLM Agent Workflows
This arXiv survey reviews workflow optimization methods for LLM-based agents, modeled as agentic computation graphs (ACGs). It categorizes approaches by when structure is determined (static vs. dynamic), optimization targets, and evaluation signals like task metrics or traces. It introduces structure-aware evaluation focusing on graph properties, cost, and robustness.
STEM Agent: Adaptive Multi-Protocol AI Architecture
STEM Agent is a modular AI agent architecture inspired by biological pluripotency, enabling self-adaptation to diverse protocols, tools, and user models. It unifies five interoperability protocols behind a single gateway and features a Caller Profiler that learns user preferences across 20+ dimensions. The framework includes biologically-inspired skills acquisition and efficient memory consolidation, validated by a 413-test suite.
SRM: Temporal Safety Gates for AI Agents
Session Risk Memory (SRM) extends stateless pre-execution safety gates with trajectory-level authorization to detect distributed attacks across agent sessions. It tracks a semantic centroid and risk via exponential moving average on gate outputs, requiring no extra training. ILION+SRM hits perfect F1=1.0000 and 0% FPR on benchmarks with <250 μs overhead.
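As a rough illustration of session-level tracking with an exponential moving average, the sketch below folds per-action gate scores into a running risk value and a running centroid. It is not the SRM implementation: the class name, the `alpha` and `threshold` values, and the use of plain lists as embeddings are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SessionRiskMemory:
    """Hypothetical session state; names and defaults are illustrative only."""
    alpha: float = 0.5       # EMA smoothing factor (assumed value)
    threshold: float = 0.5   # session-level block threshold (assumed value)
    risk: float = 0.0        # running EMA of gate risk scores
    centroid: Optional[List[float]] = None  # running EMA of action embeddings

    def update(self, gate_risk: float, embedding: List[float]) -> bool:
        """Fold one gate output into the session; return True to block."""
        # EMA of the per-action risk score reported by the stateless gate.
        self.risk = self.alpha * gate_risk + (1 - self.alpha) * self.risk
        # EMA of the embeddings gives a drifting semantic centroid.
        if self.centroid is None:
            self.centroid = list(embedding)
        else:
            self.centroid = [
                self.alpha * e + (1 - self.alpha) * c
                for e, c in zip(embedding, self.centroid)
            ]
        return self.risk > self.threshold
```

With these assumed settings, two consecutive actions scored 0.7 would each pass a per-action gate set at 0.8, yet the second pushes the session risk to 0.525 and trips the session threshold.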
Neural Net Masters Intuition-Deliberation Split
A bounded dual-path neural architecture separates intuition and deliberation pathways for syllogistic reasoning on a 64-item benchmark. Deliberation achieves r=0.8152 correlation vs. intuition's r=0.7272, with significant gains on rejection responses and specific conclusions. Interpretability reveals sparse, differentiated internal states consistent with reasoning-like organization.
Memory Bear: Multimodal Affective Memory Engine
Memory Bear AI introduces a memory-centered framework for multimodal affective intelligence, treating emotions as evolving variables in a structured memory system. Multimodal signals are converted into Emotion Memory Units (EMUs) for persistent storage, retrieval, and updating across interactions. It outperforms baselines in accuracy and robustness, especially under noisy or missing data.
MaxEnt Scales Synthetic Populations Beyond Raking
A new maximum-entropy relaxation matches multi-way cardinality constraints in expectation for synthetic population generation. It formulates an exponential-family distribution solved via convex optimization over Lagrange multipliers. It outperforms generalized raking on NPORS benchmarks with 4-40 attributes and ternary interactions.
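To make the exponential-family idea concrete, here is a toy sketch, not the paper's algorithm: two binary attributes, three made-up target proportions (two marginals and one pairwise interaction), and plain gradient descent on the convex dual over the multipliers. The function name, targets, step size, and iteration count are assumptions chosen for the example.

```python
import itertools
import math

# Feature functions over a person type x = (a, b); all hypothetical.
FEATURES = [
    lambda x: x[0],          # attribute a is present
    lambda x: x[1],          # attribute b is present
    lambda x: x[0] * x[1],   # both present (pairwise interaction)
]
# Made-up target proportions; expected counts would be N times these.
TARGETS = [0.4, 0.5, 0.3]


def fit_maxent(features, targets, steps=5000, lr=0.5):
    """Fit p(x) proportional to exp(sum_k lam_k * f_k(x)) so E_p[f_k] matches targets."""
    types = list(itertools.product([0, 1], repeat=2))
    lam = [0.0] * len(features)

    def distribution(lam):
        weights = [
            math.exp(sum(l * f(x) for l, f in zip(lam, features)))
            for x in types
        ]
        z = sum(weights)  # partition function
        return [w / z for w in weights]

    for _ in range(steps):
        probs = distribution(lam)
        # Gradient of the dual is (model expectation - target) per feature.
        expect = [
            sum(p * f(x) for p, x in zip(probs, types)) for f in features
        ]
        lam = [l - lr * (e - t) for l, e, t in zip(lam, expect, targets)]
    return types, distribution(lam)
```

Sampling a synthetic population would then amount to drawing person types from the fitted probabilities, so the constraints hold in expectation rather than exactly.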
LLM Performance Crashes in Multi-Instance Tasks
LLMs excel at single tasks but degrade in multi-instance processing like aggregating sentiments from multiple reviews. Performance dips slightly for 20-100 instances, then collapses at larger scales. Instance count impacts results more than context length.
Intelligence Inertia Physics for AI Costs
This arXiv paper introduces 'intelligence inertia' as a physical principle explaining super-linear computational costs in reconfiguring intelligent systems, rooted in non-commutativity of rules and states. It derives a Lorentz-like non-linear cost formula revealing a 'computational wall' ignored by classical models. Validation comes via three experiments: J-curve comparison, neural architecture trajectories, and an inertia-aware training scheduler.
DF-GCN Boosts Multimodal Emotion Recognition
DF-GCN introduces dynamic fusion in graph convolutional networks for multimodal emotion recognition in conversations (MERC). It integrates ODEs to capture evolving emotional dependencies and uses GIV prompts for adaptive multimodal feature fusion. Experiments on public datasets confirm superior performance and better generalization.
DeIllusionLLM Bridges LLM Know-Act Gap
LLMs show a know-act gap: they detect input flaws discriminatively but generate flawed answers. Researchers introduce FaultyScience benchmark and DeIllusionLLM, a task-level autoregressive framework using self-distillation to unify judgment and reasoning. It reduces error-ignoring responses while preserving reasoning ability.
ChatGPT Grades Interviews Better Than Humans
The author used ChatGPT to grade job interview answers and found it more useful than real interviews. ChatGPT effectively simulates and critiques interview performance.
AI Model Market Arbitrage
Arbitrageurs profit by allocating inference across AI model providers, undercutting market prices without development risk. Empirical study on SWE-bench shows 40% net margins using GPT-5 mini and DeepSeek v3.2. Arbitrage lowers prices, aids small providers, but hurts large model revenues.
AI Boom Reality: 6K Firms Show Low Productivity
NBER surveyed about 6,000 executives from 6K companies in the US, UK, Germany, and Australia. 70% have adopted some AI, but executives use it only 1.5 hours per week on average. The survey also reveals gaps in employment outlook between management and employees.
AI Agents Excel with Private Language over LoT
This arXiv paper introduces the Efficiency Attenuation Phenomenon (EAP), where AI agents in MARL develop inscrutable protocols outperforming human-like symbolic languages by 50.5% in navigation tasks. It challenges the Language of Thought (LoT) hypothesis, arguing optimal cognition relies on sub-symbolic computations. The findings bridge AI, cognitive science, and philosophy with ethics implications.
Ex-Li Auto AI Chief Launches Robot Startup
Lang Xianpeng, former smart driving VP at Li Auto, has co-founded Beijing Kunlunxing Robot Tech with Ren Geng, focusing on embodied intelligence. The company was registered in March 2026 and secured top funding before launch. The move highlights a talent exodus from ADAS to robotics amid 200B RMB in sector investments.
Musk to Build $20B Chip Fab, 200B/Year for Space
Elon Musk plans to invest $20 billion in a self-built chip factory. The facility aims to produce 200 billion chips annually, with 80% operating in space, likely for satellite networks. This scales custom silicon for orbital computing infrastructure.
TikTok Tests Short-Drama Feed, Plans Originals
TikTok is testing a dedicated short-drama feed in the US and several other markets. The company has started casting for a soap opera-style short-drama project. It previously filed a US trademark for "TikTok Drama" covering content development and production.
Alibaba's Top RISC-V Chip for China AI Models
Alibaba has unveiled a new RISC-V server chip, claiming it as the most powerful processor ever using the RISC-V instruction set. The chip is optimized to run China's top AI models and sets performance records. However, it appears years behind Western processors.
Tencent Yuanbao Pai PC Version Launches
Tencent launched a PC version of Yuanbao Pai from its AI-native app Yuanbao on March 25. Users can share screens while chatting in separate windows with friends or Yuanbao. It supports multi-device sync, file drag-and-drop, and screenshots for efficient information flow.
2026 Embodied AI Shakeout: Profitability Purge Begins
The embodied intelligence field enters a consolidation phase in 2026. Humanoid robots failing to achieve profitability will face elimination first. This sets the stage for commercially viable robotics.