All Updates
April 15, 2026
Self-Monitoring Needs Structural Integration
Metacognition-style self-monitoring modules provide no benefit when trained only as auxiliary losses in continuous-time, multi-timescale RL agents across various predator-prey environments. Structurally integrating their outputs into decision pathways yields marginal improvements in non-stationary settings but still does not outperform baselines. Key lesson: self-monitoring must influence decisions directly, not run in parallel.
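The wiring distinction in this entry can be sketched with toy linear layers (all shapes and weights below are illustrative stand-ins, not the paper's architecture): in the parallel setup the monitor's output never enters the action computation, while structural integration concatenates it into the policy's input.

```python
import numpy as np

rng = np.random.default_rng(1)
obs = rng.normal(size=8)                      # toy observation
W_policy = rng.normal(size=(4, 8))            # plain policy head
W_monitor = rng.normal(size=(2, 8))           # self-monitoring head
W_integrated = rng.normal(size=(4, 8 + 2))    # policy that also sees the monitor

# Parallel wiring: the monitor's output never reaches the policy;
# it can only contribute an auxiliary training loss elsewhere.
monitor_out = W_monitor @ obs
action_parallel = W_policy @ obs

# Structural integration: the monitor's output is concatenated into the
# policy's input, so it can influence the chosen action directly.
action_integrated = W_integrated @ np.concatenate([obs, monitor_out])
```

Under the entry's finding, only the second wiring lets self-monitoring pay off, because its signal actually reaches the decision.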
Science Stuck in Local Minima Like ML
Scientific knowledge settles into local optima due to path dependence, cognitive biases, and institutional lock-in. Drawing an analogy to gradient descent in ML, the argument is that science follows tractable local gradients rather than global truths; the piece proposes meta-scientific interventions to escape these traps.
Memory Worth: Agent Memory Governance
This arXiv paper proposes Memory Worth (MW), a lightweight two-counter metric per memory that tracks co-occurrence with task successes versus failures in agent systems. It converges theoretically to the conditional success probability and shows strong empirical correlation (ρ=0.89) with true utilities. The approach enables staleness detection, retrieval suppression, and deprecation with minimal overhead.
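A minimal sketch of how such a two-counter metric could work (the class name, Laplace smoothing, and update rule here are illustrative assumptions, not the paper's exact formulation):

```python
from dataclasses import dataclass

@dataclass
class MemoryWorth:
    """Two counters per memory: co-occurrences with successes and failures."""
    successes: int = 0
    failures: int = 0

    def update(self, retrieved: bool, task_succeeded: bool) -> None:
        # Only memories actually retrieved during the task are updated.
        if not retrieved:
            return
        if task_succeeded:
            self.successes += 1
        else:
            self.failures += 1

    @property
    def worth(self) -> float:
        # Laplace-smoothed estimate of P(success | memory retrieved),
        # which the counts converge to as observations accumulate.
        return (self.successes + 1) / (self.successes + self.failures + 2)

mw = MemoryWorth()
for outcome in [True, True, False, True]:
    mw.update(retrieved=True, task_succeeded=outcome)
print(round(mw.worth, 2))  # 0.67
```

Low-worth memories could then be suppressed at retrieval time or deprecated outright, which is what keeps the overhead minimal: two integers per memory.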
Memory as Metabolism for Companion LLMs
This arXiv paper proposes 'Memory as Metabolism,' a design for companion knowledge systems that mirrors user knowledge while compensating for epistemic failures like entrenchment. It outlines five core operations (TRIAGE, DECAY, CONTEXTUALIZE, CONSOLIDATE, AUDIT), backed by memory gravity and minority-hypothesis retention. The framework addresses governance for single-user LLM wikis in a 2026 landscape of emerging agent memory systems.
Identity as Attractor in LLM Activation Space
Large language models exhibit attractor-like dynamics in which semantically related prompts map to similar representations. An experiment on Llama 3.1 8B shows that agent identity documents (cognitive_core) cause paraphrases to cluster more tightly than controls in hidden-state space. The effect replicates on Gemma 2 9B, with evidence that it is semantic: reading agent descriptions shifts hidden states toward the attractor.
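The tighter-clustering claim can be illustrated with a toy measurement (random vectors stand in for real hidden states; `mean_pairwise_cosine` is a hypothetical helper, not code from the experiment):

```python
import numpy as np

def mean_pairwise_cosine(vectors: np.ndarray) -> float:
    """Average cosine similarity over all distinct pairs of row vectors."""
    normed = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = normed @ normed.T
    n = len(vectors)
    # Exclude the diagonal (self-similarity) from the average.
    return (sims.sum() - n) / (n * (n - 1))

rng = np.random.default_rng(0)
center = rng.normal(size=64)                          # toy "attractor"
paraphrases = center + 0.1 * rng.normal(size=(8, 64)) # tight cluster
controls = rng.normal(size=(8, 64))                   # unrelated prompts

assert mean_pairwise_cosine(paraphrases) > mean_pairwise_cosine(controls)
```

Applied to actual hidden states, a higher mean pairwise similarity for identity-document paraphrases than for matched controls is the kind of evidence the entry describes.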
Human-Like Selective Memory for Social Robots
This arXiv paper introduces a human-inspired context-selective multimodal memory architecture for social robots, capturing textual and visual episodic memories based on emotional salience or novelty. It outperforms human consistency in selective storage (Spearman ρ=0.506) and boosts multimodal retrieval Recall@1 by 13%. The system enables personalized, natural human-robot interactions with real-time performance.
HORIZON Diagnoses LLM Agent Long-Horizon Failures
Introduces the HORIZON benchmark to systematically diagnose long-horizon failures in LLM agents across domains. Evaluates SOTA models like GPT-5 variants and Claude on 3100+ trajectories, revealing degradation patterns. Releases a leaderboard and an LLM-as-a-Judge pipeline validated against human annotations (κ=0.84).
GoodPoint: LLM Constructive Paper Feedback
GoodPoint curates a 19K ICLR paper dataset annotated with reviewer feedback using author responses, defining effectiveness via validity and author action. It introduces a training recipe with fine-tuning and preference optimization, boosting Qwen3-8B's success rate by 83.7% and achieving SOTA among similar LLMs. Expert human studies confirm higher practical value for authors.
Framework for Longitudinal Health AI Agents
Researchers propose a multi-layer framework and agent architecture for AI supporting longitudinal health tasks like symptom management and patient support. It operationalizes adaptation, coherence, continuity, and agency across repeated interactions. Use cases show sustained engagement, goal adaptation, and safe personalized decision-making.
ArcDeck: Narrative Paper-to-Slide AI
ArcDeck is a multi-agent framework that generates slides from academic papers by modeling logical flow via discourse trees and global commitment documents. Specialized agents iteratively refine outlines before rendering visuals. It introduces ArcBench, a new benchmark showing improved narrative coherence.
Nvidia AI Models Boost Quantum Stocks
Nvidia unveiled new open-source AI models designed to accelerate quantum computing progress. This announcement triggered a surge in Asian software and IT stocks focused on quantum computing.
Alibaba, ByteDance Target Zhipu, MiniMax Pricing
Alibaba and ByteDance are aggressively pursuing Zhipu AI and MiniMax in a competitive landscape. The core issue is who controls AI token pricing: demand for tokens looks unbounded, but the logic for pricing them remains constrained.
E-Waste Memory Outvalues Gold in 100 Days
In 2026, rural China is seeing a rush of e-waste collectors targeting scrapped machines. In a 100-day market frenzy, salvaged memory chips have come to be valued more highly than gold, underscoring ongoing hardware scarcity.
Didi AV Speeds Global Push with UAE Pilot This Year
Didi Autonomous Driving is accelerating its global expansion with a pilot launch in the UAE this year, emphasizing responsible innovation via local partnerships. The aim is to deploy Chinese AV tech and services worldwide.
Curity's Runtime Auth for AI Agents
Curity launches Access Intelligence, extending its IAM platform for securing autonomous AI agents via runtime authorization. Uses OAuth tokens with purpose/intent data for ephemeral access. Addresses gaps in traditional IAM for non-deterministic agent actions.
DGX Spark Setup for vLLM Local Inference
A user unboxes NVIDIA DGX Spark for on-premises LLM inference using vLLM, PyTorch, and Hugging Face models in an education app. They seek advice on optimal models, vLLM tuning for unified memory, and real-world throughput. This marks a shift from cloud GPUs to local setups.
OpenAI Leaks Anthropic Revenue Inflation Mockery
OpenAI sent an internal letter mocking Anthropic's Claude annualized revenue as inflated by 80 billion. The letter, since leaked publicly, accuses Anthropic of padding its income figures, escalating the rivalry between the two AI giants.
llama.cpp Hot Expert Cache Speeds MoE 27%
llama.cpp introduces a dynamic expert cache that loads frequently activated MoE experts into VRAM, boosting Qwen3.5-122B-A10B token generation by 27% over layer offload (22.67 tok/s on an RTX 4090). It also outperforms all-CPU inference by 45% with similar VRAM use. A code repo is provided for testing.
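A rough sketch of the frequency-based caching idea (illustrative only: llama.cpp's actual implementation is in C++ and differs in detail; the class and sets here are stand-ins for VRAM-resident expert weights):

```python
from collections import Counter

class HotExpertCache:
    """Keep the K most frequently activated MoE experts in fast memory
    (a VRAM stand-in); all other experts stay in slow memory (CPU RAM)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.counts = Counter()   # activation counts per expert id
        self.in_vram = set()      # experts currently cached

    def activate(self, expert_id: int) -> bool:
        """Record one activation; return True on a cache hit."""
        self.counts[expert_id] += 1
        hit = expert_id in self.in_vram
        # Re-rank: cache the top-`capacity` experts by activation count.
        self.in_vram = {e for e, _ in self.counts.most_common(self.capacity)}
        return hit

cache = HotExpertCache(capacity=2)
trace = [0, 0, 1, 0, 2, 0, 1]  # expert 0 is "hot" in this routing trace
hits = sum(cache.activate(e) for e in trace)
print(hits)
```

Because MoE routing is typically skewed toward a small set of hot experts, even a small VRAM cache captures most activations, which is the intuition behind the reported speedup over plain layer offload.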
MS Image Model Cuts Price 41% Again
Microsoft slashed the price of its self-developed image model by another 41%. CEO Satya Nadella is reframing AI model competition around gross margins, and Mustafa Suleyman's compute-cost reductions intensify the pressure on OpenAI.
Gartner: AI Mainframe Bubble to Pop
Gartner predicts that 70% of AI-powered mainframe migration projects will fail and that 75% of vendors in the space will disappear. Mainframe users relying on AI for legacy code migration face a high risk of disappointment.