All Updates
February 24, 2026
Vibe Coding Verifies CAS Adaptation Automatically
Researchers use generative LLMs and vibe coding feedback loops to generate and verify Adaptation Managers for Complex Adaptive Systems. They introduce FCL, a novel temporal logic offering finer granularity than LTL for constraint-based verification. Experiments on CAS examples show high coverage with just a few iterations.
TEB Boosts Visual RL Exploration
TEB introduces task-aware exploration for visual reinforcement learning under sparse rewards using a predictive bisimulation metric. It learns behaviorally grounded representations and measures intrinsic novelty in latent space. Experiments on MetaWorld and Maze2D show it outperforms baselines.
Spilled Energy Detects LLM Hallucinations
Researchers reinterpret LLM softmax classifiers as Energy-Based Models (EBMs) to track 'energy spills' during decoding, which correlate with factual errors, biases, and hallucinations. They introduce two training-free metrics computed from output logits: spilled energy (a discrepancy across decoding steps) and marginalized energy (measured at a single step). Evaluated on nine benchmarks with LLaMA, Mistral, Gemma, and Qwen3, the method achieves robust detection without probes or training.
Smarsh Archie AI Boosts Self-Service 59%
Smarsh launched Archie, an AI support agent built on Salesforce Agentforce, as a 'front door' that simplifies support across products for regulated industries. It centralizes knowledge for plain-language queries, driving 59% self-service adoption. Reported results include 20% higher success rates, 25% faster resolutions, and a 30% productivity boost.
Physics Forces Symbolic AI Semantics
The paper challenges static semantics in visual AI as incomplete and proposes a dynamic Observation Semantics Fiber Bundle for bounded agents. It proves that thermodynamic limits, via Landauer's Principle, create a Semantic Constant B that requires a phase transition to discrete symbolic structures. Language thus emerges as an ontological necessity to prevent information collapse.
Multimodal Agent Unlocks Deeper Chart Insights
Chart Insight Agent Flow is a new multi-agent framework that uses MLLMs to extract profound insights from chart images, surpassing basic descriptions. It introduces ChartSummInsights, a dataset of real-world charts with expert-written insightful summaries. Experiments show it significantly boosts MLLM performance on chart summarization.
LaDa: Federated Reasoning Distillation Framework
LaDa is a federated reasoning distillation framework addressing bidirectional learnability gaps between LLMs and SLMs via adaptive data allocation. It uses a model learnability-aware filter to select high-reward samples for effective knowledge transfer. Domain-adaptive distillation aligns reasoning paths through contrastive learning, acting as a plug-in for existing frameworks.
HRDL: Language Rewards Align RL Agents
HRDL extends reward design to encode nuanced human preferences for hierarchical RL agents in complex tasks. L2HR translates language specs into hierarchical rewards. Experiments demonstrate superior task completion and adherence to human specifications.
GEARS: Agentic Framework for Ranking Optimization
GEARS reframes large-scale ranking as an agentic discovery process in a programmable environment, addressing engineering bottlenecks over modeling limits. It encapsulates expert knowledge into reusable agent skills for high-level intent steering and includes validation hooks for robust, stable policies. Experiments across product surfaces show superior near-Pareto-efficient outcomes.
AI Analysts Reveal Data Analysis Diversity
Autonomous AI analysts powered by LLMs replicate human 'many-analyst' studies at scale, producing diverse outcomes on the same dataset. They independently build analysis pipelines, showing dispersion in effect sizes, p-values, and hypothesis support. The variability is structured and can be steered through the choice of LLM and persona, as validated by an AI auditor.
FARS Outputs 100 Papers in 228 Hours
Analemma's Fully Automated Research System (FARS) ran for 228 hours straight, generating 244 hypotheses and 100 short papers at a rate of one every 2 hours. It consumed 11.4 billion tokens during this public demo. The month-long live stream continues online.
Qianxun Raises $2.8B, Valuation Tops $14B
Embodied AI firm Qianxun Intelligent secured nearly 20B RMB in two funding rounds from top VCs, industry giants, and state funds. Its valuation exceeds 100B RMB, entering the 'billion club'. Its Spirit v1.5 model, trained with a data pyramid approach, outperforms Pi0.5.
DeepMind Debunks More Agents Always Better
DeepMind's paper evaluates 180 agent configurations across 5 architectures, revealing scaling limits: adding agents hurts performance when the architecture is mismatched to the task. The configurations were tested on Finance-Agent, web navigation, and planning benchmarks with GPT, Gemini, and Claude models. The paper introduces quantitative principles for designing agent systems.
DeepSeek GitHub Updates Trigger Wall Street Fear
DeepSeek is continuously updating its GitHub repositories, reigniting fears on Wall Street and in the US AI community of being overtaken. The activity signals the potential arrival of a second 'DeepSeek moment' as the company resumes work after the holiday.
MIT PhysiOpt Makes 3D Gen Models Manufacturable
MIT's PhysiOpt optimizes 3D generative models directly in latent space using differentiable physics, treating implicit fields as density distributions. This avoids costly remeshing and enables editable, structurally sound designs. The work was accepted to SIGGRAPH Asia 2025.
AI² Robotics Raises $140M Series B at $1.4B Valuation
AI² Robotics raised over USD 140 million in Series B funding. The round pushed its valuation beyond USD 1.4 billion. Investors call it China's most Tesla-like robotics startup.
OpenClaw Mishaps Warn of AI Agent Risks
Recent mishaps with OpenClaw and other AI assistants demonstrate the dangers of granting too much authority to automated agents. The article argues these incidents should terrify anyone who believes AI is ready for real-world responsibilities.
Kimi Smashes Revenue Record, Hits $10B Valuation Fastest
Moonshot AI's Kimi generated more revenue in 20 days than in all of last year. The company reached a USD 10 billion valuation in just over two years, making it China's fastest decacorn.
8B RMB AI Battle: Spring Festival Winners?
Internet giants spent 8 billion RMB in an intense AI competition during Spring Festival. Described as the 'most cyberpunk' holiday, it highlights their heavy bets on a new AI world. The article asks who came out on top.
Robot Rentals Viral, Booked a Month Ahead
Robot performance rentals surged in popularity during Spring Festival. Bookings filled up a month in advance. Daily rental rates climbed to about USD 1,400.