All Updates

Page 602 of 609

February 12, 2026

📄
ArXiv AI • 52d ago

GRU-Mem Optimizes Long-Context LLM Reasoning

GRU-Mem introduces text-controlled gates to MemAgent for efficient long-context reasoning, preventing memory explosion and unnecessary computation. Update and exit gates manage recurrent memory loops via RL rewards. Achieves up to 400% faster inference on reasoning tasks.

#research #gru-mem #v1
📄
ArXiv AI • 52d ago

Generative Framework for Brain Infarct Masks

Introduces an anatomy-preserving method using VAE and latent diffusion to generate multi-class brain segmentation masks from NCCT data. It learns anatomical latents from masks only, generating realistic samples with optional lesion control. Avoids artifacts seen in pixel-space models.

#research #latent-diffusion #v1
📄
ArXiv AI • 52d ago

GenAI Framework for Higher Ed

Surveys reveal divided stakeholder perceptions of GenAI in the IT/EE disciplines at the University of Oulu. The paper proposes a conceptual framework with high-level requirements for responsible integration. The framework targets EU AI Act compliance and addresses privacy and integrity concerns.

#research #genai-framework #v1
📄
ArXiv AI • 52d ago

GameDevBench Evaluates Game Dev Agents

GameDevBench offers 132 multimodal game development tasks derived from tutorials. Agents struggle: the best one solves only 54.5% of tasks, which demand both code and asset handling. Simple image/video feedback improves performance up to 47.7%.

#research #gamedevbench #v1
📄
ArXiv AI • 52d ago

FPT Bayesian Nets via Feedback Edges

Analyzes parameterized complexity of Bayesian Network Structure Learning using superstructure. Proves fixed-parameter tractability with feedback edge set parameterization. Extends to treewidth with additive representations and polytree learning.

#research #bns-l #v1
📄
ArXiv AI • 52d ago

FoSS: GFlowNets for Dynamic Span LMs

FoSS introduces a GFlowNets framework for generating text via dynamic span vocabularies in a DAG-structured state space. It enables flexible segmentation of retrieved text and explores diverse compositional paths. Empirically, it boosts MAUVE scores by 12.5% and excels in knowledge tasks.

#research #foss #v1
📄
ArXiv AI • 52d ago

FormalJudge Ensures Agent Safety

FormalJudge uses neuro-symbolic bidirectional reasoning to translate intents into verifiable specs. It employs Dafny and Z3 for mathematical guarantees over probabilistic judging. Achieves 16.6% gains and detects deception effectively.

#research #formaljudge #v1
📄
ArXiv AI • 52d ago

FlowCache Accelerates Autoregressive Video Gen

FlowCache is a caching framework for autoregressive video models, using chunkwise policies and KV cache compression. Achieves 2.38x speedup on MAGI-1 and 6.7x on SkyReels-V2 with minimal quality loss. Code available on GitHub.

#research #flowcache #v1
📄
ArXiv AI • 52d ago

First Analysis of AI Agent Social Network

Moltbook, the first social network for AI agents, shows viral growth and diversification into promotional and political topics. Analysis of 44k posts reveals topic-dependent toxicity, especially in incentive and governance areas. Highlights risks like anti-humanity rhetoric and bursty automation flooding.

#research #moltbook #v1
📄
ArXiv AI • 52d ago

FIRE: Latent Space Backdoor Mitigation at Runtime

FIRE mitigates backdoors in deployed neural networks by reversing trigger-induced latent space directions. It manipulates features along backdoor paths to neutralize triggers during inference. Outperforms baselines with low overhead on image tasks.

#research #fire #v1
📄
ArXiv AI • 52d ago

FASCL Future-Aligns Asset Retrieval

FASCL employs future-aligned soft contrastive learning using pairwise return correlations as supervision for financial asset retrieval. It outperforms historical similarity baselines on US equities. Includes protocol to evaluate future trajectory alignment.

#research #fascl #v1
📄
ArXiv AI • 52d ago

FAC Synthesizes Diverse LLM Data

Feature Activation Coverage (FAC) measures diversity in LLM feature space using sparse autoencoders. FAC Synthesis generates samples targeting missing features from seed data. Boosts diversity and performance on instruction, toxicity, reward, and steering tasks.

#research #fac-synthesis #v1
📄
ArXiv AI • 52d ago

Evidence Alignment Bottleneck Exposed

Decomposition boosts claim verification only with granular, sub-claim aligned evidence; repeated claim-level evidence degrades performance. Noisy sub-claim labels propagate errors unless using conservative abstention. New dataset features annotated evidence spans.

#research #claim-verification #v1
📄
ArXiv AI • 52d ago

Evaluating Agentic AI Gaps in Drug Discovery

Researchers evaluate agentic systems for drug discovery across 15 task classes, identifying five key capability gaps like lack of protein models and safety trade-offs. A knowledge-probing experiment reveals architectural bottlenecks in current frameworks. They propose design requirements and a capability matrix for next-gen systems.

#research #beyond-smiles #v1
📄
ArXiv AI • 52d ago

ERGO Boosts Monocular 3D Splatting

Introduces ERGO framework for robust 3D Gaussian splatting from single images. Uses excess risk decomposition to adapt loss weights against noisy views. Adds geometry and texture objectives for fidelity.

#research #ergo #v1
📄
ArXiv AI • 52d ago

Equivariant Uncertainty for Interatomic Potentials

Introduces e²IP, an equivariant evidential deep learning framework for ML interatomic potentials in molecular dynamics. Models atomic forces and uncertainties via 3x3 covariance tensors that rotate equivariantly. Outperforms ensembles in accuracy, efficiency, and data efficiency.

#research #e2ip #v1
📄
ArXiv AI • 52d ago

ENIGMA: EEG-to-Image in 15 Mins

ENIGMA decodes images from EEG with less than 1% of the parameters of prior models, achieving SOTA on THINGS-EEG2 and consumer benchmarks. It fine-tunes on new subjects in 15 minutes using a simple spatio-temporal backbone and latent alignment. Includes behavioral human evaluations.

#research #enigma #v1
📄
ArXiv AI • 52d ago

ECHO Platform for AI-Human Studies

ECHO is an open platform for reproducible human-AI interaction research. It supports chat, search sessions, surveys, and tasks in a low-code setup, and exports datasets for HCI and IR analysis.

#research #echo #v1
📄
ArXiv AI • 52d ago

Dynamic Contamination-Free Medical Benchmark

LiveMedBench offers weekly updated real-world clinical cases for LLM evaluation, avoiding contamination via temporal separation. Multi-agent curation ensures integrity, and automated rubric evaluation aligns with experts better than alternatives. In tests, even top LLMs reach only 39.2%, highlighting contextual gaps.

#research #livemedbench #v1
📄
ArXiv AI • 52d ago

Dissecting Moltbook's Non-Human Social Graph

Early Moltbook data from 6k agents shows power-law participation and small-world connectivity like human networks. Micro patterns are alien: shallow threads, low reciprocity, 34% duplicate templates. Dominated by identity language and phrases like 'my human'.

#research #moltbook #v1