All Updates
February 12, 2026
Universal Multimodal Immune System Model
EVA is a cross-species, multimodal foundation model harmonizing transcriptomics and histology for immunology. It exhibits scaling laws and achieves SOTA on 39 tasks spanning discovery to clinical trials. An open version is released for transcriptomics research.
Unified Theory for Sketching Influence Functions
Develops theory for random projections in computing influence functions, covering unregularized, regularized, and factorized cases. Shows exact preservation conditions and handles out-of-range gradients via leakage term. Guides sketch size selection for scalable computation.
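The computation such sketching theory targets can be illustrated on a toy regularized quadratic model: influence scores require Hessian-inverse products with per-example gradients, and a random projection reduces this to a small subspace. The dimensions, Gaussian sketch, and Hessian construction below are illustrative assumptions, not the paper's exact construction:

```python
import numpy as np

rng = np.random.default_rng(0)

d, n, k = 200, 50, 32           # param dim, train points, sketch size
lam = 1e-1                      # ridge regularization strength

# Toy quadratic model: SPD Hessian and per-example gradients (illustrative).
A = rng.normal(size=(d, d)) / np.sqrt(d)
H = A @ A.T + lam * np.eye(d)
G = rng.normal(size=(n, d))     # per-train-example gradients
g_test = rng.normal(size=d)     # test-loss gradient

# Exact influence scores: -g_test^T H^{-1} g_i for each training point i.
exact = -G @ np.linalg.solve(H, g_test)

# Sketched influence: project gradients with a Gaussian sketch S (k x d),
# then solve against the k x k sketched Hessian S H S^T instead of H.
S = rng.normal(size=(k, d)) / np.sqrt(k)
Hs = S @ H @ S.T
sketched = -(G @ S.T) @ np.linalg.solve(Hs, S @ g_test)

# How well the sketch preserves the exact influence ranking.
corr = np.corrcoef(exact, sketched)[0, 1]
print(f"exact-vs-sketched correlation: {corr:.3f}")
```

Larger sketch sizes `k` trade compute for fidelity, which is exactly the selection problem the theory is meant to guide.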
TwiFF Enables Dynamic Visual CoT
TwiFF-2.7M dataset and model advance VCoT for videos via future frame generation. TwiFF-Bench evaluates reasoning trajectories. Outperforms baselines on dynamic VQA.
Transformers Collapse to Low-Dim Manifolds
Transformer training on modular arithmetic tasks collapses high-dimensional parameters to 3-4D execution manifolds. This structure explains attention concentration, SGD integrability, and sparse autoencoder limits. Core computation occurs in reduced subspaces amid overparameterization.
Transformer for Experimental NMR Structure Elucidation
NMRTrans uses set transformers on experimental NMR spectra for molecular structure elucidation, trained on NMRSpec corpus from literature. It models spectra as unordered peak sets aligning with NMR physics. Achieves SOTA Top-10 accuracy of 61.15% on benchmarks.
Topology Meets NNs Under Uncertainty
Integrates neural networks, topological data analysis, and Bayesian methods for AI in military domains. Covers image, time-series, graph applications like fraud detection. Emphasizes robustness and interpretability.
Tokens Enable Emergent Resource Rationality
Inference-time scaling in language models leads to adaptive resource rationality without explicit cost rewards. Models shift from brute-force to analytic strategies as task complexity rises. Unlike IT models, LRMs remain robust on challenging functions such as XOR/XNOR.
TokaMark Launches Fusion Plasma Benchmark
TokaMark standardizes AI evaluation on MAST tokamak data with unified multi-modal access and 14 tasks. Harmonizes formats, metadata, and protocols for reproducible comparisons. Includes baseline model; fully open-sourced for community use.
Text Boosts Multimodal Anomaly Detection
Text-guided framework enhances weakly supervised multimodal video anomaly detection. Employs in-context learning for anomaly text augmentation and multi-scale bottleneck Transformer for fusion. Achieves state-of-the-art on UCF-Crime and XD-Violence benchmarks.
δ_TCB Measures LLM Prediction Stability
Introduces δ_TCB metric to quantify LLM internal state robustness against perturbations, beyond traditional accuracy. Linked to output embedding geometry, it reveals prediction instabilities missed by perplexity. Correlates with prompt engineering in in-context learning.
Synthetic Underspecification for Agents
LHAW generates controllable underspecified long-horizon tasks by removing info across goals, constraints, inputs, context. Validates via agent trials, classifying ambiguity impacts. Releases 285 variants from benchmarks.
SynergyKGC Handles KG Heterogeneity
SynergyKGC fuses entity semantics with heterogeneous topologies via cross-modal synergy. Uses density-dependent anchoring and double-tower consistency. Improves KGC hit rates on benchmarks.
Step 3.5 Flash: Efficient Frontier AI
Step 3.5 Flash is a 196B MoE model with 11B active params for agentic tasks. Optimized with sliding-window attention and MTP-3 for low-latency inference. Matches frontier models on math, code, and agent benchmarks.
Stats Test Spots LLM Degradations
McNemar's test framework detects post-optimization LLM degradations via per-sample comparisons. Aggregates across benchmarks with controlled false positives. Confidently flags accuracy drops as small as 0.3%.
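The per-sample comparison behind such a framework uses only the discordant pairs between the two model versions. A minimal sketch with the standard exact (binomial) form of McNemar's test; the before/after result vectors are hypothetical:

```python
from math import comb

def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from discordant pair counts.

    b: samples correct before optimization but wrong after (degradations)
    c: samples wrong before optimization but correct after (improvements)
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Exact binomial tail under H0: discordant flips are 50/50.
    p = 2.0 * sum(comb(n, i) for i in range(k + 1)) / 2.0**n
    return min(p, 1.0)

# Hypothetical per-sample correctness before/after an optimization step:
before = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
after  = [1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0]

b = sum(x == 1 and y == 0 for x, y in zip(before, after))  # degradations
c = sum(x == 0 and y == 1 for x, y in zip(before, after))  # improvements
print(b, c, round(mcnemar_exact(b, c), 4))  # → 8 0 0.0078
```

Because the test conditions on samples that actually flipped, it can reach significance on small aggregate accuracy changes that a naive difference-of-means comparison would miss.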
Silence Boosts Collective Taste Judgment
Introduces Silence Routing framework for collective intelligence in taste domains using music preferences. Specifies when contributors should speak, report, or stay silent. Simulation shows accuracy gains over baselines only when silence is allowed.
SigLIP Boosts Multi-Label ECG Classification
Adapts SigLIP contrastive learning with a Jaccard-based sigmoid loss for multi-label ECG classification using real-world data. Incorporates medical knowledge and techniques like higher embedding dimensions and random cropping. Per-label analysis identifies prediction challenges across ECG findings.
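One way to realize a Jaccard-based sigmoid objective is a soft Jaccard (Tanimoto) loss over sigmoid probabilities, which directly optimizes label-set overlap rather than per-label accuracy. The NumPy sketch below is a plausible formulation under that assumption, not the paper's exact loss; the ECG labels are illustrative:

```python
import numpy as np

def soft_jaccard_loss(logits: np.ndarray, targets: np.ndarray,
                      eps: float = 1e-7) -> float:
    """Soft Jaccard loss over sigmoid probabilities for multi-label targets.

    logits:  (batch, n_labels) raw scores
    targets: (batch, n_labels) binary label indicators
    """
    p = 1.0 / (1.0 + np.exp(-logits))           # sigmoid probabilities
    inter = (p * targets).sum(axis=1)           # soft intersection per sample
    union = p.sum(axis=1) + targets.sum(axis=1) - inter
    jaccard = (inter + eps) / (union + eps)
    return float((1.0 - jaccard).mean())

# Toy batch: two ECG samples, four findings each (hypothetical labels).
y = np.array([[1, 0, 1, 0],
              [0, 1, 0, 0]], dtype=float)
good = np.where(y == 1, 8.0, -8.0)              # confident, correct logits
bad = -good                                     # confident, wrong logits
print(soft_jaccard_loss(good, y), soft_jaccard_loss(bad, y))
```

Confidently correct predictions drive the loss toward 0 while confidently wrong ones drive it toward 1, so the objective rewards whole-label-set agreement in the spirit of a Jaccard index.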
Semantic Labels Enhance TPRA Retrieval
Explores semantic labeling for TPRA questionnaires using LLMs and hybrid SSSL. Compares direct labeling vs. clustering and propagation. Improves retrieval when labels are discriminative.
Self-Supervised SR Quality Assessor
Proposes no-reference IQA for real-world super-resolved images using content-free SSL. Pretrains multi-SR model representations via contrastive learning. Includes new SRMORSS dataset for pretext training.
ScratchWorld Tests GUI Agents
Introduces ScratchWorld benchmark with 83 tasks for multimodal GUI agents in Scratch. Uses primitive/composite modes and execution-based evaluation. Exposes reasoning-acting gaps in state-of-the-art agents.
Safety Alignment for Omni-Modal LLMs
OmniSteer addresses cross-modality vulnerabilities in OLLMs using AdvBench-Omni dataset and modality-semantics decoupling. Uncovers mid-layer dissolution and extracts golden refusal vector via SVD. Boosts refusal rate to 91.2% while preserving capabilities.
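A common way to extract a steering direction via SVD is to take the top singular vector of stacked activation differences between harmful and harmless prompts, then add that direction to hidden states at inference. The toy setup below (synthetic activations with a planted direction) illustrates this general technique, not OmniSteer's actual procedure:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 64, 40                                   # hidden dim, prompt pairs (toy)

# Synthetic mid-layer activations: harmful prompts share a common offset
# along a planted "refusal" direction, plus noise.
true_dir = rng.normal(size=d)
true_dir /= np.linalg.norm(true_dir)
harmless = rng.normal(size=(n, d))
harmful = harmless + 3.0 * true_dir + 0.1 * rng.normal(size=(n, d))

# SVD of the stacked activation differences: the top right singular vector
# captures the dominant shared direction, taken as the refusal vector.
diffs = harmful - harmless
_, _, vt = np.linalg.svd(diffs, full_matrices=False)
refusal = vt[0]

# Steering at inference: push a hidden state along the refusal direction
# (sign-corrected, since singular vectors are defined up to sign).
h = rng.normal(size=d)
h_steered = h + 3.0 * np.sign(refusal @ true_dir) * refusal

cos = abs(refusal @ true_dir)
print(f"cosine(extracted, planted direction) = {cos:.3f}")
```

With a strong shared offset, the extracted vector aligns almost perfectly with the planted direction; in a real model the quality of the vector depends on which layer's activations are used, which connects to the mid-layer dissolution finding above.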