All Updates
February 19, 2026
Fractal CEO Discusses AI Fears on IPO
Srikanth Velamakanni, CEO of India's first AI unicorn Fractal Analytics, shared views on AI fears and their impact on the company's recent IPO. Fractal raised $313 million last week. He discussed this in a Bloomberg interview.
ByteDance Music App Eyes NetEase Overtake
ByteDance's music app 汽水音乐 (Qishui Yinyue, "Soda Music") has surged to 1.4B MAU, nearing NetEase Cloud Music's 1.47B, amid ByteDance's AI focus. It draws on traffic from 抖音 (Douyin), the source of 82% of its users, and on bundled copyrights covering 50M songs, plus AI features such as the singer 大头针. NetEase is struggling with community dilution, copyright issues, and slower growth.
NVIDIA Teases Unprecedented Chips at GTC
NVIDIA CEO Jensen Huang revealed in an interview that the company will launch several globally unprecedented new chips at GTC 2026. The keynote is scheduled for March 15 in San Jose, California, focusing on the new era of AI infrastructure competition. Huang noted the challenges, as all technologies have reached their limits.
Global Recession: China Must Set AI-Era Agenda
Amid global economic stagnation, with G7 growth under 1.2%, rising defaults, and AI displacing jobs while delivering limited productivity gains, the piece argues that China should proactively set international agendas. It warns of a prolonged recession, a youth NEET rate surging to 25%, and AI stock bubbles that risk a crisis, and urges a shift from defense to offense in the 'cognitive restart' era.
ServiceNow Flags AI Software Shakeup
ServiceNow COO Amit Zavery warns of software industry consolidation during AI transition. Firms failing to transform for AI adoption risk failure, per Bloomberg TV interview.
X Algo Pushes Conservative Content
A Nature study reveals X's 'For You' feed algorithm systematically prioritizes conservative political content and activists over liberal views and news media. This bias not only alters visible content but shifts users' political leanings toward conservatism over weeks. The findings highlight long-term ideological impacts of recommendation systems.
Microsoft Tests Ask Copilot in Windows 11 Taskbar
Microsoft is testing new AI features in Windows 11, including an 'Ask Copilot' entry in the taskbar and deep integration of Microsoft 365 Copilot into File Explorer. These enhancements aim to boost productivity without altering user habits. Rollout to all users expected in coming weeks.
Musk Predicts AI Binary Coding by 2026
Elon Musk predicts in a recent video that by the end of 2026, AI will directly write binary code, greatly reducing human reliance on programming languages. This could lead to full automation in the programming industry, potentially eliminating traditional programmers.
Verifiable Semantics for Agent Communication
The paper proposes a certification protocol that uses a stimulus-meaning model to verify that agents in a multi-agent system share an understanding of a term, by testing them on observable events. Core-guarded reasoning then limits agents to certified terms, which provably bounds disagreement. Simulations show a 72-96% reduction in disagreement; validation with LLMs achieves a 51% drop.
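The idea of certifying terms against observable events and then restricting reasoning to certified terms can be illustrated with a toy sketch. Everything here is an assumption for illustration, not the paper's protocol: agents are modeled as dictionaries mapping a term to an assent/dissent function over events, and the names `certify_terms`, `core_guarded`, and `max_disagreement` are hypothetical.

```python
from typing import Callable, Dict, Iterable, Set

# Hypothetical model: an agent's meaning for a term is its assent (True) or
# dissent (False) when shown an observable event, here just a number.
Verdict = Callable[[float], bool]


def certify_terms(
    agent_a: Dict[str, Verdict],
    agent_b: Dict[str, Verdict],
    events: Iterable[float],
    max_disagreement: float = 0.0,
) -> Set[str]:
    """Return the terms on which both agents give matching verdicts.

    A term is certified when the fraction of test events on which the two
    agents disagree does not exceed max_disagreement (an assumed threshold).
    """
    samples = list(events)
    if not samples:
        raise ValueError("at least one test event is required")
    certified = set()
    for term in agent_a.keys() & agent_b.keys():
        mismatches = sum(agent_a[term](e) != agent_b[term](e) for e in samples)
        if mismatches / len(samples) <= max_disagreement:
            certified.add(term)
    return certified


def core_guarded(statement_terms: Iterable[str], certified: Set[str]) -> bool:
    """Allow a statement only if every term it uses has been certified."""
    return set(statement_terms) <= certified


# Two agents agree on "hot" but draw the line for "cold" differently.
a = {"hot": lambda t: t > 30, "cold": lambda t: t < 10}
b = {"hot": lambda t: t > 30, "cold": lambda t: t < 15}
core = certify_terms(a, b, range(40))
```

In this toy run only "hot" is certified, so a statement that also uses "cold" would be rejected by `core_guarded`.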
Science of AI Agent Reliability
AI agents excel on benchmarks but fail in practice due to single-metric evaluations ignoring consistency, robustness, predictability, and safety. This arXiv paper proposes 12 concrete metrics across these four dimensions, grounded in safety-critical engineering. Tests on 14 agentic models across two benchmarks reveal only marginal reliability gains despite capability advances.
Proxy State Eval Scales LLM Agent Benchmarks
Proxy State-Based Evaluation introduces an LLM-driven simulation framework for benchmarking multi-turn tool-calling agents, avoiding costly deterministic backends. It uses scenarios to define goals and states, with LLM trackers inferring proxy states from traces for verification. The method yields stable rankings, high judge agreement over 90%, and transferable training data.
PAHF: Personalized Agents from Human Feedback
PAHF is a framework for continual personalization of AI agents, learning online from live human interactions via explicit per-user memory. It uses a three-step loop: pre-action clarification, preference-grounded actions, and post-action feedback for memory updates. Evaluated on new benchmarks for manipulation and shopping, it outperforms baselines in initial learning and adaptation to preference shifts.
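The three-step loop can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the PAHF implementation: memory is a plain per-user dictionary, and `UserMemory`, `run_step`, `ask`, and `feedback` are hypothetical names standing in for the agent's memory, its clarification question, and the user's post-action correction.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class UserMemory:
    """Hypothetical explicit per-user memory: preference slot -> value."""
    prefs: Dict[str, str] = field(default_factory=dict)


def run_step(
    memory: UserMemory,
    slot: str,
    default: str,
    ask: Callable[[str], Optional[str]],
    feedback: Callable[[str, str], Optional[str]],
) -> str:
    # 1. Pre-action clarification: ask only when memory has no entry.
    if slot not in memory.prefs:
        answer = ask(slot)
        if answer is not None:
            memory.prefs[slot] = answer
    # 2. Preference-grounded action: act on the stored preference if any.
    action = memory.prefs.get(slot, default)
    # 3. Post-action feedback: a correction overwrites the stored preference.
    correction = feedback(slot, action)
    if correction is not None:
        memory.prefs[slot] = correction
    return action


memory = UserMemory()
# First interaction: the preference is unknown, so the agent asks.
first = run_step(memory, "milk", "whole",
                 ask=lambda s: "oat", feedback=lambda s, a: None)
# The user's preference shifts; feedback after the action updates memory.
second = run_step(memory, "milk", "whole",
                  ask=lambda s: None,
                  feedback=lambda s, a: "soy" if a != "soy" else None)
third = run_step(memory, "milk", "whole",
                 ask=lambda s: None, feedback=lambda s, a: None)
```

The second call still acts on the stale preference and is corrected afterwards; the third call reflects the update, which is the kind of adaptation to preference shifts the summary describes.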
Mirror Tops GPT-5 on Endo Board Exam
January Mirror, an evidence-grounded clinical reasoning system, scored 87.5% on a 120-question 2025 endocrinology board-style exam, outperforming human experts (62.3%) and frontier LLMs like GPT-5.2 (74.6%). It excelled on the hardest questions (76.7% accuracy) under closed-evidence constraints without web access. Outputs featured traceable citations from guidelines with 100% accuracy.
In-Context Inference Enables Multi-Agent Cooperation
Researchers demonstrate that sequence models' in-context learning induces cooperation in multi-agent RL without hardcoded co-player assumptions or timescale separation. Training against diverse co-players leads to best-response strategies on intra-episode timescales. Mutual shaping then emerges naturally from the agents' vulnerability to extortion, providing a scalable path to cooperative behaviors.
GPSBench Tests LLM GPS Reasoning
Researchers launch GPSBench, a 57,800-sample dataset across 17 tasks to probe LLMs' geospatial reasoning without tools. 14 SOTA LLMs show reliability in geographic knowledge but struggle with geometric computations like distance and bearing. Dataset, code, and findings reveal trade-offs in finetuning and augmentation benefits.
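For reference, the two geometric computations named above have standard closed forms on a sphere. The sketch below uses the haversine formula for great-circle distance and the usual forward-azimuth formula for initial bearing; it assumes a spherical Earth with a mean radius of 6371.0088 km, and the function names are illustrative rather than taken from GPSBench.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius (spherical model)


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def initial_bearing_deg(lat1: float, lon1: float,
                        lat2: float, lon2: float) -> float:
    """Initial compass bearing in degrees, in [0, 360), from point 1 to 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb))
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0


# A quarter of the equator: distance is (pi / 2) * R, heading due east.
quarter = haversine_km(0.0, 0.0, 0.0, 90.0)
east = initial_bearing_deg(0.0, 0.0, 0.0, 90.0)
```

Both functions chain several trigonometric steps, which is the kind of multi-step numeric work the summary says models struggle with when they cannot call tools.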
FoT: Dynamic LLM Reasoning Optimizer
FoT introduces a general-purpose framework for dynamic reasoning schemes in LLMs, overcoming static structures in Chain of Thought, Tree of Thoughts, and Graph of Thoughts. It features hyperparameter tuning, prompt optimization, parallel execution, and caching for better performance. The open-source codebase demonstrates faster execution, reduced costs, and improved task scores.
Corecraft RL Env Trains Generalizable Agents
Surge AI launches Corecraft, the first high-fidelity RL environment in EnterpriseGym, simulating enterprise customer support with 2,500+ entities and 23 tools. Training GLM 4.6 via GRPO improves task pass@1 from 25% to 37% on held-out tasks, with gains transferring to BFCL (+4.5%), τ²-Bench Retail (+7.4%), and Toolathlon (+6.8%). Results highlight task-centric design, expert rubrics, and realistic workflows as keys to generalization.
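As background on the training signal, GRPO in its commonly described form scores several sampled rollouts of the same task and normalizes each reward against the group's mean and standard deviation, so no separate value model is needed. The sketch below shows only that normalization step; it is not Surge AI's training code, and the reward values and the `eps` constant are assumptions for illustration.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def group_relative_advantages(rewards: Sequence[float],
                              eps: float = 1e-8) -> List[float]:
    """Normalize each rollout's reward against its own sampling group.

    advantage_i = (r_i - mean(group)) / (std(group) + eps)
    eps is an assumed small constant that avoids division by zero when
    every rollout in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four rollouts of one task, scored 1.0 (pass) or 0.0 (fail) by a rubric.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
# A group where every rollout fails carries no learning signal.
flat = group_relative_advantages([0.0, 0.0, 0.0, 0.0])
```

Passing rollouts receive positive advantages and failing ones negative, while a group with identical rewards yields all zeros.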
CaR Enables Efficient Neural Routing Constraints
Neural solvers excel in simple routing but falter on complex constraints. CaR introduces the first general framework using explicit learning-based feasibility refinement and joint training to generate diverse solutions for lightweight improvement. It outperforms SOTA solvers in feasibility, quality, and efficiency on hard constraints.
CAFE: Causal Multi-Agent AFE Breakthrough
CAFE reformulates automated feature engineering as a causally-guided sequential decision process using causal discovery for soft priors and multi-agent RL for construction. It outperforms baselines by up to 7% on 15 benchmarks and reduces performance drops 4x under covariate shifts. The framework produces compact, stable features with reliable attributions.
Boosting LLM Feedback-Driven In-Context Learning
Proposes a trainable framework for interactive in-context learning using multi-turn feedback from information asymmetry on verifiable tasks. Trained smaller models nearly match performance of 10x larger models and generalize to coding, puzzles, and mazes. Enables self-improvement by internally modeling teacher critiques.