All Updates
February 17, 2026
Unity AI Beta Generates Games via Natural Language
Unity plans to launch a Unity AI beta at GDC in March, enabling users to create complete casual games from natural language prompts without coding. The tool integrates leading LLMs and custom models to streamline the path from prototype to production, aiming to empower non-programmers and boost developer efficiency.
SpaceX Enters Pentagon AI Drone Race
SpaceX and its xAI subsidiary are competing in a classified Pentagon program for voice-controlled autonomous drone swarms. Elon Musk's recent company merger thrusts them into AI-enabled weapons development. The move into military AI could provoke significant debate.
Micron's $200B Factory Push Breaks AI Memory Bottleneck
Micron Technology plans a $200 billion investment in new factories to tackle the worst storage-chip shortage in 40 years. The expansion targets AI memory constraints, supplying data storage for smartphones, autos, laptops, and data centers.
Unitree Chunwan Robot Video Goes Viral Overseas
Unitree humanoid robots performed on the stage of the Spring Festival Gala (Chunwan). Unitree's official video drew nearly 100,000 overseas views in under 10 hours, with overseas commenters expressing amazement at the performance.
Spring Fest Box Office Hits 10B, AI Orders Surge
The 2026 Spring Festival box office surpassed 10 billion yuan, including pre-sales, by February 17. Qianwen data shows AI-assisted ticket purchases on Damai jumped 372x in two days, while AI orders from tier-3/4 cities surged 782x.
Tesla Robotaxi at 19% Availability After 8 Months
Tesla's Robotaxi service, launched in Austin eight months ago, now shows just 19% availability, far behind Elon Musk's earlier commitments. Multiple core operational metrics fall short of targets.
AI Climate Claims Branded Greenwashing
A report dismisses tech-industry claims that AI can fix climate problems as greenwashing. Most of the cited examples rely on traditional machine learning rather than energy-intensive generative AI such as chatbots and image tools, whose explosive growth is driving massive data-center energy demand.
X-Blocks: Linguistic Blocks for AV Explanations
X-Blocks introduces a hierarchical framework that analyzes natural language explanations for automated vehicles (AVs) at the context, syntax, and lexicon levels. RACE, a multi-LLM ensemble using Chain-of-Thought prompting and self-consistency, achieves 91.45% accuracy on the Berkeley DeepDrive-X dataset. The analysis uncovers scenario-specific vocabulary patterns and reusable grammar families for explainable AI.
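The self-consistency step in ensembles like RACE is typically a majority vote over several chain-of-thought samples. A minimal sketch of that voting scheme, assuming `sample_fn` wraps one LLM call (the function name and the stubbed answers are illustrative, not from the paper):

```python
from collections import Counter

def self_consistent_answer(sample_fn, n=5):
    """Majority vote over n independent chain-of-thought samples."""
    votes = Counter(sample_fn() for _ in range(n))
    return votes.most_common(1)[0][0]

# Stub standing in for an LLM that answers inconsistently.
answers = iter(["brake", "accelerate", "brake", "brake", "accelerate"])
print(self_consistent_answer(lambda: next(answers)))  # brake
```

The vote discards low-frequency reasoning paths, which is what makes self-consistency more robust than a single sample.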
VeRA: Verified Reasoning Data Augmentation
VeRA converts benchmark problems into executable specifications—templates, generators, and verifiers—to create unlimited verified variants at near-zero cost. VeRA-E generates equivalent problems to detect memorization, while VeRA-H hardens tasks for fresh challenges. Evaluated on 16 frontier models and fully open-sourced.
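An "executable specification" in the VeRA sense pairs a parameterized generator with a matching verifier, so each sampled seed yields a fresh, automatically checkable variant. A toy sketch under that reading (the arithmetic template is invented for illustration; the real system templatizes benchmark problems):

```python
import random

def generate(seed):
    """Template + generator: sample parameters, emit problem and ground truth."""
    rng = random.Random(seed)
    a, b = rng.randint(2, 99), rng.randint(2, 99)
    return f"Compute {a} * {b}.", a * b

def verify(ground_truth, claimed):
    """Verifier: check a model's answer against the generated ground truth."""
    try:
        return int(claimed) == ground_truth
    except ValueError:
        return False

# Unlimited verified variants at near-zero cost:
for seed in range(3):
    problem, truth = generate(seed)
    assert verify(truth, str(truth))
```

Because the generator is seeded, every variant is reproducible, and equivalent (VeRA-E) or hardened (VeRA-H) variants amount to different sampling strategies over the same template.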
VaryBalance: Top LLM Text Detector
VaryBalance detects LLM-generated text by exploiting the fact that human texts differ from their LLM-rewritten versions more than LLM-generated texts do. It quantifies this gap via the mean standard deviation of those differences for a robust distinction. Experiments show it beats state-of-the-art detectors such as Binoculars by up to 34.3% AUROC across models and languages.
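The statistic, as described, reduces to a mean of per-text standard deviations of some distance between each text and its LLM rewrites. A toy sketch of that computation with made-up distance values (the real system derives these from actual LLM rewrites):

```python
import statistics

def mean_std(distance_lists):
    """Mean of per-text standard deviations; higher values
    suggest human authorship under the VaryBalance hypothesis."""
    return statistics.mean(statistics.pstdev(d) for d in distance_lists)

# Toy distances from each text to three LLM rewrites of it:
# human texts drift more under rewriting than LLM texts do.
human_texts = [[0.40, 0.72, 0.55], [0.30, 0.81, 0.50]]
llm_texts = [[0.10, 0.12, 0.11], [0.08, 0.10, 0.09]]

assert mean_std(human_texts) > mean_std(llm_texts)
```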
Trajectory-Dominant Pareto Optimization for Intelligence
AI systems stagnate in long-horizon adaptability due to trajectory-level Pareto traps, not data or capacity limits. The paper introduces Trajectory-Dominant Pareto Optimization, defining dominance over full trajectories, and Pareto traps as local optima blocking global paths. It proposes the Trap Escape Difficulty Index (TEDI) and a taxonomy to diagnose intelligence ceilings.
SSLogic Scales Logic via Agentic Synthesis
SSLogic is an agentic meta-synthesis framework that scales logical reasoning tasks at the family level using iterative Generate-Validate-Repair loops for Generator-Validator pairs. It features a Multi-Gate Validation Protocol with adversarial blind reviews by independent agents to ensure data reliability. Training on evolved data boosts benchmarks like SynLogic by +5.2 points.
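The Generate-Validate-Repair loop described above can be sketched as a small control flow; `generate`, `validate`, and `repair` stand in for the paper's agent calls (all names and the toy agents are hypothetical):

```python
def generate_validate_repair(generate, validate, repair, max_rounds=3):
    """Iterate until the validator accepts the artifact or rounds run out."""
    artifact = generate()
    for _ in range(max_rounds):
        ok, report = validate(artifact)
        if ok:
            return artifact
        artifact = repair(artifact, report)
    return None  # discard data that never passes validation

# Toy agents: a generator that under-produces, a repairer that appends.
result = generate_validate_repair(
    generate=lambda: "p -> q",
    validate=lambda a: (a.endswith("; q"), "missing conclusion"),
    repair=lambda a, rep: a + "; q",
)
print(result)  # p -> q; q
```

Returning `None` for artifacts that never validate mirrors the reliability filter: only data that survives the gates enters the training set.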
SELFCEST: Learned Parallel Model Clones
SELFCEST equips base language models to spawn same-weight clones in parallel contexts via agentic reinforcement learning. It trains end-to-end with global task rewards and shared-parameter rollouts, learning to allocate compute budgets across branches. This improves accuracy-cost Pareto frontiers on math reasoning and long-context QA benchmarks, with out-of-distribution generalization.
PlotChain Benchmark for MLLM Plot Reading
PlotChain introduces a deterministic benchmark for evaluating multimodal LLMs on extracting quantitative values from engineering plots such as Bode and FFT plots. It features 450 plots across 15 families with ground truth and checkpoint diagnostics for failure analysis. Top models score around 80% (Gemini 2.5 Pro leads), but frequency-domain tasks remain weak.
NL2LOGIC: 99% Accurate NL-to-FOL Translation
NL2LOGIC is a new framework that uses abstract syntax trees (ASTs) to translate natural language into first-order logic via large language models. It combines a recursive LLM semantic parser with an AST-guided generator for high syntactic accuracy and semantic faithfulness. Benchmarks show 99% syntactic accuracy, semantic gains of up to 30%, and a 31% reasoning improvement when integrated with Logic-LM.
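An AST-guided generator of the kind described keeps FOL output syntactically well-formed by construction: the parser emits typed nodes and a printer renders them, so malformed formulas cannot arise. A minimal sketch (the node types and rendering are illustrative, not the paper's actual grammar):

```python
from dataclasses import dataclass

@dataclass
class Pred:
    name: str
    args: tuple

@dataclass
class Implies:
    left: object
    right: object

@dataclass
class ForAll:
    var: str
    body: object

def to_fol(node):
    """Render an AST node as a first-order logic formula string."""
    if isinstance(node, Pred):
        return f"{node.name}({', '.join(node.args)})"
    if isinstance(node, Implies):
        return f"({to_fol(node.left)} -> {to_fol(node.right)})"
    if isinstance(node, ForAll):
        return f"forall {node.var}. {to_fol(node.body)}"
    raise TypeError(f"unknown node: {node!r}")

# "Every human is mortal" as an AST the LLM parser might emit:
ast = ForAll("x", Implies(Pred("Human", ("x",)), Pred("Mortal", ("x",))))
print(to_fol(ast))  # forall x. (Human(x) -> Mortal(x))
```

Since the LLM fills slots in typed nodes rather than emitting raw formula text, syntax errors are structurally impossible, which is one plausible route to the reported 99% syntactic accuracy.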
MAPLE: Sub-Agent Design for AI Personalization
MAPLE decomposes LLM agent limitations by separating memory, learning, and personalization into dedicated sub-agents. Memory manages storage/retrieval, Learning extracts insights asynchronously, and Personalization applies them in real-time. It boosts personalization scores by 14.6% and trait incorporation from 45% to 75% on MAPLE-Personas benchmark.
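The three-sub-agent split described above amounts to a pipeline: memory retrieval feeds real-time personalization, while insight extraction runs off the critical path. A structural sketch under that reading (all class and method names are invented for illustration):

```python
class MemoryAgent:
    """Stores and retrieves user facts."""
    def __init__(self):
        self.store = []
    def add(self, fact):
        self.store.append(fact)
    def retrieve(self, query):
        return [f for f in self.store if query in f]

class LearningAgent:
    """Asynchronously distills raw interactions into insights."""
    def extract(self, interactions):
        return [f"prefers {i.split()[-1]}" for i in interactions
                if i.startswith("user chose")]

class PersonalizationAgent:
    """Applies retrieved memories and learned traits at response time."""
    def respond(self, query, memories, insights):
        context = "; ".join(memories + insights)
        return f"[{context}] answer to: {query}"

memory = MemoryAgent()
memory.add("user lives in Berlin")
insights = LearningAgent().extract(["user chose dark-mode"])
reply = PersonalizationAgent().respond(
    "set my theme", memory.retrieve("Berlin"), insights)
print(reply)
```

The design point is the separation of concerns: the slow insight-extraction step never blocks a response, because personalization only reads whatever insights have already been produced.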
Lang2Act Boosts VLM Visual Reasoning with Emergent Tools
Lang2Act enhances Vision-Language Models (VLMs) with self-emergent linguistic toolchains for fine-grained visual perception in VRAG, avoiding rigid external tools and the information loss of image operations. It employs a two-stage RL framework: the first stage builds a reusable action toolbox, the second exploits it for reasoning. It achieves performance gains of over 4%; code is available on GitHub.
Geometric Taxonomy of LLM Hallucinations
Researchers propose a geometric taxonomy classifying LLM hallucinations into three types: unfaithfulness, confabulation, and factual error. Benchmark hallucinations show strong domain-local detection but fail cross-domain, while human-crafted confabulations enable a single global detection direction. Factual errors remain undetectable via embeddings due to distributional encoding limits.
Dual-Cycle Framework for Safe Role-Playing LLMs
A training-free Dual-Cycle Adversarial Self-Evolution framework addresses jailbreak vulnerabilities in LLM role-playing agents. It couples a Persona-Targeted Attacker cycle for stronger jailbreaks with a Role-Playing Defender cycle that distills failures into a hierarchical safety knowledge base. At inference, it retrieves structured knowledge to ensure in-character yet safe responses, outperforming baselines in fidelity and resistance.