All Updates
April 17, 2026
Apple Watch Marketing Chief Retires
Apple's marketing executive overseeing Apple Watch, AirPods, health, and smart home products is retiring. The departure signals a leadership transition for these vital product categories.
Weight Patching for LLM Interpretability
Weight Patching introduces a parameter-space method to localize LLM behaviors using paired base and specialized models. Applied to instruction following, it reveals a hierarchy from source carriers to execution circuits via a vector-anchor interface. It also enhances mechanism-aware model merging for better fusion.
WebXSkill Boosts Web Agent Skills
WebXSkill bridges the grounding gap for LLM-powered web agents with executable skills that pair action programs with natural language guidance. It uses three stages: skill extraction from synthetic trajectories, URL-graph organization, and dual deployment modes. It achieves success rate gains of up to 9.8% on WebArena and 12.9% on WebVoyager; the code is open-sourced.
Uncertainty Quantification for LRMs
A new method uses conformal prediction to quantify uncertainty in the reasoning traces and answers of large reasoning models (LRMs), with statistical guarantees. It introduces a Shapley value framework that explains where the uncertainty originates by identifying key training examples and reasoning steps. Theoretical analyses and experiments confirm its effectiveness.
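For readers unfamiliar with conformal prediction, here is a minimal sketch of the standard split-conformal recipe in Python. It is not the method from the paper summarized above: the function names, the calibration scores, and the candidate answers are all hypothetical, and the nonconformity score is assumed to be something like one minus the model's confidence.

```python
import math

def conformal_threshold(cal_scores, alpha):
    """Return the split-conformal threshold for miscoverage level alpha."""
    n = len(cal_scores)
    # Finite-sample-corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: keep every candidate.
        return math.inf
    return sorted(cal_scores)[rank - 1]

def prediction_set(candidate_scores, threshold):
    """Keep every candidate whose nonconformity score is within the threshold."""
    return {label for label, score in candidate_scores.items() if score <= threshold}

# Hypothetical calibration scores, e.g. 1 - confidence in the known true answer.
cal_scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
threshold = conformal_threshold(cal_scores, alpha=0.25)

# Hypothetical scores for three candidate answers to a new question.
answers = prediction_set({"A": 0.08, "B": 0.45, "C": 0.90}, threshold)
print(threshold, sorted(answers))
```

Under the usual exchangeability assumption, a set built this way contains the true answer with probability at least 1 - alpha; a larger set signals higher uncertainty.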
SciFi: Safe Autonomous AI for Science
SciFi introduces a safe, lightweight agentic AI framework for autonomous scientific tasks. It features an isolated execution environment, three-layer agent loop, and self-assessing do-until mechanism for reliable operation. This enables end-to-end automation of structured tasks using various LLMs with minimal human input.
RiskWebWorld: Realistic GUI Benchmark for E-commerce Risks
RiskWebWorld is the first realistic interactive benchmark for GUI agents in e-commerce risk management, featuring 1,513 tasks from production pipelines across 8 domains. It includes challenges like uncooperative websites and partial hijackments, with Gymnasium-compliant infrastructure for scalable evaluation and RL. Evaluations show top models at 49.1% success, highlighting scale's importance over zero-shot grounding.
ReSS: Symbolic Scaffolds for Tabular Reasoning
ReSS framework extracts decision paths from decision trees as symbolic scaffolds to guide LLMs in generating faithful reasoning for tabular data. It creates a high-quality dataset for fine-tuning LLMs, augmented for better generalization. Achieves up to 10% gains on medical/financial benchmarks with new faithfulness metrics.
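The idea of turning a decision path into a symbolic scaffold can be sketched in a few lines of Python. This is an illustration only, not the ReSS implementation: the tree structure, feature names, thresholds, and the text format of the scaffold are all hypothetical.

```python
def decision_path(node, row, steps=None):
    """Walk a dict-based decision tree and record each test that the row passes."""
    steps = [] if steps is None else steps
    if "label" in node:
        # Leaf reached: return the recorded conditions and the predicted label.
        return steps, node["label"]
    feature, threshold = node["feature"], node["threshold"]
    if row[feature] <= threshold:
        steps.append(f"{feature} <= {threshold}")
        return decision_path(node["left"], row, steps)
    steps.append(f"{feature} > {threshold}")
    return decision_path(node["right"], row, steps)

# Hypothetical hand-written tree standing in for a trained one.
tree = {
    "feature": "glucose", "threshold": 125,
    "left": {"label": "low risk"},
    "right": {
        "feature": "bmi", "threshold": 30,
        "left": {"label": "medium risk"},
        "right": {"label": "high risk"},
    },
}

row = {"glucose": 140, "bmi": 33}
steps, label = decision_path(tree, row)

# The scaffold is a plain-text chain of conditions that could be placed in a prompt.
scaffold = " AND ".join(steps) + f" => {label}"
print(scaffold)
```

A scaffold of this kind gives the language model an explicit, checkable chain of conditions to verbalize instead of asking it to invent a rationale from the raw table row.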
NuHF Claw: Risk-Aware AI for Nuclear Rooms
NuHF Claw introduces a risk-constrained cognitive agent framework for digital nuclear control rooms, addressing cognitive risks from soft-controls. It couples cognitive state inference with real-time probabilistic safety assessment to regulate autonomous behavior. Simulator tests show it anticipates cognitive degradation, constrains unsafe recommendations, and preserves human authority.
Measurable Errors in LM Agent Explore/Exploit
Researchers design controllable 2D grid environments with DAG-structured tasks to measure exploration and exploitation errors in LM agents in a policy-agnostic way. Frontier models struggle, each with distinct failure modes, but reasoning models perform better, and engineering improves both skills. The code is open-sourced on GitHub.
LLM Chaos from Numerical Instability
Researchers reveal how floating-point rounding errors cause unpredictability in LLMs through chaotic propagation in Transformer layers. They identify an 'avalanche effect' in early layers and three distinct regimes: stable, chaotic, and signal-dominated. Findings are validated across datasets and model architectures.
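The root cause of such rounding differences is that floating-point addition is not associative, so the order in which terms are summed changes the result. The short Python sketch below shows this with three constants; it is a generic illustration of the effect and does not reproduce the paper's experiments.

```python
# The same three numbers, summed in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# The two results differ in the last bit of the double-precision mantissa.
gap = abs(left_to_right - right_to_left)

print(left_to_right)   # 0.6000000000000001
print(right_to_left)   # 0.6
print(left_to_right == right_to_left)  # False
```

A discrepancy of this size is harmless on its own, but if later computation amplifies small differences, as the summary describes for Transformer layers, two runs that merely sum in a different order can diverge.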
LAMO: Scalable Lightweight GUI Agents
LAMO framework empowers lightweight MLLMs for GUI automation via multi-role orchestration and task scalability. It features role-oriented data synthesis and two-stage training: Perplexity-Weighted Cross-Entropy SFT for knowledge distillation, plus RL for cooperative exploration. LAMO-3B supports monolithic and MAS execution, excelling as a plug-and-play executor with advanced planners.
CONCORD: Privacy-Safe Always-Listening AI
CONCORD is a privacy-aware A2A framework for proactive speech-based AI assistants that captures only owner speech via real-time verification, producing one-sided transcripts. It recovers missing context through spatio-temporal resolution, gap detection, and minimal relationship-aware A2A queries, avoiding hallucinations. Evaluations show 91.4% gap detection recall, 96% relationship classification, and 97% privacy TNR.
AI Detects Customer Harassment in Talks
Plus Alpha Consulting launches AI Kasuhara Guard to detect customer harassment (kasuhara) in face-to-face service. It transcribes conversations via speech recognition, visualizes interactions, and secures evidence trails. This aids customer-facing businesses in Japan.
Active Constraint Learning for Satellite Scheduling
Researchers introduce Conservative Constraint Acquisition (CCA) for optimizing Earth Observation satellite schedules under unknown operational constraints. Integrated into the Learn&Optimize framework, it interactively learns feasibility from a binary oracle while avoiding over-tightening. It outperforms baselines on synthetic instances up to 50 tasks, using fewer queries and less time.
JFTC Warns OS Providers on AI Exclusion
Japan's Fair Trade Commission released a generative AI market survey report on April 16, warning that smartphone OS providers restricting third-party AI app access may violate antitrust laws. It expressed concerns over US and Chinese giants potentially hindering fair competition for domestic firms in AI autonomous driving.
DeepSeek Cheaper as Cloud Prices Surge
AI inference costs dropped over 80% in 18 months, yet China's top three cloud providers announced price hikes in the same week. This signals a 2-3 year structural pricing battle. The article questions when this trend will end.
Japan MOJ Forms Panel on AI Video and Voice Infringements
Japan's Ministry of Justice announced on April 17 a study group to address escalating unauthorized AI-generated videos and audio mimicking celebrities' faces and voices. It will clarify civil liabilities, infringement criteria, and damage claims based on current laws and precedents.
Manycore Tech HK Debut Hits $4.1B Valuation
Manycore Tech has debuted on the Hong Kong stock exchange, achieving a valuation surpassing HK$32B (about $4.1B USD). The company is backed by prominent investors Shunwei and IDG. This launch highlights spatial intelligence as AI's emerging frontier.
QuantPai Brands as AI Species Pioneer
QuantPai has upgraded its brand, positioning itself as the first 'Intelligent Species' stock, and features an exclusive interview with its CTO, an MIT robotics PhD. The interview emphasizes enduring value creation as the basis for long-term success in AI and robotics.
Spatial AI Stock Surges 171% on Debut
The first spatial intelligence stock skyrocketed 171% on its opening day. The listing marks a success for the sector championed by AI pioneer Li Feifei, and the company is one of Hangzhou's "six little dragons." The article argues that the era of spatial intelligence has only just begun.