All Updates

Page 575 of 626

February 18, 2026

📄
ArXiv AI 48d ago

Hybrid Abstention Boosts LLM Reliability

This arXiv paper introduces an adaptive abstention system for LLMs that dynamically adjusts safety thresholds using contextual signals like domain and user history. It features a multi-dimensional detection architecture with five parallel detectors in a hierarchical cascade, reducing latency and false positives. Evaluations show strong performance in sensitive domains like medical advice.

#abstention #cascade-detection #safety-guardrails
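
The cascade-plus-adaptive-threshold idea in the summary above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the `Detector` class, the `DOMAIN_OFFSET` values, the history penalty, and the two toy detectors are all assumptions made for the example (the paper describes five detectors, and its actual signals and thresholds are not given here).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Detector:
    name: str
    score: Callable[[str], float]  # returns a risk score in [0, 1]


# Hypothetical per-domain offsets: sensitive domains get a lower (stricter) threshold.
DOMAIN_OFFSET = {"medical": -0.2, "general": 0.0}


def adaptive_threshold(base: float, domain: str, prior_flags: int) -> float:
    """Lower the abstention threshold for sensitive domains and flagged users."""
    offset = DOMAIN_OFFSET.get(domain, 0.0)
    history_penalty = 0.05 * min(prior_flags, 4)  # capped so history cannot dominate
    return max(0.05, min(0.95, base + offset - history_penalty))


def should_abstain(
    query: str,
    detectors: List[Detector],
    domain: str,
    prior_flags: int = 0,
    base: float = 0.7,
) -> Tuple[bool, Optional[str]]:
    """Run detectors in list order (cheapest first) and stop at the first that trips."""
    threshold = adaptive_threshold(base, domain, prior_flags)
    for detector in detectors:
        if detector.score(query) >= threshold:
            return True, detector.name  # early exit skips the costlier detectors
    return False, None


# Toy detectors standing in for real classifiers (assumed for illustration).
detectors = [
    Detector("keyword", lambda q: 0.6 if "dosage" in q.lower() else 0.0),
    Detector("length", lambda q: 0.9 if len(q) > 500 else 0.1),
]

query = "What dosage of ibuprofen is safe?"
print(should_abstain(query, detectors, "general"))  # no detector reaches 0.7
print(should_abstain(query, detectors, "medical"))  # keyword detector trips at the lower threshold
```

The early return is what would reduce latency in a cascade like this: once a cheap detector fires, the more expensive ones never run. Raising or lowering the threshold by context is one plausible way to trade false positives against missed cases.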
📄
ArXiv AI 48d ago

EduEVAL-DB Dataset for AI Tutor Evaluation

EduEVAL-DB introduces a dataset of 854 explanations for 139 ScienceQA questions across K-12 subjects, with one human-teacher and six LLM-simulated teacher explanations. It features a pedagogical risk rubric covering factual correctness, depth, focus, appropriateness, and bias, annotated via semi-automatic expert review. Preliminary benchmarks compare Gemini 2.5 Pro against fine-tuned Llama 3.1 8B for risk detection on consumer hardware.

#dataset #education-ai #pedagogical-risk
📄
ArXiv AI 48d ago

EAA Automates Microscopy with VLM Agents

Experiment Automation Agents (EAA) is a vision-language-model-driven system that automates complex microscopy workflows in materials characterization. It combines multimodal reasoning, tool actions, and long-term memory to run experiments autonomously or under user guidance. Demonstrated at the Advanced Photon Source, it handles focusing, feature search, and data acquisition to boost efficiency.

#agentic-systems #scientific-ai
📄
ArXiv AI 48d ago

Common Belief Defies KD4: New Axioms

Contrary to common belief, common belief is not simply KD4 when individual beliefs are KD45: it keeps the D and 4 properties and additionally satisfies shift-reflexivity, C(Cφ → φ). The paper proves that KD4 extended with this axiom is still incomplete and that one further axiom, which depends on the number of agents, is required. This fully characterizes common belief, settling a long-open problem.

#epistemic-logic #kd45 #multi-agent
📄
ArXiv AI 48d ago

AI Predicts Invoice Dilution with Leakage-Free XGBoost & KAN

This arXiv paper proposes an AI/ML framework to predict invoice dilution in supply chain finance, mitigating non-credit risks and margin losses. It employs leakage-free two-stage XGBoost, Kolmogorov-Arnold Networks (KAN), and ensemble models trained on production data across nine transaction fields. The method supports real-time dynamic credit limits, reducing reliance on the buyer's irrevocable payment undertaking (IPU).

#supply-chain-finance #data-leakage #fintech-risk
📄
ArXiv AI 48d ago

AgriWorld: LLM Agents for Verifiable Agri Reasoning

Researchers introduce AgriWorld, a Python execution environment with unified tools for geospatial queries, remote-sensing analytics, crop simulations, and agri predictors. The Agro-Reflective LLM agent uses an execute-observe-refine loop for multi-turn reasoning over agricultural data. Evaluated on the new AgroBench benchmark, it outperforms text-only and direct tool-use baselines.

#llm-agents #code-execution #agriculture-ai
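
An execute-observe-refine loop of the kind mentioned above might look roughly like this. It is a hedged sketch only: `run_snippet`, `execute_observe_refine`, and `toy_propose` are hypothetical names, the proposer is a hard-coded stub standing in for an LLM, and plain `exec` is not a real sandbox. None of it reflects AgriWorld's actual API.

```python
from typing import Callable, Optional, Tuple


def run_snippet(code: str) -> Tuple[bool, str]:
    """Execute code in a fresh namespace and return (ok, observation)."""
    namespace: dict = {}
    try:
        exec(code, namespace)  # NOTE: illustration only, not a safe sandbox
        return True, repr(namespace.get("result"))
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"


def execute_observe_refine(
    propose: Callable[[str, Optional[str]], str],
    task: str,
    max_turns: int = 3,
) -> Tuple[Optional[str], int]:
    """Ask for code, run it, and feed any error back until it succeeds or turns run out."""
    feedback: Optional[str] = None
    for turn in range(1, max_turns + 1):
        code = propose(task, feedback)          # execute step input
        ok, observation = run_snippet(code)     # observe the outcome
        if ok:
            return observation, turn
        feedback = observation                  # refine using the error message
    return None, max_turns


def toy_propose(task: str, feedback: Optional[str]) -> str:
    """Stub for an LLM: the first attempt has a bug, the second one fixes it."""
    if feedback is None:
        return "result = sum(yields) / len(yields)"  # fails: 'yields' is undefined
    return "yields = [3.1, 2.9, 3.4]\nresult = round(sum(yields) / len(yields), 2)"


print(execute_observe_refine(toy_propose, "mean crop yield"))  # ('3.13', 2)
```

The point of the loop is that the answer comes from executed code rather than from free-text generation, which is what makes the reasoning checkable.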
🦞
OpenClaw.report 48d ago

Steinberger's OpenClaw Vision as Constitution

Peter Steinberger left a VISION.md document before joining OpenAI, framing OpenClaw's future less as a roadmap and more as a constitution. The article delivers a line-by-line examination of its contents.

#vision-doc #ai-agents #manifesto
🦞
OpenClaw.report 48d ago

OpenClaw v2026.2.17: 1M Context + Sonnet 4.6

The OpenClaw v2026.2.17 release enables Anthropic's 1M-token context window for Opus and Sonnet. It adds Sonnet 4.6 support alongside extensive updates to the iOS, Slack, Telegram, Discord, and cron systems.

#context-window #model-integration #multi-platform
🇬🇧
The Register - AI/ML 48d ago

Palo Alto CEO: AI Lags in Enterprise

Palo Alto Networks CEO Nikesh Arora reports minimal enterprise AI adoption, limited mainly to coding assistants. Business use trails consumer adoption by at least two years. The company acquired Koi to gear up for future AI developments.

#enterprise-adoption #coding-assistants #ceo-commentary
🧧
Qwen (GitHub Releases: qwen-code) 48d ago

Qwen-Code v0.10.4: Fixes & Region Support

Qwen-Code released v0.10.4 with a news banner announcing the Qwen3.5-Plus launch, fixes for sandbox user permissions in integration tests, and new support for Coding Plan Global/Intl regions. It also bumps the version from 0.10.3; the full changelog is available.

#sandbox-fix #region-support #integration-tests
🏠
IT之家 48d ago

Spain Probes X, Meta, TikTok on AI CSAM

Spain's government is demanding an investigation into X, Meta, and TikTok for allegedly using AI to create and spread child sexual abuse material. PM Sanchez accuses the platforms of harming children's rights and vows to end their impunity. This follows plans to ban under-16s from social media.

#regulation #csam #generative-ai
🔥
36氪 48d ago

YouTube Recovers from Recommendation Outage

YouTube resolved a brief global outage caused by a recommendation system failure that prevented videos from appearing. The issue affected all platforms including YouTube.com, apps, Music, Kids, and TV. Peak reports hit over 320,000 in the US per Downdetector, with impacts in multiple countries.

#service-outage #global-disruption
🗾
ITmedia AI+ (Japan) 48d ago

GitHub Unveils Copilot CLI Command Cheat Sheet

GitHub has compiled and explained the slash commands for GitHub Copilot CLI in an official blog post. Developers can execute quick, repeatable actions in the terminal without switching to an editor or the web UI.

#slash-commands #terminal-ai #dev-cheatsheet
🗾
ITmedia AI+ (Japan) 48d ago

Gartner: Under 20 Firms to Run Humanoids in Production by 2028

Gartner predicts fewer than 20 companies will deploy humanoid robots in full production by 2028. The forecast focuses on manufacturing and supply chain sectors amid physical AI hype.

#physical-ai #humanoid-forecast #adoption-hype
🐯
虎嗅 48d ago

AI Adopted by Billions in One Chinese Spring Festival

Chinese AI apps Qianwen, Doubao, and Yuanbao exploded during the 2026 Spring Festival via red envelope campaigns, logging billions of interactions and onboarding over 130 million new users, including elderly and lower-tier city residents. The article describes this adoption speed as unprecedented, faster than smartphones (5 years) or mobile payments (3 years). The event marked AI's shift from niche to mainstream through habit-forming incentives.

#red-envelopes #user-adoption #china-market
🤖
Reddit r/MachineLearning 48d ago

Snapdragon Chipsets Show 71-93% INT8 Accuracy Range

The same INT8 ONNX model tested on five Snapdragon chipsets yields accuracy ranging from 93% (8 Gen 3) down to 71% (4 Gen 2), versus 94% in the cloud. The cited causes are NPU INT8 rounding differences, operator fusion variations, and CPU fallbacks on low-end chips. The result highlights the need for hardware-specific on-device testing.

#quantization #npu #on-device
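
To illustrate how rounding differences alone can change INT8 results, here is a small sketch that quantizes the same values under three rounding rules. The rules, the `quantize` helper, and the sample values are assumptions chosen for illustration; the post does not say which rounding behaviour each chipset's NPU actually uses.

```python
import math


def quantize(x: float, scale: float, mode: str) -> int:
    """Map a float to a signed 8-bit integer using the given rounding rule."""
    q = x / scale
    if mode == "half_even":
        r = round(q)  # Python's built-in round uses round-half-to-even
    elif mode == "half_away":
        r = int(math.copysign(math.floor(abs(q) + 0.5), q))  # ties move away from zero
    elif mode == "truncate":
        r = math.trunc(q)  # drop the fractional part
    else:
        raise ValueError(f"unknown rounding mode: {mode}")
    return max(-128, min(127, r))  # clamp to the INT8 range


scale = 0.5
for x in (1.25, 0.75, -1.25):
    codes = {mode: quantize(x, scale, mode) for mode in ("half_even", "half_away", "truncate")}
    print(x, codes)
```

With a scale of 0.5, the input 1.25 lands exactly on a tie (2.5) and becomes 2 under half-to-even but 3 under half-away-from-zero. A one-step difference per value is small, but across many layers such discrepancies can accumulate, which is one plausible reading of the accuracy gap described in the post.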
⚛️
量子位 48d ago

Galaxy Star Brain Enables Real Robot Deployment

Galaxy Universal transitions robots from stage performances to practical on-the-job use via its end-to-end large model, Galaxy Star Brain. A capable working robot debuted at this year's Spring Festival Gala. The piece highlights the model's strength in real-world applications.

#robotics #end-to-end-model #embodied-ai
🔥
36氪 48d ago

Tesla Avoids CA Sales Ban on FSD Marketing

The California DMV confirms that Tesla has complied with marketing rules for Autopilot and Full Self-Driving, avoiding a 30-day sales ban. This follows a judge's December ruling on exaggerated claims, which gave Tesla 90 days to make corrections. The company implemented the required corrective measures.

#autonomous-driving #california-dmv
🔥
36氪 48d ago

California Probes xAI Grok Explicit Images

California AG Rob Bonta is launching an AI accountability program while investigating xAI's Grok for generating explicit pornographic images without consent, including potentially underage content. The office issued a cease-and-desist order last month amid global scrutiny. xAI deflects blame and still allows some sexualized content for paid users.

#regulation #content-safety #investigation
🧧
Qwen (GitHub Releases: qwen-code) 48d ago

Qwen SDK TypeScript v0.1.5-preview.2 Released

Qwen released SDK TypeScript v0.1.5-preview.2, bundling CLI v0.10.2 with fixes for authentication, logging, and extension issues. New features include experimental skills settings, a redesigned CLI UI, and removal of the tiktoken dependency. The release also includes docs updates and compatibility improvements.

#cli-redesign #skills-support #token-counts