All Updates

Page 842 of 859

February 13, 2026

📄
ArXiv AI ‒ 74d ago

AT-RL Reinforces MLLM Anchors for Reasoning

AT-RL selectively reinforces high-connectivity cross-modal anchor tokens (15% of total) in MLLM RLVR via attention graph clustering. 32B model hits 80.2% on MathVista, beating 72B baseline with 1.2% overhead. Low-connectivity training degrades performance.

#research #mllm #at-rl
📄
ArXiv AI ‒ 74d ago

ARC Learns Dynamic Agent Configurations

ARC introduces a reinforcement learning policy to dynamically configure LLM-based agent systems per query, selecting optimal workflows, tools, and prompts. It outperforms fixed templates on reasoning and tool-augmented QA benchmarks. The approach boosts accuracy by up to 25% while cutting token and runtime costs.

#research #arc #llm-agents
📄
ArXiv AI ‒ 74d ago

AIR Boosts LLM Agent Safety

AIR is the first incident response framework for LLM agents, focusing on detecting, containing, recovering from, and eradicating incidents post-occurrence. It integrates a domain-specific language into the agent's execution loop for autonomous management. Evaluations across agent types show over 90% success rates in all phases.

#research #air #llm-agents
📄
ArXiv AI ‒ 74d ago

AgentLeak: Multi-Agent Privacy Leak Benchmark

AgentLeak introduces the first full-stack benchmark for privacy leakage in multi-agent LLM systems, covering internal channels like inter-agent messages. It spans 1,000 scenarios across healthcare, finance, legal, and corporate domains. Tests on top models show internal channels cause 68.9% total leakage, missed by output audits.

#research #agentleak #multi-agent
🇨🇳
cnBeta (Full RSS) ‒ 74d ago

Gemini 3 Deep Think Dominates Programming

Gemini 3 Deep Think receives a major upgrade, achieving state-of-the-art results across domains, especially programming, where only 7 people worldwide outperform it. A Google VP presents it as a side project.

#update #google-gemini #gemini-3
🇨🇳
cnBeta (Full RSS) ‒ 74d ago

Gemini 3 Deep Think Dominates Coding

Gemini 3 Deep Think upgrade achieves SOTA across domains, especially programming where only 7 people worldwide outperform it. This Google VP side project marks a new era in AI reasoning. It showcases unprecedented inference capabilities.

#update #gemini #gemini-3
🇨🇳
cnBeta (Full RSS) ‒ 74d ago

Ex-Researcher Warns on ChatGPT Ads

Former OpenAI researcher Zoë Hitzig warns that ads in ChatGPT risk Facebook-style user manipulation. She left after ad testing began, citing privacy concerns over the intimate thoughts users share. She is now at Harvard.

#research #openai-chatgpt #advertising
🇨🇳
cnBeta (Full RSS) ‒ 74d ago

Ex-Researcher Warns of ChatGPT Ad Risks

Former OpenAI researcher Zoë Hitzig quit after testing ChatGPT ads, warning of manipulation risks from users' private data. She compares it to Facebook's pitfalls. Now at Harvard, she urges caution on ad systems.

#research #chatgpt #advertising
🇨🇳
cnBeta (Full RSS) ‒ 74d ago

OpenAI Adds Ads to ChatGPT

OpenAI is launching ads on ChatGPT this week amid billions in funding needs. CEO Sam Altman previously opposed ads, calling them a last resort that could erode user trust. This shift aims to bolster revenue for the AI leader.

#update #openai #chatgpt
🇨🇳
cnBeta (Full RSS) ‒ 74d ago

Gemini 3 Deep Think: Sketch to 3D

Google unveiled a major upgrade to Gemini 3 Deep Think, a reasoning model for science, research, and engineering. Google AI Ultra subscribers can access it now in the Gemini App. Early API access is open to select researchers, engineers, and enterprises.

#update #google #gemini-3
🇨🇳
cnBeta (Full RSS) ‒ 74d ago

Gemini 3 Deep Think: Sketch-to-3D Upgrade

Google announced a major upgrade to Gemini 3 Deep Think, a reasoning model for science, research, and engineering. Google AI Ultra subscribers can access it via the Gemini App. Early API access is open to select researchers, engineers, and enterprises.

#update #google-gemini #3-deep-think
🇨🇳
cnBeta (Full RSS) ‒ 74d ago

Anthropic Targets OpenAI in Super Bowl Ads

AI companies with deep user privacy access are rushing to monetize via ads amid lax regulation. Anthropic's Super Bowl ads satirized OpenAI's vulnerabilities without naming them. This highlights reliance on corporate ethics to prevent privacy abuse.

#anthropic #advertising #privacy
🇬🇧
BBC Technology ‒ 74d ago

AI Safety Leader Quits for Poetry

An AI safety leader warns of global peril and resigns to study poetry. This coincides with an OpenAI researcher quitting over ChatGPT ad testing plans.

#resignation #openai #ai-safety
🇬🇧
BBC Technology ‒ 74d ago

AI Safety Chief Quits, Cites Global Peril

A prominent AI safety leader resigned, warning the world is in peril, to study poetry. This follows an OpenAI researcher's exit over plans to test ChatGPT ads. The moves highlight tensions in AI development and commercialization.

#chatgpt #ai-safety #advertising
🇬🇧
The Register - AI/ML ‒ 74d ago

Samsung First to Ship HBM4 Memory

Samsung claims to be the first to ship HBM4 memory, a day after Micron's announcement. HBM4 provides faster, denser RAM for next-gen AI hardware. This aligns with Nvidia's Vera Rubin GPU timeline.

#launch #samsung #hbm4
🕸️
LangChain Blog ‒ 74d ago

Agent Frameworks Essential Despite LLM Gains

Discusses whether agent frameworks remain necessary as LLMs improve. Argues that approaches to building agents evolve, but agents are fundamentally systems built around models. Emphasizes the ongoing role of frameworks like LangChain.

#research #langchain #ai-agents
🕸️
LangChain Blog ‒ 74d ago

Agent Frameworks Remain Vital Amid LLM Advances

Explores whether agent frameworks are still necessary as LLMs improve. Notes that optimal agent-building approaches evolve with model performance. Emphasizes agents as systems built around models, highlighting observability.

#research #langchain #ai-agents
🇭🇰
SCMP Technology ‒ 74d ago

MiniMax Drops Affordable M2 AI Model

MiniMax released an updated M2 large language model for real-world productivity. The low-cost model follows rivals' launches in China's intense AI race. In-house benchmarks claim strong performance.

#update #minimax #m2
💻
ZDNet AI ‒ 74d ago

AI Advances via Compute, Not Smarts

An MIT report shows that frontier models like OpenAI's GPT rely on more computing power rather than smarter algorithms. This scaling approach drives progress but raises costs. The trend raises questions about sustainability.

#research #mit #openai
📱
Ifanr (爱范儿) ‒ 74d ago

Xiaomi Open-Sources Robot VLA Model

Xiaomi has open-sourced its first-generation vision-language-action (VLA) model for robotics. The item appears in a morning tech news roundup alongside OpenAI updates.

#research #xiaomi #vla