All Updates

Page 1237 of 1437

March 4, 2026

🐯
Huxiu (虎嗅)•118d ago

XPeng's 2026 AI Bet: No Plan B

XPeng faces sales pressure in early 2026 amid an EV market slump, trailing competitors with 15,256 deliveries in February. Chairman He Xiaopeng criticizes L2 ADAS and is pushing AI-driven autonomy, launching a second-generation VLA and the X9 EV. The company plans 7 super range-extender models to balance its AI ambitions with market realities.

#autonomous-driving#ev-market#china-auto
🦙
Reddit r/LocalLLaMA•118d ago

Qwen3.5-35B-A3B Nears Claude Opus on SWE-bench Hard

Qwen3.5-35B-A3B, an MoE model with 3B active parameters, achieved 37.8% on SWE-bench Verified Hard using a 'verify-on-edit' strategy, close to Claude Opus 4.6's 40%. This simple verification step after each edit lifted performance from a 22% baseline. On the full benchmark the model reaches 67%, rivaling larger models.

#moe-model#agent-verification
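The summary does not say what the verification step actually checks, so the following is only a minimal sketch of the general verify-after-each-edit idea. The helper names (`compiles`, `apply_with_verification`) and the choice of a Python syntax check as the verifier are assumptions for illustration, not the method used in the post.

```python
from typing import Callable, Dict, List, Tuple

Files = Dict[str, str]   # path -> file contents (in-memory stand-in for a repo)
Edit = Tuple[str, str]   # (path, proposed new contents)


def compiles(source: str) -> bool:
    """Hypothetical verifier: accept an edit only if it still parses as Python."""
    try:
        compile(source, "<edit>", "exec")
        return True
    except SyntaxError:
        return False


def apply_with_verification(
    files: Files,
    edits: List[Edit],
    verify: Callable[[str], bool] = compiles,
) -> List[bool]:
    """Apply edits one at a time, verifying each and rolling back failures."""
    results: List[bool] = []
    for path, new_content in edits:
        previous = files.get(path)
        files[path] = new_content
        ok = verify(new_content)
        if not ok:
            # Roll back so later edits start from a known-good state.
            if previous is None:
                del files[path]
            else:
                files[path] = previous
        results.append(ok)
    return results


files = {"a.py": "x = 1\n"}
results = apply_with_verification(
    files,
    [("a.py", "x = (1\n"), ("a.py", "x = 2\n")],  # first edit has a syntax error
)
```

In a real agent loop the verifier would more plausibly run a linter or the project's tests, and a rejected edit would be fed back to the model as an error message rather than silently dropped.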
🛡️
Cloudflare Blog•118d ago

Proxy Adds Identity Policies for Clientless Devices

Cloudflare's Gateway Authorization Proxy now supports identity-aware policies. It secures virtual desktops and guest networks without needing a device client. This shifts from basic access to badge-like verification.

#zero-trust#vdi#clientless
🛡️
Cloudflare Blog•118d ago

Nametag Partnership Defeats Deepfakes

Cloudflare One partners with Nametag to combat laptop farms and AI-enhanced identity fraud. Identity verification is required during employee onboarding. Continuous authentication prevents insider threats.

#deepfake#identity-fraud#onboarding
🛡️
Cloudflare Blog•118d ago

Cloudflare One Adds User Risk Scoring

Cloudflare One now incorporates dynamic User Risk Scores into Access policies for automated, adaptive security responses. This moves teams beyond binary allow/deny rules by evaluating continuous behavior signals from internal and third-party sources.

#risk-scoring#zero-trust#behavioral-analytics
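As a rough illustration of what "beyond binary allow/deny" can mean, the sketch below maps a continuous risk score to one of three responses. Everything here is hypothetical: the 0 to 1 score range, the thresholds, and the `Action` and `decide` names are assumptions for illustration and are not Cloudflare's API or policy model.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    STEP_UP = "require_reauthentication"
    BLOCK = "block"


def decide(risk_score: float, step_up_at: float = 0.4, block_at: float = 0.8) -> Action:
    """Map a risk score in [0, 1] to a graded access decision.

    The thresholds are illustrative defaults, not values from any product.
    """
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0 and 1")
    if risk_score >= block_at:
        return Action.BLOCK
    if risk_score >= step_up_at:
        return Action.STEP_UP
    return Action.ALLOW


decisions = [decide(s) for s in (0.1, 0.5, 0.9)]
```

The middle tier is the point of the design: a user whose behavior looks unusual can be asked to re-authenticate instead of being locked out or waved through.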
📱
Ifanr (爱范儿)•118d ago

AI Circle Shares Qwen Farewell Tweet Overnight

Overnight, the global AI community widely shared a farewell tweet related to Qwen. Qwen now stands at a new crossroads, sparking buzz across the AI world. The news was highlighted by Ifanr.

#farewell-tweet#ai-community#platform-shift
🔥
36Kr (36氪)•118d ago

Micro LED CPO power at 5% of copper cables

TrendForce reports that the generative AI boom is driving data center demand for high-speed interconnects. Copper cables struggle with density and energy consumption; Micro LED CPO cuts power to 5% of copper's. This positions it as a key alternative for intra-rack transmission.

#data-center#optical-interconnect#energy-efficiency
🐯
Huxiu (虎嗅)•118d ago

Tokens Outvalue Human Labor

AI tools like Seedance produce viral videos for pennies, slashing human production costs and funneling wealth to compute providers. Humans risk becoming 'meat plugins' for AI via RentAHuman. The article warns of a 2028 white-collar crisis and systemic collapse.

#ai-economics#job-automation#content-tools
🇬🇧
BBC Technology•118d ago

Robot Recruiters for Care Workers?

AI tools are screening care workers for suitability. The article questions whether robots can truly assess the qualities needed for caring roles. The BBC explores the limitations of automated hiring.

#robot-recruiter#ai-hiring#care-workers
🐯
Huxiu (虎嗅)•118d ago

Qwen Tech Lead Lin Junyang Resigns

Alibaba's Qwen technical lead Lin Junyang abruptly announced his departure on X after leading the model to global open-source dominance. Colleagues mourn the loss amid recent Qwen3.5 and Qwen3-Max releases, with two other key engineers also leaving. Potential successors include Alibaba's Zhou Jingren or DeepMind's Hao Zhou.

#leadership-change#llm-strategy#alibaba-ai
🇨🇳
cnBeta (Full RSS)•118d ago

Phison Mandates Prepayments in Supply Crunch

Phison Electronics is requiring customers to prepay or accept shorter payment terms for SSD controller orders. The credit periods of three months or longer that it previously offered have been discontinued. The change stems from upstream suppliers imposing prepayments on Phison amid rising financial pressure across the supply chain.

#supply-chain#ssd#payment-terms
🇨🇳
cnBeta (Full RSS)•118d ago

Windows 12: Modular, AI-First This Year?

Microsoft is reportedly preparing Windows 12 for release this year, featuring a fully modular design. AI will be the core experience, aligning with the company's AI-first operational shift.

#os-update#modular-design#ai-integration
📄
ArXiv AI•118d ago

SWE-Hub Unifies Scalable SWE Task Production

SWE-Hub is an end-to-end production system addressing data scarcity for software-engineering AI agents by automating environments, synthesizing bugs at scale, and generating diverse tasks. Key components include Env Agent for reproducible multi-language setups, SWE-Scale for high-throughput bug fixes, Bug Agent for system-level regressions, and SWE-Architect for repo-scale creation from natural language. It enables continuous delivery of executable tasks across the full SWE lifecycle.

#swe-agents#data-factory#bug-synthesis
📄
ArXiv AI•118d ago

NFR Patterns for Agentic AI Reliability

The paper revisits the goals-to-aspects methodology for agentic AI, introducing 12 reusable patterns across security, reliability, observability, and cost management. It maps i* goal models to Rust AOP implementations, with agent-specific patterns such as prompt injection detection. The approach is validated through a case study on an open-source agent framework.

#agentic-ai#aop#nfr-patterns
📄
ArXiv AI•118d ago

MicroVerse Launches Micro-World Simulations

The paper introduces MicroWorldBench, a benchmark with 459 expert criteria for microscale simulations across organ, cellular, and molecular levels. It reveals failures of SOTA video models in physics, consistency, and fidelity. The authors release the MicroSim-10K dataset and train MicroVerse for accurate microscale reproductions.

#video-gen#biomedical-benchmark
📄
ArXiv AI•118d ago

M-JudgeBench Boosts Multimodal Judge Reliability

The paper introduces M-JudgeBench, a 10-dimensional benchmark assessing MLLM judges across pairwise CoT, length bias, and error detection. It proposes Judge-MCTS for generating diverse reasoning data to train stronger M-Judger models. Experiments show M-Judger outperforming prior models on benchmarks.

#multimodal-judge#mcts-data#cot-benchmark
📄
ArXiv AI•118d ago

LOGIGEN: Logic-Driven Agent Task Generator

LOGIGEN is a framework that synthesizes verifiable training data for agentic LLMs using logic-driven methods and triple-agent orchestration. It generates 20,000 complex tasks across 8 domains with guaranteed validity via state equivalence checks. Models trained with SFT and RL achieve 79.5% success on τ²-Bench, far surpassing baselines.

#agentic-ai#data-synthesis#rl-training
📄
ArXiv AI•118d ago

LifeEval: Egocentric AI Assistance Benchmark

LifeEval introduces a multimodal benchmark for real-time, task-oriented human-AI collaboration in egocentric daily life. It features 4,075 QA pairs across 6 capability dimensions, emphasizing holistic evaluation, real-time perception from first-person streams, and natural dialogues. Evaluations of 26 MLLMs reveal major challenges in adaptive interaction.

#multimodal-benchmark#egocentric-ai
📄
ArXiv AI•118d ago

IRIS: UMLLM Fairness Benchmark Launch

IRIS Benchmark is the first to synchronously evaluate fairness in UMLLMs' understanding and generation tasks. Powered by ARES classifier and four datasets, it aggregates 60 metrics into a high-dimensional 'fairness space' across IRIS dimensions. It uncovers biases like 'generation gap' and 'personality splits' in leading models.

#fairness#bias#multimodal
📄
ArXiv AI•118d ago

HealHGNN Masters Heterophilic Hypergraphs

HealHGNN introduces heterophily-agnostic message passing for hypergraph neural networks using Riemannian geometry. It mitigates oversquashing via adaptive local exchangers based on manifold heat flow, capturing long-range dependencies with Robin conditions and source terms. The model achieves SOTA performance on both homophilic and heterophilic datasets with linear complexity.

#riemannian-geometry#heterophily