All Updates
March 4, 2026
XPeng's 2026 AI Bet: No Plan B
XPeng faces sales pressure in early 2026 amid an EV market slump, trailing competitors with 15,256 deliveries in February. Chairman He Xiaopeng criticizes L2 ADAS and pushes AI autonomy, launching a second-generation VLA and the X9 EV. The company plans 7 super range-extender models to balance its AI ambitions with market realities.
Qwen3.5-35B-A3B Nears Claude Opus on SWE-bench Hard
Qwen3.5-35B-A3B, an MoE model with 3B active parameters, achieved 37.8% on SWE-bench Verified Hard with a 'verify-on-edit' strategy, close to Claude Opus 4.6's 40%. This simple verification step after each edit raised performance from a 22% baseline. On the full benchmark the model reaches 67%, rivaling larger models.
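The summary does not say what the verification step checks, so the following is only a minimal sketch of the general idea of verifying after each edit. It assumes a search-and-replace edit format, a Python syntax check as the verifier, and a rollback on failure. The names `syntax_ok` and `apply_edit_with_verification` are hypothetical and are not taken from the benchmark write-up.

```python
import ast
from typing import Callable, Optional, Tuple


def syntax_ok(source: str) -> Optional[str]:
    """Hypothetical verifier: return an error message, or None if the source parses."""
    try:
        ast.parse(source)
        return None
    except SyntaxError as exc:
        return f"SyntaxError: {exc.msg} (line {exc.lineno})"


def apply_edit_with_verification(
    source: str,
    old: str,
    new: str,
    verify: Callable[[str], Optional[str]] = syntax_ok,
) -> Tuple[str, Optional[str]]:
    """Apply one search-and-replace edit, then verify the result immediately.

    Returns (resulting_source, feedback). If verification fails, the original
    source is returned unchanged (assumed rollback policy) together with
    feedback the agent could use to retry.
    """
    if old not in source:
        return source, "edit rejected: target text not found"
    candidate = source.replace(old, new, 1)  # replace the first match only
    error = verify(candidate)
    if error is not None:
        return source, f"edit rolled back: {error}"
    return candidate, None


if __name__ == "__main__":
    src = "def f(x):\n    return x + 1\n"
    good, feedback = apply_edit_with_verification(src, "x + 1", "x + 2")
    print(good, feedback)
    bad, feedback = apply_edit_with_verification(src, "return x + 1", "return (x + 1")
    print(bad == src, feedback)
```

In a real agent loop the verifier would more plausibly run a linter or the project's tests; the syntax check here just keeps the sketch self-contained.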
Proxy Adds Identity Policies for Clientless Devices
Cloudflare's Gateway Authorization Proxy now supports identity-aware policies. It secures virtual desktops and guest networks without needing a device client. This shifts from basic access to badge-like verification.
Nametag Partnership Defeats Deepfakes
Cloudflare One partners with Nametag to combat laptop farms and AI-enhanced identity fraud. Identity verification is required during employee onboarding. Continuous authentication prevents insider threats.
Cloudflare One Adds User Risk Scoring
Cloudflare One now incorporates dynamic User Risk Scores into Access policies for automated, adaptive security responses. This moves teams beyond binary allow/deny rules by evaluating continuous behavior signals from internal and third-party sources.
AI Circle Shares Qwen Farewell Tweet Overnight
Overnight, the global AI community has been widely sharing a farewell tweet related to Qwen. Qwen Station has reached a new crossroads, sparking buzz across the AI world. The news was highlighted by Ifanr.
Micro LED CPO Power at 5% of Copper Cables
TrendForce reports that the generative AI boom is driving data center demand for high-speed interconnects. Copper cables struggle with density and energy consumption, while Micro LED CPO cuts power use to 5% of copper's. This positions it as a key alternative for intra-rack transmission.
Tokens Outvalue Human Labor
AI tools like Seedance produce viral videos for pennies, slashing human production costs and funneling wealth to compute providers. Humans risk becoming 'meat plugins' for AI via RentAHuman. The article warns of a 2028 white-collar crisis and systemic collapse.
Robot Recruiters for Care Workers?
AI tools are screening care workers for suitability. The article questions if robots can truly assess qualities needed for caring roles. BBC explores limitations in automated hiring.
Qwen Tech Lead Lin Junyang Resigns
Alibaba's Qwen technical lead Lin Junyang abruptly announced his departure on X after leading the model to global open-source dominance. Colleagues mourn the loss amid recent Qwen3.5 and Qwen3-Max releases, with two other key engineers also leaving. Potential successors include Alibaba's Zhou Jingren or DeepMind's Hao Zhou.
Phison Mandates Prepayments in Supply Crunch
Phison Electronics is requiring customers to prepay or accept shorter payment terms for SSD controller orders. The credit periods of 3 months or longer that it previously offered are discontinued. The change stems from upstream suppliers imposing prepayments on Phison amid rising financial pressure across the supply chain.
Windows 12: Modular, AI-First This Year?
Microsoft is reportedly preparing Windows 12 for release this year, featuring a fully modular design. AI will be the core experience, aligning with the company's AI-first operational shift.
SWE-Hub Unifies Scalable SWE Task Production
SWE-Hub is an end-to-end production system addressing data scarcity for software-engineering AI agents by automating environments, synthesizing bugs at scale, and generating diverse tasks. Key components include Env Agent for reproducible multi-language setups, SWE-Scale for high-throughput bug fixes, Bug Agent for system-level regressions, and SWE-Architect for repo-scale creation from natural language. It enables continuous delivery of executable tasks across the full SWE lifecycle.
NFR Patterns for Agentic AI Reliability
Revisits goals-to-aspects methodology for agentic AI, introducing 12 reusable patterns across security, reliability, observability, and cost management. Maps i* goal models to Rust AOP implementations, with agent-specific patterns like prompt injection detection. Validates via case study on open-source agent framework.
MicroVerse Launches Micro-World Simulations
Introduces MicroWorldBench, a benchmark with 459 expert criteria for microscale simulations across organ, cellular, and molecular levels. Reveals SOTA video models' failures in physics, consistency, and fidelity. Releases MicroSim-10K dataset and trains MicroVerse for accurate microscale reproductions.
M-JudgeBench Boosts Multimodal Judge Reliability
Introduces M-JudgeBench, a 10-dimensional benchmark assessing MLLM judges across pairwise CoT, length bias, and error detection. Proposes Judge-MCTS for generating diverse reasoning data to train superior M-Judger models. Experiments show M-Judger outperforming priors on benchmarks.
LOGIGEN: Logic-Driven Agent Task Generator
LOGIGEN is a framework that synthesizes verifiable training data for agentic LLMs using logic-driven methods and triple-agent orchestration. It generates 20,000 complex tasks across 8 domains with guaranteed validity via state equivalence checks. Models trained with SFT and RL achieve 79.5% success on τ²-Bench, far surpassing baselines.
LifeEval: Egocentric AI Assistance Benchmark
LifeEval introduces a multimodal benchmark for real-time, task-oriented human-AI collaboration in egocentric daily life. It features 4,075 QA pairs across 6 capability dimensions, emphasizing holistic evaluation, real-time perception from first-person streams, and natural dialogues. Evaluations of 26 MLLMs reveal major challenges in adaptive interaction.
IRIS: UMLLM Fairness Benchmark Launch
IRIS Benchmark is the first to synchronously evaluate fairness in UMLLMs' understanding and generation tasks. Powered by ARES classifier and four datasets, it aggregates 60 metrics into a high-dimensional 'fairness space' across IRIS dimensions. It uncovers biases like 'generation gap' and 'personality splits' in leading models.
HealHGNN Masters Heterophilic Hypergraphs
HealHGNN introduces heterophily-agnostic message passing for hypergraph neural networks using Riemannian geometry. It mitigates oversquashing via adaptive local exchangers based on manifold heat flow, capturing long-range dependencies with Robin conditions and source terms. The model achieves SOTA performance on both homophilic and heterophilic datasets with linear complexity.