Anthropic Safety Crumbles in Epic Leak

💡 Anthropic's safety hypocrisy: leaked files and a policy flip expose the AI survival game
⚡ 30-Second TL;DR
What Changed
A CMS permission error exposed 3,000 internal files, including safety reports and details of the 'Mythos' project.
Why It Matters
Erodes trust in Anthropic's safety leadership as its policy softens under competitive and government pressures. Highlights AI firms' vulnerability to basic security lapses despite their expertise.
What To Do Next
Audit your S3/CMS bucket permissions immediately to prevent public exposure.
Who should care: Founders & Product Leaders
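The "audit your bucket permissions" advice above boils down to checking whether any resource policy grants access to everyone. Below is a minimal sketch of that check for an AWS S3-style bucket policy; the policy document and the function name `policy_allows_public_access` are illustrative, not taken from the leak coverage.

```python
import json

def policy_allows_public_access(policy_json: str) -> bool:
    """Return True if any Allow statement grants access to everyone.

    Public exposure in an S3-style bucket policy is typically signalled
    by an Allow statement whose Principal is "*" (or {"AWS": "*"}).
    """
    policy = json.loads(policy_json)
    for stmt in policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        if principal == "*" or (
            isinstance(principal, dict) and principal.get("AWS") == "*"
        ):
            return True
    return False

# Example: a policy that exposes every object in the bucket to the world.
public_policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-bucket/*",
    }],
})
print(policy_allows_public_access(public_policy))  # prints: True
```

In practice you would pull the live policy with your cloud provider's API or CLI and run a check like this across every bucket; managed guardrails (e.g. blocking public access at the account level) are the more robust fix.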
🧠 Deep Insight
AI-generated analysis for this event.
🔑 Enhanced Key Takeaways
- The leaked 'Mythos' project is identified in internal documentation as a specialized, high-compute reasoning model designed for autonomous agentic workflows, distinct from the standard Claude 3.5/4 architecture.
- The transition to RSP 3.0 (Responsible Scaling Policy) was reportedly accelerated by internal friction between Anthropic's safety-focused 'Constitutional AI' team and the commercial division, which argued that hard training pauses were slowing model deployment cycles by roughly 15% compared to OpenAI's O-series.
- The preliminary injunction against the DoD centers on a specific interpretation of the 'dual-use' clause in the 2025 AI Security Act, with the court ruling that Anthropic's refusal to provide 'backdoor' access for offensive cyber operations does not constitute a breach of its national security contracts.
🔮 Future Implications
AI analysis grounded in cited sources
Anthropic will face increased regulatory scrutiny regarding the 'Mythos' model's autonomous capabilities.
The leak revealed that Mythos possesses advanced self-correction loops that were previously undisclosed to the AI Safety Institute.
The shift to periodic transparency reports is likely to drive a decline in public trust metrics for Anthropic.
Industry analysts suggest that replacing hard safety gates with retrospective reporting creates a 'reactive' rather than 'proactive' safety posture.
⏳ Timeline
2023-09
Anthropic publishes initial Responsible Scaling Policy (RSP 1.0).
2024-06
Anthropic updates RSP to 2.0, formalizing hard training pauses for ASL-3 models.
2025-11
Anthropic enters into a contested partnership agreement with the Department of Defense.
2026-02
Anthropic officially adopts RSP 3.0, shifting from hard pauses to transparency-based monitoring.
2026-03
CMS misconfiguration leads to the unauthorized exposure of 3,000 internal files.
AI-curated news aggregator. All content rights belong to original publishers.
Original source: 虎嗅 ↗