ChatGPT Health Misses Medical Emergencies

Post LinkedIn

🇬🇧Read original on The Guardian Technology

#ai-safety #healthcare-ai #llm-riskschatgpt-health

💡ChatGPT Health misses >50% emergencies—vital safety alert for medical AI builders.

⚡ 30-Second TL;DR

What Changed

Failed hospital recommendation in >50% emergency cases

Why It Matters

Highlights severe limitations of LLMs in high-stakes health advice, urging caution for AI deployments in medicine. Could trigger regulatory reviews and erode trust in OpenAI's consumer health tools.

What To Do Next

Test your LLM health bots with emergency and suicide prompts from this study's benchmarks.

Who should care:Researchers & Academics

🧠 Deep Insight

Web-grounded analysis with 6 cited sources.

🔑 Enhanced Key Takeaways

•Mount Sinai researchers tested ChatGPT Health across 960 interactions using 60 scenarios from 21 medical specialties, varying factors like race, gender, and barriers to care such as lack of insurance.
•The tool performed adequately in obvious emergencies like stroke or severe allergic reactions but under-triaged nuanced cases, such as early respiratory failure in asthma where it advised waiting despite recognizing warning signs.
•OpenAI updated ChatGPT’s default GPT-5 model in October 2025 to better recognize mental distress, de-escalate conversations, and direct users to professional support.
•The Mount Sinai study, published February 23, 2026, in Nature Medicine, marks the first independent safety review of ChatGPT Health since its January 2026 launch.

🔮 Future ImplicationsAI analysis grounded in cited sources

Independent evaluations of consumer AI health tools will become mandatory by 2027

The Mount Sinai study highlights unprecedented safety concerns in a widely used tool with 40M daily users, prompting calls for regulatory oversight as AI evolves.

ChatGPT Health accuracy will improve to match physicians in textbook emergencies by late 2026

Researchers plan ongoing evaluations of updated versions, building on prior GPT-4 successes in ED diagnostics while addressing current triage blind spots.

⏳ Timeline

2024-01

OpenAI launches initial ChatGPT health advice feature integrating medical records and apps

2024-01

JMIR study shows GPT-4 outperforming GPT-3.5 and ED residents in internal medicine diagnostic accuracy

2024-10

UCSF study finds ChatGPT-4 overprescribes in ED scenarios, 8% less accurate than residents

2025-10

OpenAI updates GPT-5 model to enhance mental distress recognition and user de-escalation

2026-01

OpenAI launches ChatGPT Health, reaching 40M daily users within weeks

2026-02

Mount Sinai publishes first independent study in Nature Medicine revealing ChatGPT Health triage failures

📎 Sources (6)

Factual claims are grounded in the sources below. Forward-looking analysis is AI-generated interpretation.

🇬🇧Read original article on The Guardian Technology

📰

Weekly AI Recap

Read this week's curated digest of top AI events →

👉Related Updates

Same topic

Explore #ai-safety

Same product

Scientist wins $100,000 prize for decoding zebra finch language

The Guardian Technology•Jun 26

AI-curated news aggregator. All content rights belong to original publishers.
Original source: The Guardian Technology ↗