OpenAI strengthens ChatGPT mental health guardrails: 6 things to know


OpenAI has updated ChatGPT’s default GPT-5 model to better recognize signs of mental distress, de-escalate sensitive conversations and direct users toward professional support when appropriate. The changes are part of a broader effort to reduce unsafe responses and address high-risk interactions among users with mental health conditions.

Six things to know:

  1. The model improvements focus on three areas: psychosis and mania, self-harm and suicide, and emotional reliance on AI, according to an Oct. 27 report.
  2. The latest model returns responses that do not comply with safety guidelines about 65% to 80% less often across mental health-related domains, based on internal taxonomies and expert review. Taxonomies are detailed guides that explain the properties of sensitive conversations and what ideal and undesired model behavior looks like.
  3. Although mental health-related conversations are rare — affecting an estimated 0.07% to 0.15% of weekly active users — OpenAI uses high-risk test prompts, not just real-world data, to evaluate performance.
  4. More than 170 clinicians from OpenAI’s Global Physician Network reviewed over 1,800 model responses and contributed guidance on safer behaviors in distress-related conversations.
  5. GPT-5 reduced undesired responses by 52% for suicide and self-harm and by 42% for emotional reliance, compared with GPT-4o, according to expert evaluations.
  6. OpenAI updated its principles for how models should behave to clarify that models should avoid affirming delusions, respond empathetically to signs of mental distress, and encourage users to seek support from real-world relationships and professionals.