
AI Therapy? Research Shows ChatGPT Outperforms Empathy Experts


The results of this study suggest that AI has the potential to improve therapeutic processes and even outperform human professionals in some specific aspects, such as accessibility and consistency in delivering responses. However, challenges remain, including ethical issues, clinical oversight, and the need to tailor AI responses to each patient's individual emotional context.


The idea that machines can think has been explored for decades, ever since Alan Turing proposed the famous "imitation game". In this thought experiment, a human interrogator poses questions to two unseen entities, one human and one machine, and must identify which answers belong to the real person.


Turing predicted that, by the year 2000, an average interrogator would have no more than a 70% chance of making the right identification after five minutes of questioning, a sign that machines could imitate human language convincingly. Less than 20 years after his prediction, ELIZA appeared, one of the first programs capable of simulating a therapeutic conversation.

Alan Turing and the Turing machine


Created to respond like a Rogerian psychotherapist, ELIZA relied on a conversational style of rephrasing users' statements as questions and offering emotional validation, which led many people to believe they were interacting with a real professional. This showed that, with very few resources, it was possible to create an illusion of understanding.
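
To make the idea concrete, here is a minimal Python sketch of the rephrasing trick ELIZA relied on. This is not ELIZA's original code (which used its own script language); the patterns and reflection table below are simplified stand-ins, invented purely for illustration.

```python
import random
import re

# Minimal ELIZA-style responder: match a statement against simple patterns,
# "reflect" pronouns (I -> you, my -> your), and echo it back as a question.
# No understanding is involved, yet the output feels attentive.

REFLECTIONS = {
    "i": "you", "me": "you", "my": "your", "am": "are",
    "you": "I", "your": "my", "mine": "yours",
}

RULES = [
    (r"i feel (.*)", ["Why do you feel {0}?", "How long have you felt {0}?"]),
    (r"i am (.*)", ["Why do you say you are {0}?"]),
    (r"my (.*)", ["Tell me more about your {0}."]),
    (r".*", ["Please go on.", "How does that make you feel?"]),
]

def reflect(fragment: str) -> str:
    # Swap first- and second-person words so the echo reads naturally.
    return " ".join(REFLECTIONS.get(word, word) for word in fragment.split())

def respond(statement: str) -> str:
    text = statement.lower().strip(" .!?")
    for pattern, templates in RULES:
        match = re.match(pattern, text)
        if match:
            template = random.choice(templates)
            return template.format(*(reflect(g) for g in match.groups()))
    return "Please go on."  # unreachable: the final rule matches anything

print(respond("I feel anxious about my job."))
# -> e.g. "Why do you feel anxious about your job?"
```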


Since then, artificial intelligence has advanced considerably, and today generative AI chatbots, such as ChatGPT, are being studied as auxiliary tools in psychotherapy.


Some researchers still question whether machines can establish a genuine therapeutic relationship, but recent evidence suggests that AI can be useful in this context. Studies have shown that AI can complement human care and even provide psychological support on its own.


For example, the virtual assistant HAILEY was used to provide immediate feedback on TalkLife, a peer-to-peer emotional support platform. About 63% of the responses generated with AI assistance were rated as empathetic as, or more empathetic than, responses written by humans alone.


Another study looked at using ChatGPT 3.5 to help in the workplace by creating messages of gratitude and appreciation. Interestingly, people preferred the AI-generated messages, which were longer, more positive, and linguistically richer than the ones written manually.

When AI is used autonomously to answer medical or psychological questions, the results are equally impressive. In one experiment, doctors evaluated ChatGPT responses to questions taken from Reddit and preferred the AI-generated responses 78.6% of the time, rating them as more empathetic than responses provided by human professionals.


Similarly, relationship experts compared AI responses with responses from therapists and found that people often couldn’t tell the difference.


Furthermore, therapists who reviewed sessions conducted by ChatGPT without knowing they were generated by AI rated the quality of the interactions as high.


When interviewed about their experiences, participants who received AI support found the responses to be thorough and appropriate for exploring relationship issues. Based on these findings, there is growing interest in using AI for online couples therapy interventions.


Over the past 20 years, a number of digital programs have been developed to help couples improve their relationships. Some of these programs, such as OurRelationship and ePREP, have been shown to have positive effects on couple satisfaction, especially among ethnic, racial, and sexual minorities.


These interventions often use human coaches to guide participants, but AI can be an effective alternative because it does not suffer from the same time and availability constraints.

An evidence-based AI chatbot could provide ongoing support that is accessible to anyone with an internet connection. However, human oversight would still be essential to ensure the safety and effectiveness of these interactions.


Despite these advances, current research still has limitations. In studies comparing AI with human therapists, the human experts were unaware that they were being measured against an AI, so they may not have been responding at their best; moreover, any individual therapist has limited knowledge compared with the vast database of information an AI model can draw on.


In addition, many transcripts of therapy sessions are confidential, making it difficult to analyze linguistic patterns that could help understand what makes a therapeutic response effective.


Another important point is that most studies focus on AI's ability to generate empathetic responses, but therapy involves other crucial factors, such as building a strong therapeutic relationship, setting realistic expectations for treatment, and the cultural competence needed to work with diverse client profiles.

A broader issue in psychotherapy is that the effectiveness of evidence-based interventions appears to have plateaued. The effects of therapies are not improving significantly over time, raising the possibility that AI could be a tool to innovate in the field.


AI models could help create new therapeutic approaches or improve existing ones, increasing the impact of psychological treatments.


To explore this further, a recent study directly compared the responses of 13 experienced therapists with responses generated by ChatGPT 4.0 in 18 couples therapy scenarios.


A group of 830 participants was asked to distinguish the AI's responses from those of the therapists. As Turing predicted, participants had great difficulty correctly identifying which response came from a human and which from the AI.
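
A common way to quantify that difficulty is to test whether identification accuracy differs from the 50% expected under pure guessing. The sketch below shows the idea with invented numbers (not the study's actual data), assuming SciPy is available.

```python
from scipy.stats import binomtest

# Hypothetical figures, NOT the study's data: out of 830 forced-choice
# judgments, suppose 425 (about 51%) correctly identified the author.
# Under pure guessing, expected accuracy is 50%.
correct, total = 425, 830
result = binomtest(correct, total, p=0.5, alternative="greater")
print(f"accuracy = {correct / total:.1%}, p = {result.pvalue:.2f}")
# A large p-value means the observed accuracy is statistically
# indistinguishable from guessing, i.e. the AI passes as human.
```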


Furthermore, ChatGPT’s responses were often rated as more aligned with core principles of therapy, such as empathy and clarity. Linguistic differences were also identified: the AI's responses were longer, used a wider variety of words, and had a more positive tone.
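
As a rough illustration of how such linguistic differences can be measured, the sketch below computes three simple proxies: word count, type-token ratio (vocabulary variety), and a lexicon-based sentiment score. The tiny word lists are invented for this example; real analyses rely on validated tools such as part-of-speech taggers and sentiment lexicons.

```python
import re

# Toy sentiment word lists, invented for this example only.
POSITIVE = {"hope", "understand", "support", "together", "strength", "glad"}
NEGATIVE = {"blame", "fail", "never", "angry", "hopeless"}

def describe(response: str) -> dict:
    words = re.findall(r"[a-z']+", response.lower())
    return {
        "length": len(words),                              # longer responses
        "type_token_ratio": len(set(words)) / len(words),  # wider vocabulary
        "sentiment": sum((w in POSITIVE) - (w in NEGATIVE) for w in words),
    }

print(describe("I understand how hard this is, and I hope you can face it together."))
# -> e.g. length 14, type_token_ratio ~0.93, sentiment +3
```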

These results suggest that AI has the potential to improve therapeutic processes and even outperform human professionals in some specific aspects, such as accessibility and consistency in delivering responses.


However, there are still challenges to be overcome, including ethical issues, clinical oversight, and the need to tailor AI responses to each patient’s individual emotional context.


Future research could focus on refining the way AI interacts with users, ensuring that it is an effective and safe tool for those seeking emotional and therapeutic support.



READ MORE:


When ELIZA meets therapists: A Turing test for the heart and mind

S. Gabe Hatch, Zachary T. Goodman, Laura Vowels, H. Dorian Hatch, Alyssa L. Brown, Shayna Guttman, Yunying Le, Benjamin Bailey, Russell J. Bailey, Charlotte R. Esplin, Steven M. Harris, D. Payton Holt Jr., Merranda McLaughlin, Patrick O'Connell, Karen Rothman, Lane Ritchie, D. Nicholas Top Jr., and Scott R. Braithwaite

PLOS Mental Health 2(2): e0000145. https://doi.org/10.1371/journal.pmen.0000145


Abstract:


“Can machines be therapists?” is a question receiving increased attention given the relative ease of working with generative artificial intelligence. Although recent (and decades-old) research has found that humans struggle to tell the difference between responses from machines and humans, recent findings suggest that artificial intelligence can write empathically and the generated content is rated highly by therapists and outperforms professionals. It is uncertain whether, in a preregistered competition where therapists and ChatGPT respond to therapeutic vignettes about couple therapy, a) a panel of participants can tell which responses are ChatGPT-generated and which are written by therapists (N = 13), b) the generated responses or the therapist-written responses fall more in line with key therapy principles, and c) linguistic differences between conditions are present. In a large sample (N = 830), we showed that a) participants could rarely tell the difference between responses written by ChatGPT and responses written by a therapist, b) the responses written by ChatGPT were generally rated higher in key psychotherapy principles, and c) the language patterns between ChatGPT and therapists were different. Using different measures, we then confirmed that responses written by ChatGPT were rated higher than the therapist’s responses suggesting these differences may be explained by part-of-speech and response sentiment. This may be an early indication that ChatGPT has the potential to improve psychotherapeutic processes. We anticipate that this work may lead to the development of different methods of testing and creating psychotherapeutic interventions. Further, we discuss limitations (including the lack of the therapeutic context), and how continued research in this area may lead to improved efficacy of psychotherapeutic interventions allowing such interventions to be placed in the hands of individuals who need them the most.



