Empathy in the Age of AI
The last time we had articles that specifically addressed empathy was back in 2023. I wrote one of those articles, and one of our students responded to it. This was long before we were thinking much about AI in medicine. Today, I am proud to feature an article by Dr. John Lantos, updating our discussions.
Adam Cifu
For years, thoughtful physicians assumed that artificial intelligence would handle the drudgery of medicine—the documentation, the billing, the data entry—and that this would free doctors to be more humanistic. The machines, by this view, could never be empathic. Eric Topol’s 2019 book Deep Medicine captured this optimism in its subtitle: How Artificial Intelligence Can Make Healthcare Human Again.
This was before the widespread use of large language models (LLMs).
2023 saw the release of GPT-4 and a study in JAMA Internal Medicine that changed our understanding of machine-produced empathy. In the study, evaluators were shown responses to patients’ medical questions—some written by physicians, some by an AI chatbot—and asked to rate them for empathy. The chatbot won. Such findings were replicated in oncology and patient portal research. Topol revised his thinking: maybe AI wasn’t just a tool that could free up physicians to be empathic. Maybe it was modeling empathic communication in ways that could teach us something.
The implications are unsettling. If a machine with no inner life, no history with the patient, and no stake in the outcome can be perceived as more empathic than a trained physician, what exactly is empathy? Can we measure it? Can we teach it?
Empathy Is Not One Thing
Philosopher and psychiatrist Jodi Halpern argues that empathy is not simply a communication skill or emotional resonance but a form of emotional reasoning — a disciplined use of imagination, informed by medical knowledge, careful listening, and genuine attunement to what illness means in a patient’s life. That’s a much richer concept than what studies of AI empathy typically assess.
Leslie Jamison, who worked as a “medical actor” for student examinations long before chatbots, noticed a gap between the performative empathy she was tasked to evaluate and the deeper empathy she sought as a patient. The grading criteria she was instructed to apply, she wrote, rewarded shallow expressions of understanding. To Jamison, those felt worse than nothing. “Empathy isn’t just remembering to say that must really be hard — it’s figuring out how to bring difficulty into the light so it can be seen. Empathy requires inquiry as much as imagination.”
Most contemporary researchers break empathy into four components. Affective empathy is the capacity to feel something of what another person feels — emotional resonance. Cognitive empathy is the intellectual understanding of another’s perspective, attainable without necessarily sharing their feelings. Communicative empathy is the ability to express that understanding in legible ways — through words, tone, and gesture — so the other person knows they’ve been seen. And moral empathy is a responsiveness to another person’s conception of their own suffering, one that genuinely reshapes how you respond to them.
These are not four faces of the same thing. They are partially independent capacities, each of which can exist without the others. A doctor can have strong cognitive empathy and poor communicative skill, or robust affective resonance with almost no ability to express it. Crucially, a system can have fluent communicative empathy with zero moral engagement. That last combination is exactly what a well-trained large language model produces.
What AI Reveals
The findings that AI is perceived as empathic tell us more about the erroneous ways we have been measuring empathy than they do about the superiority of machines. Standard empathy assessments, of the sort that Jamison criticizes, mostly capture the communicative layer: a warm tone, acknowledging gestures, and verbal validation of emotions. An LLM trained on millions of human conversations can score well on these metrics because it has learned the form. But form without foundation is not enough.
Communicative empathy without underlying moral engagement is hollow performance. It can satisfy patients in brief textual exchanges. But assessing only that, and claiming that machines outperform humans, endorses shallow reductionism. A more robust assessment would not just examine patient perceptions of a brief, typed query. Such an assessment would ask whether the chatbots can sustain the ongoing interpretive work of longitudinal care, or help patients navigate the challenges of clinical uncertainty, or just be present when patients must face bad news.
There is a categorical difference that the metrics obscure. An AI system processes inputs and generates outputs; nothing is at stake for it. It cannot be moved. A physician who could be moved but chooses detachment is failing. These are different situations for the patient, even when the outputs look identical in a single encounter. As Yuval Noah Harari put it, “The deepest level of human relationships is not the desire for someone to care about my feelings; it’s the opposite ... It’s the desire to care about their feelings as well.”
The Training Problem
This distinction matters for medical education. We know that empathy erodes during medical school, with the sharpest drop in the third year, precisely when students transition from classroom to clinic. The pattern continues through residency.
The causes are structural. The hidden curriculum teaches emotional detachment as professionalism. Role models demonstrate efficiency over presence. Evaluation systems reward diagnostic accuracy and procedural competence while treating relational quality as unmeasurable or secondary. Students enter medicine with above-average empathy and idealism; training systematically degrades both.
Osler famously counseled students to seek equanimity — a kind of emotional distance — rather than empathic engagement. Osler’s approach has become the standard. Doctors are trained to avoid deep emotional connection with patients’ suffering, even as they get bad grades if they don’t learn to express shallow acknowledgments of that suffering. Empathy is thus celebrated in theory even as it is extinguished in practice. Students receive a mixed message and resolve it the way overworked people always do: by responding as required to get their rewards, and protecting themselves with cynicism.
Prior attempts to counter this through curricular add-ons — a humanities elective here, a communication module there — have mostly failed, not because the content was wrong but because it left untouched the environment that erodes what the curriculum cultivates. You cannot teach empathy on Tuesday and train it out of students the rest of the week.
How to Teach
The emergence of empathic AI forces a clarifying question on medical education: What does human empathy do that a machine cannot? If our answer is, “By our current metrics, nothing,” that is a problem with our metrics that will ensure our rapid obsolescence in a world of skilled chatbots. It is not a discovery about human nature.
The deeper forms of empathy that sustain clinical practice — interpretive understanding, moral attention, the willingness to be genuinely changed by a patient’s reality — are precisely the skills that cannot be automated. The challenge is not to compete with machines in scripted empathy. It is to teach the forms that no machine can simulate. Those are not so easy to teach.
One model for such teaching comes from hospital chaplaincy. Chaplains understand that listening is not an optional adjunct to care. It is a core competency, a sophisticated skill that must be cultivated, practiced, and refined. Chaplains in training learn through a methodology called Action/Reflection/Action. The chaplaincy student visits a patient, then reconstructs the encounter in exhaustive detail in a verbatim report. Then, together with supervisors and other students, the trainee reflects on what was heard, what was missed, and what in the trainee’s own history shaped their listening. The goal is for the practitioner to attend fully to a patient’s emotional reality, to monitor, in real time, how those emotions affect them, and to critically evaluate their own responses.
Such training takes time. The curriculum is already full. To prioritize this would require de-prioritizing other competencies. But many of those other competencies are for things that the machines do better.
AI’s challenge to the medical profession requires us to think deeply about what it means to care for patients. The machines sound like they care. The question is whether we can help humans care for real.
John D. Lantos, MD is a pediatrician and bioethicist in New Haven, Connecticut. He is working on a book about doctor-patient communication, tentatively titled, “Just Listening.”