Evaluation of artificial intelligence-based patient education models for irritable bowel syndrome

Research output: Contribution to journal › Article › peer-review

Abstract

Background: Irritable bowel syndrome (IBS) is a common functional gastrointestinal disorder with a significant psychosocial burden. Despite medical advancements, patient education on IBS remains inadequate. This study compared two large language models (LLMs), ChatGPT-4 and Gemini-1, on their performance in answering IBS-related patient queries.
Methods: Thirty-nine IBS-related frequently asked questions (FAQs) from IBS organizations and hospital websites were categorized into six domains: general understanding, symptoms and diagnosis, causes, dietary considerations, treatment, and lifestyle factors. Responses from ChatGPT-4 and Gemini-1 were evaluated by two independent gastroenterologists for comprehensiveness and accuracy, with a third reviewer resolving disagreements. Readability was measured using five standardized indices: Flesch Reading Ease (FRE), Simple Measure of Gobbledygook (SMOG), Gunning Fog Index (GFI), Automated Readability Index (ARI) and Average Reading Level Consensus (ARC). Empathy was rated on a 4-point Likert scale by three reviewers.
Results: Gemini produced comprehensive and accurate answers for 94.9% (37/39) of questions, with two rated as mixed (vague/outdated). ChatGPT achieved 89.7% (35/39) comprehensive responses, with four rated mixed. Domain-wise, both models performed best in “symptoms and diagnosis” and “treatment”, while mixed responses were most frequent in “general understanding” and “lifestyle”. The difference in comprehensiveness between the two models was not significant (p = 0.67). Readability analysis showed that both LLMs generated difficult-to-read content: Gemini’s FRE score was 35.83 ± 3.31 vs. ChatGPT’s 32.33 ± 5.57 (p = 0.21), corresponding to college-level proficiency. ChatGPT’s responses were more empathetic, with all of them rated moderately empathetic, whereas most of Gemini’s responses (66.7%) were rated minimally empathetic.
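
The abstract does not state which statistical test produced p = 0.67 for the comprehensiveness comparison. As an illustration only, the sketch below assumes a two-sided Fisher's exact test on the reported counts (37 comprehensive and 2 mixed for Gemini, 35 comprehensive and 4 mixed for ChatGPT). The function name `fisher_exact_two_sided` is hypothetical and the choice of test is an assumption, not the authors' documented method.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumption: this test is used only for illustration; the abstract
    does not name the test behind its reported p-value.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    # The small relative tolerance guards against float round-off in ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    # Rows: Gemini, ChatGPT. Columns: comprehensive, mixed.
    p = fisher_exact_two_sided(37, 2, 35, 4)
    print(f"{p:.3f}")  # prints 0.675
```

Under this assumption the result is consistent with the reported value of 0.67. With counts this small, an exact test is a common choice over a chi-square approximation.
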
Conclusion: While ChatGPT and Gemini provided extensive information, their limitations—such as complex language and occasional inaccuracies—must be addressed. Future improvements should focus on enhancing readability, contextual relevance and accuracy to better meet the diverse needs of patients and clinicians.
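
To show what the reported FRE scores mean, the following minimal sketch implements the standard Flesch Reading Ease formula, 206.835 − 1.015 × (words per sentence) − 84.6 × (syllables per word), and maps a score to Flesch's difficulty bands. The function names are hypothetical, and the study's actual readability tooling is not described in the abstract. Real calculators also need a syllable counter, which is left out here.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease from raw counts (higher means easier to read)."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    words_per_sentence = words / sentences
    syllables_per_word = syllables / words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def fre_band(score: float) -> str:
    """Map a score to Flesch's conventional difficulty bands."""
    if score >= 90:
        return "very easy"
    if score >= 80:
        return "easy"
    if score >= 70:
        return "fairly easy"
    if score >= 60:
        return "standard"
    if score >= 50:
        return "fairly difficult"
    if score >= 30:
        return "difficult"  # commonly equated with college-level text
    return "very difficult"


if __name__ == "__main__":
    # Illustrative counts (not from the study): 120 words,
    # 6 sentences, 210 syllables.
    score = flesch_reading_ease(120, 6, 210)
    print(f"{score:.1f}", fre_band(score))  # prints 38.5 difficult
    # The mean scores reported in the abstract fall in the same band.
    print(fre_band(35.83), fre_band(32.33))  # prints difficult difficult
```

Both reported means (35.83 and 32.33) fall in the 30 to 50 band, which matches the abstract's description of college-level text.
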

Original language: English
Journal: Indian Journal of Gastroenterology
Publication status: Accepted/In press - 2025

All Science Journal Classification (ASJC) codes

  • Gastroenterology
