
Large language models: a new frontier in paediatric cataract patient education.

Authors :
Dihan, Qais
Chauhan, Muhammad Z.
Eleiwa, Taher K.
Brown, Andrew D.
Hassan, Amr K.
Khodeiry, Mohamed M.
Elsheikh, Reem H.
Oke, Isdin
Nihalani, Bharti R.
VanderVeen, Deborah K.
Sallam, Ahmed B.
Elhusseiny, Abdelrahman M.
Source :
British Journal of Ophthalmology; Oct2024, Vol. 108 Issue 10, p1470-1476, 16p
Publication Year :
2024

Abstract

Background/aims This was a cross-sectional comparative study. We evaluated the ability of three large language models (LLMs) (ChatGPT-3.5, ChatGPT-4, and Google Bard) to generate novel patient education materials (PEMs) and improve the readability of existing PEMs on paediatric cataract.

Methods We compared LLMs' responses to three prompts. Prompt A requested they write a handout on paediatric cataract that was 'easily understandable by an average American.' Prompt B modified prompt A and requested the handout be written at a 'sixth-grade reading level, using the Simple Measure of Gobbledygook (SMOG) readability formula.' Prompt C requested they rewrite existing PEMs on paediatric cataract 'to a sixth-grade reading level using the SMOG readability formula'. Responses were compared on their quality (DISCERN; 1 (low quality) to 5 (high quality)), understandability and actionability (Patient Education Materials Assessment Tool (≥70%: understandable, ≥70%: actionable)), accuracy (Likert misinformation; 1 (no misinformation) to 5 (high misinformation)) and readability (SMOG, Flesch-Kincaid Grade Level (FKGL); grade level <7: highly readable).

Results All LLM-generated responses were of high quality (median DISCERN ≥4), understandable (≥70%), and accurate (Likert=1). None of the LLM-generated responses were actionable (<70%). ChatGPT-3.5 and ChatGPT-4 prompt B responses were more readable than prompt A responses (p<0.001). ChatGPT-4 generated more readable responses (lower SMOG and FKGL scores; 5.59±0.5 and 4.31±0.7, respectively) than the other two LLMs (p<0.001) and consistently rewrote existing PEMs to or below the specified sixth-grade reading level (SMOG: 5.14±0.3).

Conclusion LLMs, particularly ChatGPT-4, proved valuable in generating high-quality, readable, accurate PEMs and in improving the readability of existing materials on paediatric cataract. [ABSTRACT FROM AUTHOR]
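
For readers unfamiliar with the two readability measures named in the abstract, the following is a minimal sketch of the standard published SMOG and FKGL formulas. It is not the tool used in the study, which this record does not name. The function names and the vowel-group syllable counter are assumptions made for illustration; dedicated readability software counts syllables more carefully, so scores from this sketch may differ from reported values.

```python
import math
import re


def count_syllables(word: str) -> int:
    """Hypothetical heuristic: count vowel groups, discounting a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def split_text(text: str):
    """Split text into sentences (on . ! ?) and alphabetic words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    return sentences, words


def smog_grade(text: str) -> float:
    """SMOG grade: 1.0430 * sqrt(polysyllables * 30 / sentences) + 3.1291."""
    sentences, words = split_text(text)
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    return 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291


def fkgl(text: str) -> float:
    """Flesch-Kincaid Grade Level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59."""
    sentences, words = split_text(text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    sample = "A cataract makes the lens of the eye cloudy. Surgery can help your child see."
    print(f"SMOG: {smog_grade(sample):.2f}  FKGL: {fkgl(sample):.2f}")
```

Under the abstract's threshold, a text scoring below grade 7 on these measures would count as highly readable. SMOG was originally defined on 30-sentence samples, so the `30 / sentences` factor is a normalisation and scores for very short texts should be read with caution.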

Details

Language :
English
ISSN :
00071161
Volume :
108
Issue :
10
Database :
Complementary Index
Journal :
British Journal of Ophthalmology
Publication Type :
Academic Journal
Accession number :
180483037
Full Text :
https://doi.org/10.1136/bjo-2024-325252