
Evaluating ChatGPT-4 for the Interpretation of Images from Several Diagnostic Techniques in Gastroenterology.

Authors :
Saraiva MM
Ribeiro T
Agudo B
Afonso J
Mendes F
Martins M
Cardoso P
Mota J
Almeida MJ
Costa A
Gonzalez Haba Ruiz M
Widmer J
Moura E
Javed A
Manzione T
Nadal S
Barroso LF
de Parades V
Ferreira J
Macedo G
Source :
Journal of clinical medicine [J Clin Med] 2025 Jan 17; Vol. 14 (2). Date of Electronic Publication: 2025 Jan 17.
Publication Year :
2025

Abstract

Background: Several artificial intelligence systems based on large language models (LLMs) have been commercially developed, with growing interest in applying them to clinical questions. Recent versions include image analysis capabilities, but their performance in gastroenterology remains untested. This study assesses ChatGPT-4's performance in interpreting gastroenterology images. Methods: A total of 740 images from five procedures were included: capsule endoscopy (CE), device-assisted enteroscopy (DAE), endoscopic ultrasound (EUS), digital single-operator cholangioscopy (DSOC), and high-resolution anoscopy (HRA). The images were analyzed by ChatGPT-4 using a predefined prompt for each procedure. ChatGPT-4 predictions were compared to gold standard diagnoses. Statistical analyses included accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and area under the curve (AUC). Results: For CE, ChatGPT-4 demonstrated accuracies ranging from 50.0% to 90.0%, with AUCs of 0.50-0.90. For DAE, the model demonstrated an accuracy of 67.0% (AUC 0.670). For EUS, the system showed AUCs of 0.488 and 0.550 for the differentiation between pancreatic cystic and solid lesions, respectively. For DSOC, the LLM differentiated benign from malignant biliary strictures with an AUC of 0.550. For HRA, ChatGPT-4 showed an overall accuracy between 47.5% and 67.5%. Conclusions: ChatGPT-4 demonstrated suboptimal diagnostic accuracies for image interpretation across several gastroenterology techniques, highlighting the need for continuous improvement before clinical adoption.
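
The statistical measures named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function name `binary_metrics` and the example labels are hypothetical, and the data are invented for illustration rather than taken from the study. The sketch assumes binary labels (1 = lesion present, 0 = absent) and hard predictions with no confidence scores. Under that assumption the ROC curve has a single operating point, so the AUC equals the mean of sensitivity and specificity.

```python
import math


def binary_metrics(y_true, y_pred):
    """Compute diagnostic metrics from binary gold-standard labels and predictions.

    Hypothetical helper for illustration; labels are 1 (positive) or 0 (negative).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Return NaN instead of raising when a denominator is zero.
        return num / den if den else math.nan

    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        # With hard labels only, the ROC curve has one operating point,
        # so its area is the mean of sensitivity and specificity.
        "auc": (sensitivity + specificity) / 2,
    }


if __name__ == "__main__":
    # Invented example data, not from the study.
    gold = [1, 1, 0, 0, 0]
    predicted = [1, 0, 0, 0, 1]
    for name, value in binary_metrics(gold, predicted).items():
        print(f"{name}: {value:.3f}")
```

When the positive and negative classes are the same size, accuracy and this single-point AUC are equal, which is consistent with paired figures such as an accuracy of 67.0% alongside an AUC of 0.670.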

Details

Language :
English
ISSN :
2077-0383
Volume :
14
Issue :
2
Database :
MEDLINE
Journal :
Journal of clinical medicine
Publication Type :
Academic Journal
Accession number :
39860582
Full Text :
https://doi.org/10.3390/jcm14020572