1. Accuracy and consistency of ChatGPT-3.5 and - 4 in providing differential diagnoses in oral and maxillofacial diseases: a comparative diagnostic performance analysis.
- Author
-
Tomo S, Lechien JR, Bueno HS, Cantieri-Debortoli DF, and Simonato LE
- Subjects
- Humans, Diagnosis, Differential, Male, Female, Clinical Competence, Mouth Diseases diagnosis
- Abstract
Objective: To investigate the performance of ChatGPT in the differential diagnosis of oral and maxillofacial diseases., Methods: Thirty-seven oral and maxillofacial lesions findings were presented to ChatGPT-3.5 and - 4, 18 dental surgeons trained in oral medicine/pathology (OMP), 23 general dental surgeons (DDS), and 16 dental students (DS) for differential diagnosis. Additionally, a group of 15 general dentists was asked to describe 11 cases to ChatGPT versions. The ChatGPT-3.5, -4, and human primary and alternative diagnoses were rated by 2 independent investigators with a 4 Likert-Scale. The consistency of ChatGPT-3.5 and - 4 was evaluated with regenerated inputs., Results: Moderate consistency of outputs was observed for ChatGPT-3.5 and - 4 to provide primary (κ = 0.532 and κ = 0.533 respectively) and alternative (κ = 0.337 and κ = 0.367 respectively) hypotheses. The mean of correct diagnoses was 64.86% for ChatGPT-3.5, 80.18% for ChatGPT-4, 86.64% for OMP, 24.32% for DDS, and 16.67% for DS. The mean correct primary hypothesis rates were 45.95% for ChatGPT-3.5, 61.80% for ChatGPT-4, 82.28% for OMP, 22.72% for DDS, and 15.77% for DS. The mean correct diagnosis rate for ChatGPT-3.5 with standard descriptions was 64.86%, compared to 45.95% with participants' descriptions. For ChatGPT-4, the mean was 80.18% with standard descriptions and 61.80% with participant descriptions., Conclusion: ChatGPT-4 demonstrates an accuracy comparable to specialists to provide differential diagnosis for oral and maxillofacial diseases. Consistency of ChatGPT to provide diagnostic hypotheses for oral diseases cases is moderate, representing a weakness for clinical application. The quality of case documentation and descriptions impacts significantly on the performance of ChatGPT., Clinical Relevance: General dentists, dental students and specialists in oral medicine and pathology may benefit from ChatGPT-4 as an auxiliary method to define differential diagnosis for oral and maxillofacial lesions, but its accuracy is dependent on precise case descriptions., (© 2024. The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.)
- Published
- 2024
- Full Text
- View/download PDF