Comprehensiveness of Large Language Models in Patient Queries on Gingival and Endodontic Health.

Authors :
Zhang Q
Wu Z
Song J
Luo S
Chai Z
Source :
International dental journal [Int Dent J] 2024 Aug 14. Date of Electronic Publication: 2024 Aug 14.
Publication Year :
2024
Publisher :
Ahead of Print

Abstract

Aim: Given the increasing interest in using large language models (LLMs) for self-diagnosis, this study aimed to evaluate the comprehensiveness of two prominent LLMs, ChatGPT-3.5 and ChatGPT-4, in addressing common queries related to gingival and endodontic health across different language contexts and query types.

Methods: We assembled a set of 33 common real-life questions related to gingival and endodontic healthcare, including 17 common-sense questions and 16 expert questions. Each question was presented to the LLMs in both English and Chinese. Three specialists were invited to evaluate the comprehensiveness of the responses on a five-point Likert scale, where a higher score indicated a higher-quality response.

Results: LLMs performed significantly better in English, with an average score of 4.53, compared to 3.95 in Chinese (Mann-Whitney U test, P < .05). Responses to common-sense questions received higher scores than those to expert questions, with averages of 4.46 and 4.02, respectively (Mann-Whitney U test, P < .05). Among the LLMs, ChatGPT-4 consistently outperformed ChatGPT-3.5, with average scores of 4.45 and 4.03, respectively (Mann-Whitney U test, P < .05).

Conclusions: ChatGPT-4 provides more comprehensive responses than ChatGPT-3.5 for queries related to gingival and endodontic health. Both LLMs perform better in English and on common-sense questions. However, the performance discrepancies across different language contexts and the presence of inaccurate responses suggest that further evaluation and understanding of their limitations are crucial to avoid potential misunderstandings.

Clinical Relevance: This study revealed the performance differences of ChatGPT-3.5 and ChatGPT-4 in handling gingival and endodontic health issues across different language contexts, providing insights into the comprehensiveness and limitations of LLMs in addressing common oral healthcare queries.

Competing Interests: The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Zhaowu Chai reports financial support was provided by Stomatological Hospital of Chongqing Medical University. Zhaowu Chai reports a relationship with Stomatological Hospital of Chongqing Medical University that includes: employment. Zhaowu Chai has patent pending to Zhaowu Chai. If there are other authors, they declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this article.

(Copyright © 2024 The Authors. Published by Elsevier Inc. All rights reserved.)

Details

Language :
English
ISSN :
1875-595X
Database :
MEDLINE
Journal :
International dental journal
Publication Type :
Academic Journal
Accession number :
39147663
Full Text :
https://doi.org/10.1016/j.identj.2024.06.022