1. The performance of artificial intelligence models in generating responses to general orthodontic questions: ChatGPT vs Google Bard.
- Author
Daraqel, Baraa, Wafaie, Khaled, Mohammed, Hisham, Cao, Li, Mheissen, Samer, Liu, Yang, and Zheng, Leilei
- Abstract
This study aimed to evaluate and compare the performance of 2 artificial intelligence (AI) models, Chat Generative Pretrained Transformer-3.5 (ChatGPT-3.5; OpenAI, San Francisco, Calif) and Google Bidirectional Encoder Representations from Transformers (Google Bard; Bard Experiment, Google, Mountain View, Calif), in terms of response accuracy, completeness, generation time, and response length when answering general orthodontic questions. A team of orthodontic specialists developed a set of 100 questions in 10 orthodontic domains. One author submitted the questions to both ChatGPT and Google Bard. The AI-generated responses from both models were randomly assigned into 2 forms and sent to 5 blinded and independent assessors. The quality of AI-generated responses was evaluated using a newly developed tool for accuracy of information and completeness. In addition, response generation time and length were recorded. The accuracy and completeness of responses were high in both AI models. The median accuracy score was 9 (interquartile range [IQR]: 8-9) for ChatGPT and 8 (IQR: 8-9) for Google Bard (median difference: 1; P < 0.001). The median completeness score was similar in both models, with 8 (IQR: 8-9) for ChatGPT and 8 (IQR: 7-9) for Google Bard. The odds of accuracy and completeness were higher by 31% and 23%, respectively, in ChatGPT than in Google Bard. Google Bard's response generation time was significantly shorter than that of ChatGPT, by 10.4 seconds per question. However, both models were similar in terms of response length. Both ChatGPT- and Google Bard-generated responses to the posed general orthodontic questions were rated with a high level of accuracy and completeness. However, acquiring answers was generally faster using the Google Bard model.
• ChatGPT and Google Bard models generated responses with a high level of accuracy and completeness.
• ChatGPT and Google Bard models demonstrated relative consistency in their performance.
• Response generation was faster with Google Bard.
• ChatGPT and Google Bard models were similar in terms of response length.
• A newly developed accuracy of information index was used in this study.
[ABSTRACT FROM AUTHOR]
- Published
- 2024