1. Assessment of Pathology Domain-Specific Knowledge of ChatGPT and Comparison to Human Performance
- Author
- Wang, Andrew Y., Lin, Sherman, Tran, Christopher, Homer, Robert J., Wilsdon, Dan, Walsh, Joanna C., Goebel, Emily A., Sansano, Irene, Sonawane, Snehal, Cockenpot, Vincent, Mukhopadhyay, Sanjay, Taskin, Toros, Zahra, Nusrat, Cima, Luca, Semerci, Orhan, Ozamrak, Birsen Gizem, Mishra, Pallavi, Vennavalli, Naga Sarika, Chen, Po-Hsuan Cameron, and Cecchini, Matthew J.
- Subjects
- Artificial intelligence, Decision support software, Technology application, Clinical pathology -- Technology application, Artificial intelligence -- Usage, Decision support systems -- Usage
- Abstract
- Context.-Artificial intelligence algorithms hold the potential to fundamentally change many aspects of society. Application of these tools, including the publicly available ChatGPT, has demonstrated impressive domain-specific knowledge in many areas, including medicine.
  Objectives.-To assess the level of pathology domain-specific knowledge of ChatGPT using different underlying large language models, GPT-3.5 and the updated GPT-4.
  Design.-An international group of pathologists (n = 15) was recruited to generate pathology-specific questions at a level similar to those that could appear on licensing (board) examinations. The questions (n = 15) were answered by GPT-3.5, GPT-4, and a staff pathologist who had recently passed their Canadian pathology licensing examinations. Participants were instructed to score the answers on a 5-point scale and to predict which answer was written by ChatGPT.
  Results.-GPT-3.5 performed at a level similar to that of the staff pathologist, while GPT-4 outperformed both. The overall scores for both GPT-3.5 and GPT-4 were within the range of meeting expectations for a trainee writing licensing examinations. In all but one question, the reviewers were able to correctly identify the answers generated by GPT-3.5.
  Conclusions.-By demonstrating the ability of ChatGPT to answer pathology-specific questions at a level similar to (GPT-3.5) or exceeding (GPT-4) that of a trained pathologist, this study highlights the potential of large language models to be transformative in this space. In the future, more advanced iterations of these algorithms with increased domain-specific knowledge may have the potential to assist pathologists and enhance pathology resident training. (Arch Pathol Lab Med. 2024;148:1152-1158; doi: 10.5858/arpa.2023-0296-OA)
  Rapid technological advancements in molecular pathology and a continuously growing body of knowledge have resulted in an unprecedented increase in diagnostic complexity for pathologists. (1) This presents a challenge for [...]
- Published
- 2024