Comparison of the Performance of Artificial Intelligence Versus Medical Professionals in the Polish Final Medical Examination.

Authors :
Jaworski A
Jasiński D
Jaworski W
Hop A
Janek A
Sławińska B
Konieczniak L
Rzepka M
Jung M
Sysło O
Jarząbek V
Błecha Z
Haraziński K
Jasińska N
Source :
Cureus [Cureus] 2024 Aug 02; Vol. 16 (8), pp. e66011. Date of Electronic Publication: 2024 Aug 02 (Print Publication: 2024).
Publication Year :
2024

Abstract

Background: The rapid development of artificial intelligence (AI) technologies such as OpenAI's Generative Pretrained Transformer (GPT), particularly ChatGPT, has shown promising applications in various fields, including medicine. This study evaluates ChatGPT's performance on the Polish Final Medical Examination (LEK), comparing its efficacy to that of human test-takers.

Methods: The study analyzed ChatGPT's ability to answer 196 multiple-choice questions from the spring 2021 LEK. Questions were categorized into "clinical cases" and "other" general medical knowledge, and then divided according to medical fields. Two versions of ChatGPT (3.5 and 4.0) were tested. Statistical analyses, including Pearson's χ² test and the Mann-Whitney U test, were conducted to compare the AI's performance and confidence levels.

Results: ChatGPT 3.5 correctly answered 50.51% of the questions, while ChatGPT 4.0 answered 77.55% correctly, surpassing the 56% passing threshold. Version 3.5 showed significantly higher confidence in correct answers, whereas version 4.0 maintained consistent confidence regardless of answer accuracy. No significant differences in performance were observed across different medical fields.

Conclusions: ChatGPT 4.0 demonstrated the ability to pass the LEK, indicating substantial potential for AI in medical education and assessment. Future improvements in AI models, such as the anticipated ChatGPT 5.0, may further enhance performance, potentially equaling or surpassing human test-takers.

Competing Interests: Human subjects: All authors have confirmed that this study did not involve human participants or tissue. Animal subjects: All authors have confirmed that this study did not involve animal subjects or tissue. Conflicts of interest: In compliance with the ICMJE uniform disclosure form, all authors declare the following: Payment/services info: All authors have declared that no financial support was received from any organization for the submitted work. Financial relationships: All authors have declared that they have no financial relationships at present or within the previous three years with any organizations that might have an interest in the submitted work. Other relationships: All authors have declared that there are no other relationships or activities that could appear to have influenced the submitted work.

(Copyright © 2024, Jaworski et al.)
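
The percentages reported in the abstract can be checked with a little arithmetic. The sketch below back-computes the number of correct answers implied by 50.51% and 77.55% of 196 questions, compares them with the 56% threshold, and runs an illustrative Pearson χ² test on the resulting 2×2 table. The answer counts are derived here, not taken from the paper, and the χ² comparison of the two versions is an assumption for illustration only: it is not necessarily the analysis the authors performed, and it treats the two runs as independent samples even though both versions answered the same questions. All function names are hypothetical.

```python
import math

TOTAL_QUESTIONS = 196   # from the abstract
PASS_THRESHOLD = 0.56   # from the abstract

# Accuracies as reported in the abstract.
REPORTED_ACCURACY = {"ChatGPT 3.5": 0.5051, "ChatGPT 4.0": 0.7755}


def implied_correct(accuracy, total=TOTAL_QUESTIONS):
    """Back-compute the whole number of correct answers (an assumption)."""
    return round(accuracy * total)


def min_correct_to_pass(total=TOTAL_QUESTIONS, threshold=PASS_THRESHOLD):
    """Smallest number of correct answers that reaches the threshold."""
    return math.ceil(threshold * total)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction)
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_one_df(chi2):
    """Upper-tail p-value of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


correct = {name: implied_correct(acc) for name, acc in REPORTED_ACCURACY.items()}
needed = min_correct_to_pass()

# Rows: model version; columns: correct, incorrect.
chi2 = chi_square_2x2(
    correct["ChatGPT 3.5"], TOTAL_QUESTIONS - correct["ChatGPT 3.5"],
    correct["ChatGPT 4.0"], TOTAL_QUESTIONS - correct["ChatGPT 4.0"],
)
p_value = p_value_one_df(chi2)

if __name__ == "__main__":
    for name, k in correct.items():
        status = "pass" if k >= needed else "fail"
        print(f"{name}: {k}/{TOTAL_QUESTIONS} correct ({status}, need {needed})")
    print(f"chi-square = {chi2:.2f}, p = {p_value:.2e}")
```

With per-question results available, a paired test such as McNemar's would fit this comparison better than the independent-samples χ² used in this sketch.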

Details

Language :
English
ISSN :
2168-8184
Volume :
16
Issue :
8
Database :
MEDLINE
Journal :
Cureus
Publication Type :
Academic Journal
Accession number :
39221376
Full Text :
https://doi.org/10.7759/cureus.66011