1. Large language models (LLMs) in the evaluation of emergency radiology reports: performance of ChatGPT-4, Perplexity, and Bard
- Author
Infante, A., Gaudino, Simona, Orsini, Federico, Del Ciello, Annemilia, Gullì, C., Merlino, Biagio, Natale, Luigi, Iezzi, Roberto, Sala, Evis, Gaudino, S. (ORCID:0000-0003-1681-4343), Orsini, F., Del Ciello, A., Merlino, B. (ORCID:0000-0003-1104-3463), Natale, L. (ORCID:0000-0002-7949-5119), Iezzi, R. (ORCID:0000-0002-2791-481X), and Sala, E.
- Abstract
Large language models (LLMs), especially those based on the Generative Pre-trained Transformer (GPT) architecture, have become widely popular and have been applied in various fields due to their ability to provide written responses to a diverse range of queries swiftly and accurately. LLMs have demonstrated a transformative and potentially revolutionary capacity in multiple medical subfields, including radiology [1]. A promising utilisation of these models is to streamline free-text radiology reports into concise or structured formats [2,3], thereby enhancing the accessibility and organisation of extensive information and potentially facilitating communication among medical professionals. Furthermore, incorporating automated radiological structured reporting systems could enhance clinical procedures by standardising language across institutions, promoting effective communication among healthcare experts, and improving the efficiency of data extraction for research purposes. The present authors share their preliminary results with three LLMs, evaluating their accuracy in recognising and extracting emergency data from a human-generated emergency radiology report.
- Published
- 2024