Preliminary assessment of automated radiology report generation with generative pre-trained transformers: comparing results to radiologist-generated reports.

Authors :: Nakaura T
Yoshida N
Kobayashi N
Shiraishi K
Nagayama Y
Uetani H
Kidoh M
Hokamura M
Funama Y
Hirai T
Source :: Japanese journal of radiology [Jpn J Radiol] 2024 Feb; Vol. 42 (2), pp. 190-200. Date of Electronic Publication: 2023 Sep 15.
Publication Year :: 2024
Abstract: Purpose: In this preliminary study, we aimed to evaluate the potential of the generative pre-trained transformer (GPT) series for generating radiology reports from concise imaging findings and compare its performance with radiologist-generated reports.
Methods: This retrospective study involved 28 patients who underwent computed tomography (CT) scans and had a diagnosed disease with typical imaging findings. Radiology reports were generated using GPT-2, GPT-3.5, and GPT-4 based on the patient's age, gender, disease site, and imaging findings. We calculated the top-1, top-5 accuracy, and mean average precision (MAP) of differential diagnoses for GPT-2, GPT-3.5, GPT-4, and radiologists. Two board-certified radiologists evaluated the grammar and readability, image findings, impression, differential diagnosis, and overall quality of all reports using a 4-point scale.
Results: Top-1 and Top-5 accuracies for the different diagnoses were highest for radiologists, followed by GPT-4, GPT-3.5, and GPT-2, in that order (Top-1: 1.00, 0.54, 0.54, and 0.21, respectively; Top-5: 1.00, 0.96, 0.89, and 0.54, respectively). There were no significant differences in qualitative scores about grammar and readability, image findings, and overall quality between radiologists and GPT-3.5 or GPT-4 (p > 0.05). However, qualitative scores of the GPT series in impression and differential diagnosis scores were significantly lower than those of radiologists (p < 0.05).
Conclusions: Our preliminary study suggests that GPT-3.5 and GPT-4 have the possibility to generate radiology reports with high readability and reasonable image findings from very short keywords; however, concerns persist regarding the accuracy of impressions and differential diagnoses, thereby requiring verification by radiologists.
(© 2023. The Author(s).)

Subjects :: Humans
Retrospective Studies
Radiography
Tomography, X-Ray Computed
Radiologists
Radiology

Details

Language :: English
ISSN :: 1867-108X
Volume :: 42
Issue :: 2
Database :: MEDLINE
Journal :: Japanese journal of radiology
Publication Type :: Academic Journal
Accession number :: 37713022
Full Text :: https://doi.org/10.1007/s11604-023-01487-y
