
Evaluating the Effectiveness of Artificial Intelligence-powered Large Language Models Application in Disseminating Appropriate and Readable Health Information in Urology.

Authors :: Davis, Ryan
Eppler, Michael
Ayo-Ajibola, Oluwatobiloba
Loh-Doyle, Jeffrey C.
Nabhani, Jamal
Samplaski, Mary
Gill, Inderbir
Cacciamani, Giovanni E.
Source :: Journal of Urology; Oct2023, Vol. 210 Issue 4, p688-694, 7p
Publication Year :: 2023
Abstract: Purpose: The Internet is a ubiquitous source of medical information, and natural language processors are gaining popularity as alternatives to traditional search engines. However, suitability of their generated content for patients is not well understood. We aimed to evaluate the appropriateness and readability of natural language processor-generated responses to urology-related medical inquiries. Materials and Methods: Eighteen patient questions were developed based on Google Trends and were used as inputs in ChatGPT. Three categories were assessed: oncologic, benign, and emergency. Questions in each category were either treatment or sign/symptom-related questions. Three native English-speaking Board-Certified urologists independently assessed appropriateness of ChatGPT outputs for patient counseling using accuracy, comprehensiveness, and clarity as proxies for appropriateness. Readability was assessed using the Flesch Reading Ease and Flesch-Kincaid Reading Grade Level formulas. Additional measures were created based on validated tools and assessed by 3 independent reviewers. Results: Fourteen of 18 (77.8%) responses were deemed appropriate, with clarity having the most 4 and 5 scores (P < .01). There was no significant difference in appropriateness of the responses between treatments and symptoms or between different categories of conditions. The most common reason from urologists for low scores was responses lacking information, sometimes vital information. The mean (SD) Flesch Reading Ease score was 35.5 (SD=10.2) and the mean Flesch-Kincaid Reading Grade Level score was 13.5 (1.74). Additional quality assessment scores showed no significant differences between different categories of conditions. Conclusions: Despite impressive capabilities, natural language processors have limitations as sources of medical information. Refinement is crucial before adoption for this purpose. [ABSTRACT FROM AUTHOR]
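
Note on the readability measures: the abstract names the Flesch Reading Ease and Flesch-Kincaid Grade Level formulas but does not describe how the authors computed them. The following is a minimal Python sketch of the standard published formulas, not the authors' tooling. The function names, the vowel-group syllable counter, and the regex-based sentence splitter are illustrative assumptions; dedicated readability tools count syllables and sentences more carefully and may give somewhat different scores.

```python
import re


def flesch_reading_ease(words, sentences, syllables):
    """Standard Flesch Reading Ease formula; higher means easier to read."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Standard Flesch-Kincaid Grade Level formula (approximate US school grade)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def count_syllables(word):
    """Rough heuristic (an assumption): one syllable per run of vowels, minimum 1."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def text_counts(text):
    """Return (words, sentences, syllables) for a text using simple regex rules."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables


if __name__ == "__main__":
    sample = "Kidney stones can cause severe pain. Drink plenty of water."
    w, s, syl = text_counts(sample)
    print(round(flesch_reading_ease(w, s, syl), 1))
    print(round(flesch_kincaid_grade(w, s, syl), 1))
```

Both functions assume non-zero word and sentence counts. On the conventional interpretation of these scales, a Reading Ease score near 35 and a grade level near 13.5, as reported in the abstract, correspond to college-level text.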

Subjects :: LANGUAGE models
READABILITY (Literary style)
CHATGPT
NATURAL languages
UROLOGY
SEARCH engines
TUMOR lysis syndrome

Details

Language :: English
ISSN :: 00225347
Volume :: 210
Issue :: 4
Database :: Supplemental Index
Journal :: Journal of Urology
Publication Type :: Academic Journal
Accession number :: 172029663
Full Text :: https://doi.org/10.1097/JU.0000000000003615
