Comparison of Prompt Engineering and Fine-Tuning Strategies in Large Language Models in the Classification of Clinical Notes.

Authors :: Zhang X
Talukdar N
Vemulapalli S
Ahn S
Wang J
Meng H
Murtaza SMB
Leshchiner D
Dave AA
Joseph DF
Witteveen-Lane M
Chesla D
Zhou J
Chen B
Source :: MedRxiv : the preprint server for health sciences [medRxiv] 2024 Feb 08. Date of Electronic Publication: 2024 Feb 08.
Publication Year :: 2024
Abstract: Emerging large language models (LLMs) are being actively evaluated in various fields, including healthcare. Most studies have focused on established benchmarks and standard parameters; however, the variation and impact of prompt engineering and fine-tuning strategies have not been fully explored. This study benchmarks GPT-3.5 Turbo, GPT-4, and Llama-7B against BERT models and medical fellows' annotations in identifying patients with metastatic cancer from discharge summaries. Results revealed that clear, concise prompts incorporating reasoning steps significantly enhanced performance. GPT-4 exhibited superior performance among all models. Notably, one-shot learning and fine-tuning provided no incremental benefit. The model's accuracy was sustained even when keywords for metastatic cancer were removed or when half of the input tokens were randomly discarded. These findings underscore GPT-4's potential to substitute for specialized models, such as PubMedBERT, through strategic prompt engineering, and suggest opportunities to improve open-source models, which are better suited for use in clinical settings.
Competing Interests: The authors declare no competing interests.

Details

Language :: English
Database :: MEDLINE
Journal :: MedRxiv : the preprint server for health sciences
Accession number :: 38370673
Full Text :: https://doi.org/10.1101/2024.02.07.24302444
