1. Automated HEART score determination via ChatGPT: Honing a framework for iterative prompt development
- Author
Conrad W. Safranek, Thomas Huang, Donald S. Wright, Catherine X. Wright, Vimig Socrates, Rohit B. Sangal, Mark Iscoe, David Chartash, and R. Andrew Taylor
- Subjects
artificial intelligence in medicine, ChatGPT, clinical decision support systems, clinical note analysis, emergency department risk algorithms, HEART score, Medical emergencies. Critical care. Intensive care. First aid, RC86-88.9
- Abstract
Objectives: This study presents a design framework to enhance the accuracy with which large language models (LLMs), like ChatGPT, can extract insights from clinical notes. We highlight this framework via prompt refinement for the automated determination of HEART (History, ECG, Age, Risk factors, Troponin risk algorithm) scores in chest pain evaluation.
Methods: We developed a pipeline for LLM prompt testing, employing stochastic repeat testing and quantifying response errors relative to physician assessment. We evaluated the pipeline for automated HEART score determination across a limited set of 24 synthetic clinical notes representing four simulated patients. To assess whether iterative prompt design could improve the LLMs’ ability to extract complex clinical concepts and apply rule‐based logic to translate them to HEART subscores, we monitored diagnostic performance during prompt iteration.
Results: Validation included three iterative rounds of prompt improvement for three HEART subscores, with 25 repeat trials, totaling 1200 queries each for GPT‐3.5 and GPT‐4. For both LLMs, the rate of responses with erroneous, non‐numerical subscore answers decreased from the initial to the final prompt design. Accuracy of numerical responses for HEART subscores (discrete 0–2 point scale) improved for GPT‐4 from the initial to the final prompt iteration, with the mean error decreasing from 0.16 to 0.10 (95% confidence interval: 0.07–0.14) points.
Conclusion: We established a framework for iterative prompt design in the clinical space. Although the results indicate potential for integrating LLMs in structured clinical note analysis, translation to real, large‐scale clinical data with appropriate data privacy safeguards is needed.
- Published
- 2024