On the development and validation of large language model-based classifiers for identifying social determinants of health.

Authors :
Gabriel RA
Litake O
Simpson S
Burton BN
Waterman RS
Macias AA
Source :
Proceedings of the National Academy of Sciences of the United States of America [Proc Natl Acad Sci U S A] 2024 Sep 24; Vol. 121 (39), pp. e2320716121. Date of Electronic Publication: 2024 Sep 16.
Publication Year :
2024

Abstract

The assessment of social determinants of health (SDoH) within healthcare systems is crucial for comprehensive patient care and addressing health disparities. Current challenges arise from the limited inclusion of structured SDoH information within electronic health record (EHR) systems, often due to the lack of standardized diagnosis codes. This study delves into the transformative potential of large language models (LLMs) to overcome these challenges. LLM-based classifiers, using Bidirectional Encoder Representations from Transformers (BERT) and A Robustly Optimized BERT Pretraining Approach (RoBERTa), were developed for SDoH concepts, including homelessness, food insecurity, and domestic violence, using synthetic training datasets generated by generative pre-trained transformers combined with authentic clinical notes. Models were then validated on separate datasets: Medical Information Mart for Intensive Care-III and our institutional EHR data. When training the model with a combination of synthetic and authentic notes, validation on our institutional dataset yielded an area under the receiver operating characteristics curve of 0.78 for detecting homelessness, 0.72 for detecting food insecurity, and 0.83 for detecting domestic violence. This study underscores the potential of LLMs in extracting SDoH information from clinical text. Automated detection of SDoH may be instrumental for healthcare providers in identifying at-risk patients, guiding targeted interventions, and contributing to population health initiatives aimed at mitigating disparities.

Competing Interests: The authors declare no competing interest.

Details

Language :
English
ISSN :
1091-6490
Volume :
121
Issue :
39
Database :
MEDLINE
Journal :
Proceedings of the National Academy of Sciences of the United States of America
Publication Type :
Academic Journal
Accession number :
39284061
Full Text :
https://doi.org/10.1073/pnas.2320716121