On the development and validation of large language model-based classifiers for identifying social determinants of health.
- Source :
- Proceedings of the National Academy of Sciences of the United States of America [Proc Natl Acad Sci U S A] 2024 Sep 24; Vol. 121 (39), pp. e2320716121. Date of Electronic Publication: 2024 Sep 16.
- Publication Year :
- 2024
Abstract
- The assessment of social determinants of health (SDoH) within healthcare systems is crucial for comprehensive patient care and addressing health disparities. Current challenges arise from the limited inclusion of structured SDoH information within electronic health record (EHR) systems, often due to the lack of standardized diagnosis codes. This study delves into the transformative potential of large language models (LLMs) to overcome these challenges. LLM-based classifiers, using Bidirectional Encoder Representations from Transformers (BERT) and A Robustly Optimized BERT Pretraining Approach (RoBERTa), were developed for SDoH concepts, including homelessness, food insecurity, and domestic violence, using synthetic training datasets generated by generative pre-trained transformers combined with authentic clinical notes. Models were then validated on separate datasets: Medical Information Mart for Intensive Care-III and our institutional EHR data. When training the model with a combination of synthetic and authentic notes, validation on our institutional dataset yielded an area under the receiver operating characteristics curve of 0.78 for detecting homelessness, 0.72 for detecting food insecurity, and 0.83 for detecting domestic violence. This study underscores the potential of LLMs in extracting SDoH information from clinical text. Automated detection of SDoH may be instrumental for healthcare providers in identifying at-risk patients, guiding targeted interventions, and contributing to population health initiatives aimed at mitigating disparities.
- Competing Interests: The authors declare no competing interest.
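The abstract reports classifier performance as area under the receiver operating characteristic curve (AUROC). As a minimal sketch of what that metric measures, and not the authors' evaluation code, the Python below computes AUROC as the probability that a randomly chosen positive note receives a higher score than a randomly chosen negative note, with ties counted as one half. The function name `auroc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 1 for a positive note (e.g. SDoH concept present), 0 otherwise.
    scores: classifier scores, where higher means "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data for illustration only (not from the study).
toy_labels = [1, 0, 1, 0, 1, 0]
toy_scores = [0.9, 0.4, 0.35, 0.2, 0.8, 0.7]

if __name__ == "__main__":
    print(f"AUROC on toy data: {auroc(toy_labels, toy_scores):.2f}")
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the reported 0.72 to 0.83 range. The pairwise loop is quadratic in the number of notes, so a rank-based library routine would be the usual choice for datasets of realistic size.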
Details
- Language :
- English
- ISSN :
- 1091-6490
- Volume :
- 121
- Issue :
- 39
- Database :
- MEDLINE
- Journal :
- Proceedings of the National Academy of Sciences of the United States of America
- Publication Type :
- Academic Journal
- Accession number :
- 39284061
- Full Text :
- https://doi.org/10.1073/pnas.2320716121