Easy to use and validated predictive models to identify beneficiaries experiencing homelessness in Medicaid administrative data

Authors :
Pourat, Nadereh
Yue, Dahai
Chen, Xiao
Zhou, Weihao
O'Masta, Brenna
Source :
Health Services Research. August, 2023, Vol. 58 Issue 4, p882, 12 p.
Publication Year :
2023

Abstract

Objective: To develop easy to use and validated predictive models to identify beneficiaries experiencing homelessness from administrative data.

Data Sources: We pooled enrollment and claims data from enrollees of the California Whole Person Care (WPC) Medicaid demonstration program that coordinated the care of a subset of Medicaid beneficiaries identified as high utilizers in 26 California counties (25 WPC Pilots). We also used public directories of social service and health care facilities.

Study Design: Using WPC Pilot-reported homelessness status, we trained seven supervised learning algorithms with different specifications to identify beneficiaries experiencing homelessness. The list of predictors included address- and claims-based indicators, demographics, health status, health care utilization, and county-level homelessness rate. We then assessed model performance using measures of balanced accuracy (BA), sensitivity, specificity, positive predictive value, negative predictive value, and area under the receiver operating characteristic curve (area under the curve [AUC]).

Data Collection/Extraction Methods: We included 93,656 WPC enrollees from 2017 to 2018, 37,441 of whom had a WPC Pilot-reported homelessness indicator.

Principal Findings: The random forest algorithm with all available indicators had the best performance (87% BA and 0.95 AUC), but a simpler Generalized Linear Model (GLM) also performed well (74% BA and 0.83 AUC). Reducing predictors to the top 20 and top five most important indicators in a GLM model yields only slightly lower performance (86% BA and 0.94 AUC for the top 20 and 86% BA and 0.91 AUC for the top five).

Conclusions: Large samples can be used to accurately predict homelessness in Medicaid administrative data if a validated homelessness indicator for a small subset can be obtained. In the absence of a validated indicator, the likelihood of homelessness can be calculated using county rate of homelessness, address- and claim-based indicators, and beneficiary age using a prediction model presented here. These approaches are needed given the rising prevalence of homelessness and the focus of Medicaid and other payers on addressing homelessness and its outcomes.

KEYWORDS: administrative data, homelessness, machine learning algorithms, Medicaid

What is known on this topic
* Homelessness is a social determinant of health and well-established evidence demonstrates that individuals experiencing homelessness have poor health and high use of health care.
* Addressing social determinants of health is increasingly a goal of public and private payers and providers, but most lack data on the homelessness status of the populations they serve.
* Various methods of identifying homelessness using administrative data have been tried using specific populations and limited data, but their accuracy in determining homelessness is unknown.

What this study adds
* We identify easy and validated predictive models to identify individuals experiencing homelessness using variables available in administrative data.
* We identify the top 20 and top five most important variables in predicting homelessness.
* We offer more advanced and simpler but well-performing logit regression models and the related regression coefficients that could be easily applied to identify homelessness.

1 | INTRODUCTION
Over half a million persons are estimated to be experiencing homelessness in the United States on a given night. (1) Well-established evidence demonstrates that individuals experiencing homelessness [...]
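
The performance measures named in the Study Design paragraph (balanced accuracy, sensitivity, specificity, positive and negative predictive value, and AUC) can be illustrated with a small sketch. This is not the authors' code: the function names, the 0.5 threshold, and the toy labels and scores are all hypothetical, chosen only to show how each measure is defined for a binary homelessness indicator (1 = experiencing homelessness, 0 = not).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ConfusionMetrics:
    sensitivity: float        # TP / (TP + FN)
    specificity: float        # TN / (TN + FP)
    ppv: float                # TP / (TP + FP)
    npv: float                # TN / (TN + FN)
    balanced_accuracy: float  # mean of sensitivity and specificity


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> ConfusionMetrics:
    """Compute threshold-based measures from 0/1 labels and 0/1 predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    return ConfusionMetrics(
        sensitivity=sensitivity,
        specificity=specificity,
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
        balanced_accuracy=(sensitivity + specificity) / 2,
    )


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg), which is
    fine for illustration but not for large samples.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: reported status and model-predicted probabilities.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1, 0.4, 0.05]
toy_predictions = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 cutoff

print(confusion_metrics(toy_labels, toy_predictions))
print(auc(toy_labels, toy_scores))
```

Balanced accuracy averages sensitivity and specificity, so it is less affected than plain accuracy when one class is much larger than the other, whereas AUC summarizes ranking quality across all possible cutoffs rather than at one chosen threshold.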

Details

Language :
English
ISSN :
00179124
Volume :
58
Issue :
4
Database :
Gale General OneFile
Journal :
Health Services Research
Publication Type :
Periodical
Accession number :
edsgcl.760306725
Full Text :
https://doi.org/10.1111/1475-6773.14143