Duarte, Martha, Salamanca, Mayra, Gonzalez, Juan M., Roman Laporte, Roberto, Gattamorta, Karina, Lopez Martinez, Fernando Enrique, Clochesy, John, and Rincon Acuna, Juan Carlos
Depression is recognized as a significant public health issue in the United States. The National Survey on Drug Use and Health reports that 21.0 million adults aged 18 or older had major depressive disorder in 2020, including 14.8 million who experienced a major depressive episode with severe impairment. This study aims to predict positive Patient Health Questionnaire-2 (PHQ-2) outcomes among patients in primary care settings by analyzing a range of variables, including socioeconomic status, demographic characteristics, and health behaviors, thereby identifying those at increased risk for depression. Employing a machine learning approach, the study uses retrospective data from electronic health records across 15 primary care clinics in South Florida, which serve a diverse patient population, to explore the relationship between social determinants of health (SDoH), including the Area Deprivation Index (ADI), and PHQ-2 positivity. Of 94,572 patient visits, 74,636 records were included in the final analysis; visits were excluded if a ZIP+4 code was not available or an ADI score did not exist. Screening used the PHQ-2, which assesses depressed mood and anhedonia, with a score >2 indicating a positive screen. ADI was used to assess SDoH by matching patients' residential postal codes to ADI national percentiles. Demographics, sexual history, tobacco use, caffeine intake, and community involvement were also evaluated. Over 40 machine learning algorithms were explored for their accuracy in predicting PHQ-2 outcomes, using software tools including scikit-learn and statsmodels in Python. Variables were normalized, scored, and then subjected to predictive regression models, with Random Forest showing outstanding performance.
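The screening and linkage rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the `Visit` record, its field names, the `link_adi` helper, and the lookup table of ADI national percentiles are all hypothetical, and the ZIP+4 codes and percentile values in the usage lines are made up. It assumes the standard PHQ-2 scoring of two items rated 0 to 3 each, with a total above 2 counted as a positive screen, and it drops any visit that lacks a ZIP+4 code or a matching ADI score.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

PHQ2_CUTOFF = 2  # a total score > 2 counts as a positive screen


@dataclass
class Visit:
    """Hypothetical visit record; field names are illustrative only."""
    visit_id: str
    zip9: Optional[str]   # ZIP+4 code, or None when not recorded
    phq2_mood: int        # depressed-mood item, scored 0-3
    phq2_anhedonia: int   # anhedonia item, scored 0-3


def phq2_positive(mood: int, anhedonia: int) -> bool:
    """Return True when the two PHQ-2 items sum to more than the cutoff."""
    for item in (mood, anhedonia):
        if not 0 <= item <= 3:
            raise ValueError(f"PHQ-2 item out of range 0-3: {item}")
    return mood + anhedonia > PHQ2_CUTOFF


def link_adi(visits: List[Visit], adi_by_zip9: Dict[str, int]) -> List[dict]:
    """Attach an ADI national percentile to each visit.

    Visits with no ZIP+4 code, or with a ZIP+4 code that has no ADI
    score in the lookup table, are excluded from the result.
    """
    rows = []
    for v in visits:
        adi = adi_by_zip9.get(v.zip9) if v.zip9 else None
        if adi is None:
            continue  # exclusion rule: missing ZIP+4 or missing ADI score
        rows.append({
            "visit_id": v.visit_id,
            "adi_percentile": adi,
            "phq2_positive": phq2_positive(v.phq2_mood, v.phq2_anhedonia),
        })
    return rows


# Made-up example data for illustration only.
visits = [
    Visit("v1", "33101-0001", 2, 1),  # total 3: positive screen
    Visit("v2", None, 3, 3),          # no ZIP+4: excluded
    Visit("v3", "33199-9999", 0, 1),  # total 1: negative screen
    Visit("v4", "33000-0000", 1, 1),  # ZIP+4 not in ADI table: excluded
]
adi_table = {"33101-0001": 72, "33199-9999": 15}
rows = link_adi(visits, adi_table)
```

The resulting rows, one per retained visit, are the kind of table that could then be normalized and passed to a classifier such as a Random Forest.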
Feature engineering and correlation analysis identified ADI, age, education, visit type, coffee intake, and marital status as significant predictors of PHQ-2 positivity. The area under the curve and model accuracies varied across clinics, with specific clinics showing higher predictive accuracy and others (p >.05). The study concludes that the ADI, as a proxy for SDoH, alongside other individual factors, can predict PHQ-2 positivity. Health organizations can use this information to anticipate health needs and plan resource allocation. [ABSTRACT FROM AUTHOR]