Identifying and assessing the impact of key neighborhood-level determinants on geographic variation in stroke: a machine learning and multilevel modeling approach

Authors :
Bian Liu
Liangyuan Hu
Jiayi Ji
Yan Li
Source :
BMC Public Health, Vol 20, Iss 1, Pp 1-12 (2020)
Publication Year :
2020
Publisher :
BioMed Central, 2020.

Abstract

Background: Stroke is a chronic cardiovascular disease that places a major burden on the health and economy of the United States. Stroke prevalence shows a strong geographic pattern at the state level: a cluster of southern states with substantially higher prevalence is known as the stroke belt. Despite this recognition, the extent to which key neighborhood characteristics affect stroke prevalence remains unclear.

Methods: We generated a new neighborhood health data set at the census tract level, covering nearly 27,000 tracts, by pooling information from multiple data sources including the CDC’s 500 Cities Project 2017 data release. We employed a two-stage modeling approach to understand how key neighborhood-level risk factors affect neighborhood-level stroke prevalence in each U.S. state. The first stage used a state-of-the-art Bayesian machine learning algorithm to identify key neighborhood-level determinants. The second stage applied a Bayesian multilevel modeling approach to describe how these key determinants explain the variability in stroke prevalence in each state.

Results: Neighborhoods with a larger proportion of older adults and non-Hispanic blacks had a higher prevalence of stroke. Higher median household income was linked to lower stroke prevalence. Ozone was positively associated with stroke prevalence in 10 states and negatively associated in five states. There was substantial variation across states in both the direction and magnitude of the associations between these four key factors and stroke prevalence.

Conclusions: When used in a principled variable selection framework, high-performance machine learning can identify key factors of neighborhood-level stroke prevalence from wide-ranging information in a data-driven way. The Bayesian multilevel modeling approach provides a detailed view of the impact of key factors across the states. The identified major factors and their effect mechanisms can potentially aid policy makers in developing area-based stroke prevention strategies.
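
The two-stage idea in the abstract (first select key predictors, then estimate how their associations vary by state) can be illustrated with a small sketch. This is not the authors' method: the abstract does not name the algorithms or software used, so the sketch swaps in a simple correlation ranking for the Bayesian machine learning step and separate per-state least-squares fits for the Bayesian multilevel model. The data are synthetic, and every name (`select_top_k`, `per_state_slopes`, the predictor layout) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption, not the paper's data): 5 "states" with
# 200 "tracts" each and 6 candidate predictors. Only predictors 0 and 1
# affect the outcome, and the effect of predictor 0 differs by state.
n_states, n_tracts, n_pred = 5, 200, 6
state = np.repeat(np.arange(n_states), n_tracts)
X = rng.normal(size=(n_states * n_tracts, n_pred))
true_slope = np.linspace(0.5, 1.5, n_states)
y = true_slope[state] * X[:, 0] - 0.8 * X[:, 1] + rng.normal(scale=0.5, size=X.shape[0])


def select_top_k(X, y, k):
    """Stage 1 stand-in: keep the k predictors most correlated with y."""
    corr = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
    return np.sort(np.argsort(corr)[::-1][:k])


def per_state_slopes(X, y, state, cols):
    """Stage 2 stand-in: ordinary least squares fitted separately per state."""
    rows = []
    for s in np.unique(state):
        mask = state == s
        design = np.column_stack([np.ones(mask.sum()), X[mask][:, cols]])
        beta, *_ = np.linalg.lstsq(design, y[mask], rcond=None)
        rows.append(beta[1:])  # drop the intercept, keep the slopes
    return np.array(rows)


selected = select_top_k(X, y, k=2)
slopes = per_state_slopes(X, y, state, selected)
print("selected predictors:", selected)
print("per-state slopes (rows = states):")
print(np.round(slopes, 2))
```

Fitting each state separately corresponds to no pooling across states. A multilevel model such as the one the abstract describes would instead partially pool the state-specific coefficients toward a common distribution, which stabilizes estimates for states with few tracts.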

Details

Language :
English
ISSN :
1471-2458
Volume :
20
Database :
OpenAIRE
Journal :
BMC Public Health
Accession number :
edsair.doi.dedup.....e4f281095742f2d3f12565bdf7b63f90