1. Accounting for Missing Data in Public Health Research Using a Synthesis of Statistical and Mathematical Models
- Author
Zivich, Paul N, Shook-Sa, Bonnie E, Cole, Stephen R, Lofgren, Eric T, and Edwards, Jessie K
- Subjects
Statistics - Applications, Statistics - Methodology
- Abstract
Introduction: Missing data is a challenge to medical research. Accounting for missing data by imputing or weighting conditional on covariates relies on the variable with missingness being observed at least some of the time for all unique covariate values. This requirement is referred to as positivity, and violations can result in bias. Here, we review a novel approach to addressing positivity violations in the context of systolic blood pressure. Methods: To illustrate the proposed approach, we estimate the mean systolic blood pressure among children and adolescents aged 2-17 years in the United States using data from the 2017-2018 National Health and Nutrition Examination Survey (NHANES). As blood pressure was never measured for those aged 2-7, there exists a positivity violation by design. Using a recently proposed synthesis of statistical and mathematical models, we integrate external information with NHANES to address our motivating question. Results: With the synthesis model, the estimated mean systolic blood pressure was 100.5 (95% confidence interval: 99.9, 101.0), which is notably lower than the estimate from either a complete-case analysis or extrapolation from a statistical model. The synthesis results were supported by a diagnostic comparing the performance of the mathematical model in the positive region. Conclusion: Positivity violations pose a threat to quantitative medical research, and standard approaches to addressing nonpositivity rely on restrictive untestable assumptions. Using a synthesis model, like the one detailed here, offers a viable alternative through integration of external information.
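The positivity requirement described in the abstract can be illustrated with a minimal sketch. It is not the authors' method or code. It only flags covariate values for which the variable with missingness is never observed. The function name `nonpositive_strata`, the field names `age` and `sbp`, and the toy records are all hypothetical and are not drawn from NHANES.

```python
from collections import defaultdict


def nonpositive_strata(records, covariate, outcome):
    """Return the covariate values for which the outcome is never observed."""
    observed = defaultdict(int)
    for row in records:
        # Register every stratum, adding 1 only when the outcome is observed.
        observed[row[covariate]] += row[outcome] is not None
    return sorted(value for value, n in observed.items() if n == 0)


# Hypothetical toy records (not NHANES data): sbp is missing for every child under 8.
toy = [
    {"age": 5, "sbp": None},
    {"age": 7, "sbp": None},
    {"age": 8, "sbp": 98.0},
    {"age": 8, "sbp": None},
    {"age": 12, "sbp": 104.0},
]

violations = nonpositive_strata(toy, "age", "sbp")
print(violations)  # [5, 7]
```

In this toy example, ages 5 and 7 are flagged because blood pressure is never observed there, mirroring the by-design violation for ages 2-7. Age 8 is not flagged because it has at least one observed value.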
- Published
2025