Multiple imputation of semi-continuous exposure variables that are categorized for analysis.
- Source :
- Statistics in medicine [Stat Med] 2021 Nov 30; Vol. 40 (27), pp. 6093-6106. Date of Electronic Publication: 2021 Aug 23.
- Publication Year :
- 2021
Abstract
- Semi-continuous variables are characterized by a point mass at one value and a continuous range of values for remaining observations. An example is alcohol consumption quantity, with a spike of zeros representing non-drinkers and positive values for drinkers. If multiple imputation is used to handle missing values for semi-continuous variables, it is unclear how this should be implemented within the standard approaches of fully conditional specification (FCS) and multivariate normal imputation (MVNI). This question is brought into focus by the use of categorized versions of semi-continuous exposure variables in analyses (eg, no drinking, drinking below binge level, binge drinking, heavy binge drinking), raising the question of how best to achieve congeniality between imputation and analysis models. We performed a simulation study comparing nine approaches for imputing semi-continuous exposures requiring categorization for analysis. Three methods imputed the categories directly: ordinal logistic regression, and imputation of binary indicator variables representing the categories using MVNI (with two variants). Six methods (predictive mean matching, zero-inflated binomial imputation, and two-part imputation methods with variants in FCS and MVNI) imputed the semi-continuous variable, with categories derived after imputation. The ordinal and zero-inflated binomial methods had good performance across most scenarios, while MVNI methods requiring rounding after imputation did not perform well. There were mixed results for predictive mean matching and the two-part methods, depending on whether the estimands were proportions or regression coefficients. The results highlight the need to consider the parameter of interest when selecting an imputation procedure. (© 2021 John Wiley & Sons Ltd.)
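To make the idea of "imputing the semi-continuous variable, with categories derived after imputation" concrete, the following is a minimal sketch of a two-part imputation in plain Python. It is not the authors' implementation and it does not reproduce their simulation design. Everything in it is an assumption made for illustration: the cutoffs in `CUTS`, the log-normal shape of the positive part, the 30% share of zeros, the 20% missingness, the absence of covariates, and the helper names `categorize`, `simulate`, and `two_part_impute`. Parameter uncertainty is approximated by bootstrapping the observed values before each imputation.

```python
import math
import random
import statistics

# Hypothetical cutoffs (e.g. drinks per occasion); not taken from the paper.
CUTS = (0.0, 5.0, 10.0)
LABELS = ("none", "below binge", "binge", "heavy binge")

def categorize(x):
    """Map a semi-continuous value to one of four ordered categories."""
    if x <= CUTS[0]:
        return LABELS[0]
    if x < CUTS[1]:
        return LABELS[1]
    if x < CUTS[2]:
        return LABELS[2]
    return LABELS[3]

def simulate(n, p_zero, rng):
    """Spike at zero with probability p_zero, log-normal otherwise (assumed shape)."""
    return [0.0 if rng.random() < p_zero else rng.lognormvariate(1.2, 0.8)
            for _ in range(n)]

def two_part_impute(values, rng):
    """Fill None entries. Part 1 draws zero vs positive; part 2 draws the amount."""
    observed = [v for v in values if v is not None]
    # Bootstrap the observed data so each imputation uses different parameter estimates.
    boot = [rng.choice(observed) for _ in observed]
    logs = [math.log(v) for v in boot if v > 0]
    p_zero = 1 - len(logs) / len(boot)
    mu, sigma = statistics.mean(logs), statistics.stdev(logs)
    out = []
    for v in values:
        if v is None:
            v = 0.0 if rng.random() < p_zero else math.exp(rng.gauss(mu, sigma))
        out.append(v)
    return out

rng = random.Random(2021)
full = simulate(500, 0.3, rng)
# Set roughly 20% of values missing completely at random.
with_missing = [None if rng.random() < 0.2 else v for v in full]
# Five imputed datasets; categories are derived only after imputation.
imputed_sets = [two_part_impute(with_missing, rng) for _ in range(5)]
props = [sum(categorize(v) == "none" for v in d) / len(d) for d in imputed_sets]
# Point estimate pooled across imputations (mean of per-dataset estimates).
pooled_prop_none = statistics.mean(props)
```

A realistic application would condition both parts on covariates and on the outcome of the analysis model, which is where the FCS and MVNI variants compared in the abstract differ; the sketch only shows the zero-versus-positive split and the categorize-after-imputation step.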
- Subjects :
- Computer Simulation
- Humans
- Logistic Models
- Data Collection methods
- Research Design
Details
- Language :
- English
- ISSN :
- 1097-0258
- Volume :
- 40
- Issue :
- 27
- Database :
- MEDLINE
- Journal :
- Statistics in medicine
- Publication Type :
- Academic Journal
- Accession number :
- 34423450
- Full Text :
- https://doi.org/10.1002/sim.9172