1. Accounting for bias due to outcome data missing not at random: comparison and illustration of two approaches to probabilistic bias analysis: a simulation study.
- Author
- Kawabata E, Major-Smith D, Clayton GL, Shapland CY, Morris TP, Carter AR, Fernández-Sanlés A, Borges MC, Tilling K, Griffith GJ, Millard LAC, Smith GD, Lawlor DA, and Hughes RA
- Subjects
- Humans; Data Interpretation, Statistical; Probability; Models, Statistical; Research Design/statistics & numerical data; Algorithms; Monte Carlo Method; Bayes Theorem; Bias; Computer Simulation
- Abstract
Background: Bias from data missing not at random (MNAR) is a persistent concern in health-related research. A bias analysis quantitatively assesses how conclusions change under different assumptions about missingness, using bias parameters that govern the magnitude and direction of the bias. Probabilistic bias analysis specifies a prior distribution for these parameters, explicitly incorporating available information and uncertainty about their true values. A Bayesian bias analysis combines the prior distribution with the data's likelihood function, whilst a Monte Carlo bias analysis samples the bias parameters directly from the prior distribution. No study has compared a Monte Carlo bias analysis to a Bayesian bias analysis in the context of MNAR missingness.
Methods: We illustrate an accessible probabilistic bias analysis using the Monte Carlo bias analysis approach and a well-known imputation method. We designed a simulation study based on a motivating example from the UK Biobank (UKB) study, where a large proportion of the outcome was missing and missingness was suspected to be MNAR. We compared the performance of our Monte Carlo bias analysis to a principled Bayesian bias analysis, complete case analysis (CCA) and multiple imputation (MI) assuming missing at random.
Results: As expected, given the simulation study design, CCA and MI estimates were substantially biased, with 95% confidence interval coverages of 7-48%. Including auxiliary variables (i.e., variables not included in the substantive analysis that are predictive of missingness and the missing data) in MI's imputation model amplified the bias due to assuming missing at random. With reasonably accurate and precise information about the bias parameter, the Monte Carlo bias analysis performed as well as the Bayesian bias analysis. However, when very limited information was provided about the bias parameter, only the Bayesian bias analysis was able to eliminate most of the bias due to MNAR, whilst the Monte Carlo bias analysis performed no better than the CCA and MI.
Conclusion: The Monte Carlo bias analysis we describe is easy to implement in standard software and, in the setting we explored, is a viable alternative to a Bayesian bias analysis. We advise careful consideration of the choice of auxiliary variables when applying imputation where data may be MNAR.
Declarations: Ethics approval and consent to participate: For the simulation study, data were completely simulated, which did not require approval from an ethics committee or consent from participants. UKB received ethical approval from the UK National Health Service's National Research Ethics Service (ref. 11/NW/0382). All participants provided written and informed consent for data collection, analysis, and record linkage. This research was conducted under UKB application number 16729. Consent for publication: Not applicable. Competing interests: TPM has received consultancy fees from Bayer Healthcare Pharmaceuticals, Alliance Pharmaceuticals, Gilead Sciences, and Kite Pharmaceuticals. Since January 2023, ARC has been an employee of Novo Nordisk Research Centre Oxford, which is not related to the current work and had no involvement in the decision to publish. The remaining authors declare that they have no competing interests.
© 2024. The Author(s).
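The Monte Carlo bias analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the imputation method, so the sketch assumes a simple delta-adjusted (pattern-mixture) imputation, a single bias parameter `delta` (the assumed mean difference in the outcome between participants with missing and observed outcomes, given exposure), a normal prior for `delta`, and a toy simulated dataset. All names and numeric values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2024)

# Toy data (all values are illustrative assumptions, not taken from the paper).
n = 5000
x = rng.binomial(1, 0.5, size=n)             # binary exposure
p_miss = np.where(x == 1, 0.6, 0.3)          # exposed are more likely to be missing
missing = rng.random(n) < p_miss
true_delta = -1.0                            # missing outcomes are lower on average (MNAR)
y_full = 1.0 + 0.5 * x + true_delta * missing + rng.normal(0.0, 1.0, n)
y_obs = np.where(missing, np.nan, y_full)    # outcome as the analyst sees it


def ols(design, y):
    """Return OLS coefficients, their covariance matrix, and the residual SD."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sigma2 = resid @ resid / (len(y) - design.shape[1])
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return beta, cov, np.sqrt(sigma2)


def mc_bias_analysis(x, y_obs, delta_mean, delta_sd, n_draws, rng):
    """Monte Carlo bias analysis for the exposure coefficient (hypothetical helper)."""
    design = np.column_stack([np.ones_like(x, dtype=float), x])
    obs = ~np.isnan(y_obs)
    # Imputation model fitted to complete cases (valid under MAR given x).
    beta_cc, cov_cc, sigma_cc = ols(design[obs], y_obs[obs])
    n_mis = int((~obs).sum())
    draws = np.empty(n_draws)
    for s in range(n_draws):
        # 1. Sample the bias parameter directly from its prior.
        delta = rng.normal(delta_mean, delta_sd)
        # 2. Impute: draw imputation-model coefficients, then shift by delta.
        #    (Simplification: the residual SD is held fixed rather than drawn.)
        beta_imp = rng.multivariate_normal(beta_cc, cov_cc)
        y_imp = y_obs.copy()
        y_imp[~obs] = design[~obs] @ beta_imp + delta + rng.normal(0.0, sigma_cc, n_mis)
        # 3. Fit the substantive model to the completed data.
        beta_s, cov_s, _ = ols(design, y_imp)
        # 4. Add random sampling error around the bias-adjusted estimate.
        draws[s] = rng.normal(beta_s[1], np.sqrt(cov_s[1, 1]))
    return draws


design_all = np.column_stack([np.ones(n), x])
full_slope = ols(design_all, y_full)[0][1]                          # no missing data
cca_slope = ols(design_all[~missing], y_obs[~missing])[0][1]        # complete cases
draws = mc_bias_analysis(x, y_obs, delta_mean=-1.0, delta_sd=0.1, n_draws=500, rng=rng)
lo, med, hi = np.percentile(draws, [2.5, 50, 97.5])

print(f"full-data slope: {full_slope:.3f}")
print(f"CCA slope:       {cca_slope:.3f}")
print(f"MC bias-adjusted slope: {med:.3f} (95% interval {lo:.3f} to {hi:.3f})")
```

Because `delta` is sampled only from the prior and never updated by the data, the adjusted estimate is only as good as the prior. Rerunning the sketch with a vague prior centred on zero (for example `delta_mean=0.0, delta_sd=2.0`) gives a wide interval centred near the complete case estimate, which is in line with the limitation the abstract reports for the Monte Carlo approach.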
- Published
- 2024