1. Use of Retrospective Data for Comparative Effectiveness Research Yields Mixed Outcomes and Should be Avoided
- Author
Zaorsky, NG, Wang, X, Lehrer, EJ, Tchelebi, L, Yeich, A, Prasad, V, Chinchilli, VM, and Wang, M
- Subjects
Biomedical and Clinical Sciences, Oncology and Carcinogenesis, Breast Cancer, Comparative Effectiveness Research, Cancer, Other Physical Sciences, Clinical Sciences, Oncology & Carcinogenesis, Oncology and carcinogenesis, Theoretical and computational chemistry, Medical and biological physics
- Abstract
Purpose/objective(s): In oncology, retrospective cohort studies are often used for comparative effectiveness research, that is, studies that compare the efficacy of treatment A vs B. We examine the stability of these estimates using biostatistical methods for bias correction with varying sets of covariates. We hypothesize that retrospective comparative effectiveness research studies are sensitive to biostatistical analytic choices; by varying the methods, there will be significant instability and lack of consistency in conclusions.
Materials/methods: We evaluated three disease sites in oncology where the addition of local therapy over systemic therapy alone has been hypothesized to improve survival in the metastatic setting: lung, prostate, and female breast, using multivariable Cox regression analyses. Patient data were extracted from the National Cancer Database, 2004-2014. We employed various statistical techniques to adjust for selection bias and immortal time bias, including propensity score matching, left truncation adjustment, and landmark analysis. Further, we used combinations of covariates in regression models to generate hazard ratios (HRs) with 95% confidence intervals. We constructed plots of -log10(P-value) vs HR to quantify the variability of estimates.
Results: There were 72,549 lung, 14,904 prostate, and 13,857 female breast cancer patients included. We ran > 300,000 regression models, where each model represents a publishable study. Without propensity score matching or immortal time bias adjustment, all multivariable models provided HRs that favored the addition of local therapy for all cancers, with HRs < 1, and all P-values < 0.001. Once propensity score matching was added to our analysis, higher HRs were observed, but most were still < 1. When landmark analysis and covariate combinations were used, we generated HRs that were < 1, equal to 1, and > 1, with 100-fold differences in -log10(P-values).
Conclusion: By altering the biostatistical approach with varying combinations of covariates, we were able to generate contrary outcomes and statistical significance. Our results suggest that some retrospective observational studies may find a treatment helps, and another may find it does not, simply based on analytic choices. This paradox highlights the importance of randomized controlled trials, and may explain the discordance noted in prior studies comparing observational trials and randomized studies.
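Illustrative note: the abstract describes fitting models over many covariate combinations and plotting -log10(P-value) against HR. The sketch below is not the authors' code. It shows, under stated assumptions, how covariate subsets can be enumerated and how a -log10(P-value) can be approximated from a reported HR and its 95% confidence interval using a two-sided Wald test on the log scale. The function names and the example covariates are hypothetical, and the Wald approximation is an assumption, since the abstract does not say how P-values were derived.

```python
import math
from itertools import combinations


def covariate_subsets(covariates, min_size=1):
    """Yield every combination of covariates with at least min_size members."""
    for k in range(min_size, len(covariates) + 1):
        yield from combinations(covariates, k)


def neg_log10_p(hr, ci_lower, ci_upper, z_crit=1.959964):
    """Approximate -log10(two-sided Wald P-value) from an HR and its 95% CI.

    Assumes the CI is symmetric on the log scale, so its width recovers
    the standard error of log(HR).
    """
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    if p == 0.0:
        return math.inf  # P-value underflowed to zero
    return -math.log10(p)


# Hypothetical covariates; each subset would define one regression model.
subsets = list(covariate_subsets(["age", "sex", "stage"]))
```

With n candidate covariates there are 2^n - 1 non-empty subsets, so the number of possible models grows quickly; each (HR, -log10 P) pair would then be one point on the kind of plot the abstract describes.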
- Published
- 2021