11,363 results on '"Models, Statistical"'
Search Results
2. Prefiltered component-based greedy (PreCoG) scan method.
- Author
- French JP, Meysami M, and Lipner EM
- Subjects
- Humans, Cluster Analysis, Computer Simulation, Risk Factors, Models, Statistical, Algorithms
- Abstract
The spatial distribution of disease cases can provide important insights into disease spread and its potential risk factors. Identifying disease clusters correctly can help us discover new risk factors and inform interventions to control and prevent the spread of disease as quickly as possible. In this study, we propose a novel scan method, the Prefiltered Component-based Greedy (PreCoG) scan method, which efficiently and accurately detects irregularly shaped clusters using a prefiltered component-based algorithm. The PreCoG scan method's flexibility allows it to perform well in detecting both regularly and irregularly shaped clusters. Additionally, it is fast to apply while providing high power, sensitivity, and positive predictive value for the detected clusters compared to other scan methods. To confirm the effectiveness of the PreCoG method, we compare its performance to many other scan methods. We have also implemented the method in the smerc R package to make it publicly available to other researchers. Our proposed PreCoG scan method presents a unique and innovative process for detecting disease clusters and can improve the accuracy of disease surveillance systems., (© 2024 John Wiley & Sons Ltd.)
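Since the method ships in the smerc R package, a minimal call sketch follows. The function name precog.test and its arguments are assumptions based on the naming pattern of smerc's other scan tests (scan.test, uls.test, and so on), so verify against the package documentation before relying on it.

```r
# Hedged sketch: PreCoG scan on the New York leukemia data bundled with smerc.
# precog.test() and its arguments are assumed to mirror smerc's other
# *.test() scan functions; check ?precog.test for the real interface.
library(smerc)

data(nydf)                          # region centroids, case counts, populations
out <- precog.test(
  coords = nydf[, c("x", "y")],     # region centroids
  cases  = floor(nydf$cases),       # observed case counts
  pop    = nydf$population,         # population at risk per region
  nsim   = 99,                      # Monte Carlo replicates for p-values
  alpha  = 0.10                     # significance level
)
summary(out)                        # detected clusters, relative risks, p-values
```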
- Published
- 2024
3. Determining sample size in a personalized randomized controlled (PRACTical) trial.
- Author
- Turner RM, Lee KM, Walker AS, Ellis S, Sharland M, Bielicki JA, Stöhr W, and White IR
- Subjects
- Humans, Sample Size, Computer Simulation, Infant, Newborn, Sepsis drug therapy, Models, Statistical, Randomized Controlled Trials as Topic methods, Precision Medicine methods
- Abstract
In clinical settings with no commonly accepted standard-of-care, multiple treatment regimens are potentially useful, but some treatments may not be appropriate for some patients. A personalized randomized controlled trial (PRACTical) design has been proposed for this setting. For a network of treatments, each patient is randomized only among treatments which are appropriate for them. The aim is to produce treatment rankings that can inform clinical decisions about treatment choices for individual patients. Here we propose methods for determining sample size in a PRACTical design, since standard power-based methods are not applicable. We derive a sample size by evaluating information gained from trials of varying sizes. For a binary outcome, we quantify how many adverse outcomes would be prevented by choosing the top-ranked treatment for each patient based on trial results rather than choosing a random treatment from the appropriate personalized randomization list. In simulations, we evaluate three performance measures: mean reduction in adverse outcomes using sample information, proportion of simulated patients for whom the top-ranked treatment performed as well or almost as well as the best appropriate treatment, and proportion of simulated trials in which the top-ranked treatment performed better than a randomly chosen treatment. We apply the methods to a trial evaluating eight different combination antibiotic regimens for neonatal sepsis (NeoSep1), in which a PRACTical design addresses varying patterns of antibiotic choice based on disease characteristics and resistance. Our proposed approach produces results that are more relevant to complex decision making by clinicians and policy makers., (© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.)
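As a rough, invented illustration of the information-gain reasoning described above (not the authors' procedure), the toy Monte Carlo below estimates how many adverse outcomes per patient are prevented by choosing the trial's top-ranked treatment from a patient's personalized list rather than a random member of that list; all risks, arm sizes, and the list mechanism are made up.

```r
# Toy simulation of the "adverse outcomes prevented" measure; every number
# here is illustrative, not from the NeoSep1 trial.
set.seed(1)
p_true <- c(A = 0.30, B = 0.25, C = 0.20, D = 0.35)  # true adverse-event risks
n_arm  <- 100                                        # patients per arm

gain <- replicate(2000, {
  # estimate each arm's risk from one simulated trial
  p_hat <- setNames(rbinom(4, n_arm, p_true) / n_arm, names(p_true))
  # a future patient whose personalized list is a random subset of >= 2 arms
  lst <- sample(names(p_true), sample(2:4, 1))
  top <- lst[which.min(p_hat[lst])]      # top-ranked appropriate treatment
  mean(p_true[lst]) - p_true[top]        # risk reduction vs random list member
})
mean(gain)  # expected adverse outcomes prevented per future patient
```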
- Published
- 2024
4. Calibrating machine learning approaches for probability estimation: A short expansion.
- Author
- Ojeda FM, Baker SG, and Ziegler A
- Subjects
- Humans, Calibration, Models, Statistical, Machine Learning, Probability
- Published
- 2024
5. Bayesian modeling of spatial ordinal data from health surveys.
- Author
- Beltrán-Sánchez MÁ, Martinez-Beneito MA, and Corberán-Vallet A
- Subjects
- Humans, Spain epidemiology, Likelihood Functions, Health Status Indicators, Small-Area Analysis, Spatial Analysis, Male, Female, Bayes Theorem, Health Surveys statistics & numerical data, Models, Statistical
- Abstract
Health surveys allow exploring health indicators that are of great value from a public health point of view and that cannot normally be studied from regular health registries. These indicators are usually coded as ordinal variables and may depend on covariates associated with individuals. In this article, we propose a Bayesian individual-level model for small-area estimation of survey-based health indicators. A categorical likelihood is used at the first level of the model hierarchy to describe the ordinal data, and spatial dependence among small areas is taken into account by using a conditional autoregressive distribution. Post-stratification of the results of the proposed individual-level model allows extrapolating the results to any administrative areal division, even for small areas. We apply this methodology to describe the geographical distribution of a self-perceived health indicator from the Health Survey of the Region of Valencia (Spain) for the year 2016., (© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.)
- Published
- 2024
6. Covariate-adjusted generalized pairwise comparisons in small samples.
- Author
- Jaspers S, Verbeeck J, and Thas O
- Subjects
- Humans, Sample Size, Data Interpretation, Statistical, Bias, Models, Statistical, Computer Simulation
- Abstract
Semiparametric probabilistic index models allow for the comparison of two groups of observations, whilst adjusting for covariates, thereby fitting nicely within the framework of generalized pairwise comparisons (GPC). As with most regression approaches in this setting, the limited amount of data results in invalid inference as the asymptotic normality assumption is not met. In addition, separation issues might arise when considering small samples. In this article, we show that the parameters of the probabilistic index model can be estimated using generalized estimating equations, for which adjustments exist that lead to estimators of the sandwich variance-covariance matrix with improved finite sample properties and that can deal with bias due to separation. In this way, appropriate inference can be performed, as is shown through extensive simulation studies. The known relationships between the probabilistic index and other GPC statistics also allow valid inference for related estimands, for example the net treatment benefit or the success odds., (© 2024 John Wiley & Sons Ltd.)
- Published
- 2024
7. A latent variable approach to jointly modeling longitudinal and cumulative event data using a weighted two-stage method.
- Author
- Abbott MR, Nahum-Shani I, Lam CY, Potter LN, Wetter DW, and Dempsey WH
- Subjects
- Humans, Longitudinal Studies, Computer Simulation, Poisson Distribution, Smoking psychology, Smoking Cessation psychology, Ecological Momentary Assessment, Models, Statistical
- Abstract
Ecological momentary assessment (EMA), a data collection method commonly employed in mHealth studies, allows for repeated real-time sampling of individuals' psychological, behavioral, and contextual states. Due to the frequent measurements, data collected using EMA are useful for understanding both the temporal dynamics in individuals' states and how these states relate to adverse health events. Motivated by data from a smoking cessation study, we propose a joint model for analyzing longitudinal EMA data to determine whether certain latent psychological states are associated with repeated cigarette use. Our method consists of a longitudinal submodel (a dynamic factor model) that models changes in the time-varying latent states and a cumulative risk submodel (a Poisson regression model) that connects the latent states with the total number of events. In the motivating data, both the predictors (the underlying psychological states) and the event outcome (the number of cigarettes smoked) are partially unobservable; we account for this incomplete information in our proposed model and estimation method. We take a two-stage approach to estimation that leverages existing software and uses importance sampling-based weights to reduce potential bias. We demonstrate that these weights are effective at reducing bias in the cumulative risk submodel parameters via simulation. We apply our method to a subset of data from a smoking cessation study to assess the association between psychological state and cigarette smoking. The analysis shows that above-average intensities of negative mood are associated with increased cigarette use., (© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.)
- Published
- 2024
8. A Bayesian semi-parametric scalar-on-function regression with measurement error using instrumental variables.
- Author
- Zoh RS, Luan Y, Xue L, Allison DB, and Tekwe CD
- Subjects
- Humans, Computer Simulation, Models, Statistical, Regression Analysis, Obesity, Bias, Actigraphy methods, Actigraphy statistics & numerical data, Bayes Theorem, Exercise physiology, Body Mass Index
- Abstract
Wearable devices such as the ActiGraph are now commonly used in research to monitor or track physical activity. This trend corresponds with the growing need to accurately assess the relationships between physical activity and health outcomes, such as obesity. Device-based physical activity measures are best treated as functions when assessing their associations with scalar-valued outcomes such as body mass index. Scalar-on-function regression (SoFR) is a suitable regression model in this setting. Most estimation approaches in SoFR assume that the measurement error in functional covariates is white noise. Violating this assumption can lead to underestimating model parameters. Few approaches exist for correcting such measurement errors in frequentist methods, and none for Bayesian methods in this area. We present a non-parametric Bayesian measurement error-corrected SoFR model that relaxes all the constraining assumptions often involved with these models. Our estimation relies on an instrumental variable allowing a time-varying biasing factor, a significant departure from the current generalized method of moments (GMM) approach. Our proposed method also permits model-based grouping of the functional covariate following measurement error correction. This grouping of the measurement error-corrected functional covariate makes it easier to interpret how the different groups differ. Our method is easy to implement, and we demonstrate its finite sample properties in extensive simulations. Finally, we applied our method to data from the National Health and Nutrition Examination Survey to assess the relationship between wearable device-based measures of physical activity and body mass index in adults in the United States., (© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.)
- Published
- 2024
9. A sequential, multiple assignment, randomized trial design with a tailoring function.
- Author
- Hartman H, Schipper M, and Kidwell K
- Subjects
- Humans, Research Design, Models, Statistical, Regression Analysis, Computer Simulation, Randomized Controlled Trials as Topic methods
- Abstract
We present a trial design for sequential multiple assignment randomized trials (SMARTs) that use a tailoring function instead of a binary tailoring variable, allowing for simultaneous development of the tailoring variable and estimation of dynamic treatment regimens (DTRs). We apply methods for developing DTRs from observational data: tree-based regression learning and Q-learning. We compare this to a balanced randomized SMART with equal re-randomization probabilities and a typical SMART design where re-randomization depends on a binary tailoring variable and DTRs are analyzed with weighted and replicated regression. This project addresses a gap in clinical trial methodology by presenting SMARTs where second-stage treatment is based on a continuous outcome, removing the need for a binary tailoring variable. We demonstrate that data from a SMART using a tailoring function can be used to efficiently estimate DTRs and is more flexible under varying scenarios than a SMART using a tailoring variable., (© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.)
- Published
- 2024
10. An augmented illness-death model for semi-competing risks with clinically immediate terminal events.
- Author
- Reeder HT, Lee KH, Papatheodorou SI, and Haneuse S
- Subjects
- Humans, Pregnancy, Female, Risk Assessment methods, Computer Simulation, Bayes Theorem, Pre-Eclampsia epidemiology, Pre-Eclampsia mortality, Models, Statistical
- Abstract
Preeclampsia is a pregnancy-associated condition posing risks of both fetal and maternal mortality and morbidity that can only resolve following delivery and removal of the placenta. Because in its typical form preeclampsia can arise before delivery, but not after, these two events exemplify the time-to-event setting of "semi-competing risks" in which a non-terminal event of interest is subject to the occurrence of a terminal event of interest. The semi-competing risks framework presents a valuable opportunity to simultaneously address two clinically meaningful risk modeling tasks: (i) characterizing risk of developing preeclampsia, and (ii) characterizing time to delivery after onset of preeclampsia. However, some people with preeclampsia deliver immediately upon diagnosis, while others are admitted and monitored for an extended period before giving birth, resulting in two distinct trajectories following the non-terminal event, which we call "clinically immediate" and "non-immediate" terminal events. Though such phenomena arise in many clinical contexts, to date no methods have been developed that acknowledge the complex dependencies between such outcomes or leverage these phenomena to gain new insight into individualized risk. We address this gap by proposing a novel augmented frailty-based illness-death model with a binary submodel to distinguish risk of immediate terminal event following the non-terminal event. The model admits direct dependence of the terminal event on the non-terminal event through flexible regression specification, as well as indirect dependence via a shared frailty term linking each submodel. We develop an efficient Bayesian sampler for estimation and corresponding model fit metrics, and derive formulae for dynamic risk prediction. In an extended example using pregnancy outcome data from an electronic health record, we demonstrate the proposed model's direct applicability to address a broad range of clinical questions., (© 2024 John Wiley & Sons Ltd.)
- Published
- 2024
11. Exploiting relationship directionality to enhance statistical modeling of peer-influence across social networks.
- Author
- Ran X, Morden NE, Meara E, Moen EL, Rockmore DN, and O'Malley AJ
- Subjects
- Humans, United States, Peer Influence, Ohio, Practice Patterns, Physicians' statistics & numerical data, Medicare statistics & numerical data, Inappropriate Prescribing statistics & numerical data, Social Networking, Models, Statistical
- Abstract
Risky-prescribing is the excessive or inappropriate prescription of drugs that singly or in combination pose significant risks of adverse health outcomes. In the United States, prescribing of opioids and other "risky" drugs is a national public health concern. We use a novel data framework, a directed network connecting physicians who encounter the same patients in a sequence of visits, to investigate whether risky-prescribing diffuses across physicians through a process of peer-influence. Using a shared-patient network of 10,661 Ohio-based physicians constructed from Medicare claims data over 2014-2015, we extract information on the order in which patients encountered physicians to derive a directed patient-sharing network. This enables the novel decomposition of peer-effects of a medical practice such as risky-prescribing into directional (outbound and inbound) and bidirectional (mutual) relationship components. Using this framework, we develop models of peer-effects for contagion in risky-prescribing behavior as well as spillover effects. The latter is measured in terms of adverse health events suspected to be related to risky-prescribing in patients of peer-physicians. Estimated peer-effects were strongest when the patient-sharing relationship was mutual as opposed to directional. Using simulations, we confirmed that our modeling and estimation strategies allow simultaneous estimation of each type of peer-effect (mutual and directional) with accuracy and precision. We also show that failing to account for these distinct mechanisms (a form of model mis-specification) produces misleading results, demonstrating the importance of retaining directional information in the construction of physician shared-patient networks. These findings suggest network-based interventions for reducing risky-prescribing., (© 2024 John Wiley & Sons Ltd.)
- Published
- 2024
12. BHAFT: Bayesian heredity-constrained accelerated failure time models for detecting gene-environment interactions in survival analysis.
- Author
- Sun N, Chu J, He Q, Wang Y, Han Q, Yi N, Zhang R, and Shen Y
- Subjects
- Humans, Survival Analysis, Models, Statistical, Prognosis, Adenocarcinoma of Lung genetics, Adenocarcinoma of Lung mortality, Algorithms, Bayes Theorem, Gene-Environment Interaction, Lung Neoplasms genetics, Lung Neoplasms mortality, Computer Simulation
- Abstract
In addition to considering the main effects, understanding gene-environment (G × E) interactions is imperative for determining the etiology of diseases and the factors that affect their prognosis. In the existing statistical framework for censored survival outcomes, there are several challenges in detecting G × E interactions, such as handling high-dimensional omics data, diverse environmental factors, and algorithmic complications in survival analysis. The effect heredity principle has been widely used in studies involving interaction identification because it incorporates the dependence between the main and interaction effects. However, Bayesian survival models that incorporate the assumption of this principle have not been developed. Therefore, we propose Bayesian heredity-constrained accelerated failure time (BHAFT) models for identifying main and interaction (M-I) effects with novel spike-and-slab or regularized horseshoe priors to incorporate the assumption of the effect heredity principle. The R package rstan was used to fit the proposed models. Extensive simulations demonstrated that BHAFT models outperformed other existing models in terms of signal identification, coefficient estimation, and prognosis prediction. Biologically plausible G × E interactions associated with the prognosis of lung adenocarcinoma were identified using our proposed model. Notably, BHAFT models incorporating the effect heredity principle could identify both main and interaction effects, which are highly useful in exploring G × E interactions in high-dimensional survival analysis. The code and data used in our paper are available at https://github.com/SunNa-bayesian/BHAFT., (© 2024 John Wiley & Sons Ltd.)
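The abstract notes the models are fitted with rstan and that the authors' code is on GitHub; the skeleton below shows only the generic rstan call shape. The .stan file name, data objects (X, XI, log_time, status), and parameter names are placeholders, not the paper's actual interface.

```r
# Generic rstan fitting skeleton; "bhaft_spike_slab.stan" and all data
# objects are placeholders. The authors' actual code lives at
# https://github.com/SunNa-bayesian/BHAFT
library(rstan)
options(mc.cores = parallel::detectCores())
rstan_options(auto_write = TRUE)

stan_data <- list(
  N = nrow(X), P = ncol(X),   # sample size and covariate count (placeholders)
  X = X,                      # main-effect (G and E) design matrix
  XI = XI,                    # G-by-E interaction design matrix
  y = log_time,               # log survival times (AFT scale)
  delta = status              # event indicators (1 = observed, 0 = censored)
)

fit <- stan(file = "bhaft_spike_slab.stan", data = stan_data,
            chains = 4, iter = 2000)
print(fit, pars = c("beta_main", "beta_int"))  # hypothetical parameter names
```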
- Published
- 2024
13. Bayesian survival analysis with INLA.
- Author
- Alvares D, van Niekerk J, Krainski ET, Rue H, and Rustand D
- Subjects
- Humans, Survival Analysis, Proportional Hazards Models, Computer Simulation, Longitudinal Studies, Software, Bayes Theorem, Models, Statistical
- Abstract
This tutorial shows how various Bayesian survival models can be fitted using the integrated nested Laplace approximation in a clear, legible, and comprehensible manner using the INLA and INLAjoint R-packages. Such models include accelerated failure time, proportional hazards, mixture cure, competing risks, multi-state, frailty, and joint models of longitudinal and survival data, originally presented in the article "Bayesian survival analysis with BUGS." In addition, we illustrate the implementation of a new joint model for a longitudinal semicontinuous marker, recurrent events, and a terminal event. Our proposal aims to provide the reader with syntax examples for implementing survival models using a fast and accurate approximate Bayesian inferential approach., (© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.)
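As a flavor of the syntax the tutorial walks through, here is a minimal Weibull survival fit using the Leuk dataset shipped with INLA; this is standard INLA usage rather than an excerpt from the tutorial, so treat it as a sketch.

```r
# Minimal Bayesian Weibull survival fit with INLA (install per
# https://www.r-inla.org; the package is not on CRAN).
library(INLA)

data(Leuk, package = "INLA")                 # leukemia survival data
fit <- inla(
  inla.surv(time, cens) ~ age + sex + wbc,   # inla.surv() bundles time + status
  family = "weibullsurv",                    # Weibull survival likelihood
  data   = Leuk
)
summary(fit)  # posterior summaries of fixed effects and the Weibull shape
```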
- Published
- 2024
14. Robust inference methods for meta-analysis involving influential outlying studies.
- Author
- Noma H, Sugasawa S, and Furukawa TA
- Subjects
- Humans, Likelihood Functions, Computer Simulation, Data Interpretation, Statistical, Bias, Confidence Intervals, Meta-Analysis as Topic, Models, Statistical
- Abstract
Meta-analysis is an essential tool to comprehensively synthesize and quantitatively evaluate results of multiple clinical studies in evidence-based medicine. In many meta-analyses, the characteristics of some studies might markedly differ from those of the others, and these outlying studies can generate biases and potentially yield misleading results. In this article, we provide effective robust statistical inference methods using generalized likelihoods based on the density power divergence. The robust inference methods are designed to adjust the influences of outliers through the use of modified estimating equations based on a robust criterion, even when multiple and serious influential outliers are present. We provide the robust estimators, statistical tests, and confidence intervals via the generalized likelihoods for the fixed-effect and random-effects models of meta-analysis. We also assess the contribution rates of individual studies to the robust overall estimators, which indicate how the influences of outlying studies are adjusted. Through simulations and applications to two recently published systematic reviews, we demonstrate that the overall conclusions and interpretations of meta-analyses can be markedly changed when the robust inference methods are applied, and that the conventional inference methods alone might produce misleading evidence. We recommend that these methods be used at least as a sensitivity analysis in the practice of meta-analysis. We have also developed an R package, robustmeta, that implements the robust inference methods., (© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.)
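To make the density power divergence idea concrete, here is a toy fixed-effect version, not the authors' robustmeta implementation: the divergence tilts the usual inverse-variance weights by a factor that decays with each study's standardized residual, so influential outliers are downweighted automatically.

```r
# Toy robust fixed-effect meta-analysis via a density-power-divergence-style
# reweighting; gamma = 0 recovers the classical inverse-variance estimate,
# gamma > 0 downweights outlying studies. Illustration only.
dpd_fixed <- function(y, v, gamma = 0.5, tol = 1e-10) {
  theta <- weighted.mean(y, 1 / v)          # start at the classical estimate
  repeat {
    w <- exp(-gamma * (y - theta)^2 / (2 * v)) / v  # tilted weights
    theta_new <- sum(w * y) / sum(w)
    if (abs(theta_new - theta) < tol) return(theta_new)
    theta <- theta_new
  }
}

y <- c(0.10, 0.12, 0.08, 0.11, 0.90)  # last study is an influential outlier
v <- rep(0.01, 5)                     # within-study variances
c(classical = weighted.mean(y, 1 / v), robust = dpd_fixed(y, v))
```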
- Published
- 2024
15. Latent classification model for censored longitudinal binary outcome.
- Author
- Kuo JC, Chan W, Leon-Novelo L, Lairson DR, Brown A, and Fujimoto K
- Subjects
- Humans, Longitudinal Studies, Computer Simulation, Models, Statistical, Texas epidemiology, SARS-CoV-2, Female, COVID-19 epidemiology, Markov Chains, Latent Class Analysis, Algorithms
- Abstract
Latent classification models are a class of statistical methods for identifying unobserved class membership among study samples using observed data. In this study, we propose a latent classification model that takes a censored longitudinal binary outcome variable and uses its changing pattern over time to predict individuals' latent class membership. Assuming the time-dependent outcome variables follow a continuous-time Markov chain, the proposed method has two primary goals: (1) estimate the distribution of the latent classes and predict individuals' class membership, and (2) estimate the class-specific transition rates and rate ratios. To assess the model's performance, we conducted a simulation study and verified that our algorithm produces accurate model estimates (ie, small bias) with reasonable confidence intervals (ie, achieving approximately 95% coverage probability). Furthermore, we compared our model to four other existing latent class models and demonstrated that our approach yields higher prediction accuracies for latent classes. We applied our proposed method to analyze COVID-19 data from Houston, Texas, US, collected between January 1, 2021 and December 31, 2021. Early reports on the COVID-19 pandemic showed that the severity of a SARS-CoV-2 infection tends to vary greatly across cases. We found that while demographic characteristics explain some of the differences in individuals' experience with COVID-19, some unaccounted-for latent variables were associated with the disease., (© 2024 John Wiley & Sons Ltd.)
- Published
- 2024
16. Addressing dispersion in mis-measured multivariate binomial outcomes: A novel statistical approach for detecting differentially methylated regions in bisulfite sequencing data.
- Author
- Zhao K, Oualkacha K, Zeng Y, Shen C, Klein K, Lakhal-Chaieb L, Labbe A, Pastinen T, Hudson M, Colmegna I, Bernatsky S, and Greenwood CMT
- Subjects
- Humans, Multivariate Analysis, Arthritis, Rheumatoid genetics, Likelihood Functions, Sulfites chemistry, Sequence Analysis, DNA methods, DNA Methylation, Algorithms, Computer Simulation, Models, Statistical
- Abstract
Motivated by a DNA methylation application, this article addresses the problem of fitting, and drawing inference from, a multivariate binomial regression model for outcomes that are contaminated by errors and exhibit extra-parametric variation, also known as dispersion. While dispersion in univariate binomial regression has been extensively studied, addressing dispersion in the context of multivariate outcomes remains a complex and relatively unexplored task. The complexity arises from a noteworthy data characteristic observed in our motivating dataset: non-constant yet correlated dispersion across outcomes. To address this challenge and account for possible measurement error, we propose a novel hierarchical quasi-binomial varying coefficient mixed model, which enables flexible dispersion patterns through a combination of additive and multiplicative dispersion components. To maximize the Laplace-approximated quasi-likelihood of our model, we further develop a specialized two-stage expectation-maximization (EM) algorithm, where a plug-in estimate for the multiplicative scale parameter enhances the speed and stability of the EM iterations. Simulations demonstrated that our approach yields accurate inference for smooth covariate effects and exhibits excellent power in detecting non-zero effects. Additionally, we applied our proposed method to investigate the association between DNA methylation, measured across the genome through targeted custom capture sequencing of whole blood, and levels of anti-citrullinated protein antibodies (ACPA), a preclinical marker for rheumatoid arthritis (RA) risk. Our analysis revealed 23 significant genes that potentially contribute to ACPA-related differential methylation, highlighting the relevance of cell signaling and collagen metabolism in RA. We implemented our method in the R Bioconductor package called "SOMNiBUS.", (© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.)
- Published
- 2024
17. Review of weighted exponential random graph models frameworks applied to neuroimaging.
- Author
- Fan Y and White SR
- Subjects
- Humans, Computer Simulation, Brain diagnostic imaging, Brain physiology, Magnetic Resonance Imaging methods, Neuroimaging methods, Models, Statistical
- Abstract
Neuroimaging data can often be represented as statistical networks, especially functional magnetic resonance imaging (fMRI) data, where brain regions are defined as nodes and the functional interactions between those regions are taken as edges. Such networks are commonly divided into classes depending on the type of edges, namely binary or weighted. In a binary network, edges are either present or absent, whereas the edges of a weighted network carry weight values; fMRI networks are weighted networks. Statistical methods are often adopted to analyse such networks, among which the exponential random graph model (ERGM) is an important approach. Typically ERGMs are applied to binary networks, so weighted networks often need to be binarised by arbitrarily selecting a threshold value to define the presence of the edges, which can lead to non-robustness and the loss of valuable edge-weight information representing the strength of functional interactions in fMRI networks. Although it is therefore important to gain deeper insight into applying ERGMs to weighted networks, only a few ERGM frameworks for weighted networks exist, and some of them, as originally proposed, are not directly implementable on fMRI networks. We systematically review, implement, analyse and compare five such frameworks via a simulation study, provide guidelines on each modelling framework, and assess their suitability for fMRI networks against a range of criteria. We conclude that the multi-layered ERGM is currently the most suitable framework., (© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.)
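For readers who want to experiment, the statnet stack already supports one family of weighted ERGMs (Krivitsky's count ERGMs with a Poisson reference measure); the sketch below fits one to simulated integer weights standing in for an fMRI connectivity matrix. This is generic ergm/ergm.count usage, not the review's specific implementations.

```r
# Valued (weighted) ERGM sketch with a Poisson reference measure; the toy
# weight matrix stands in for a threshold-free fMRI connectivity network.
library(ergm)
library(ergm.count)
library(network)

set.seed(42)
n <- 20                                   # brain regions (nodes)
w <- matrix(rpois(n * n, 2), n, n)        # toy integer connection weights
w[lower.tri(w, diag = TRUE)] <- 0
w <- w + t(w)                             # symmetrize: undirected network
nw <- as.network(w, directed = FALSE, matrix.type = "adjacency",
                 ignore.eval = FALSE, names.eval = "weight")

# 'sum' models overall edge-weight intensity, 'nonzero' the zero-inflation
fit <- ergm(nw ~ sum + nonzero, response = "weight", reference = ~Poisson)
summary(fit)
```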
- Published
- 2024
18. A multivariate to multivariate approach for voxel-wise genome-wide association analysis.
- Author
- Wu Q, Zhang Y, Huang X, Ma T, Hong LE, Kochunov P, and Chen S
- Subjects
- Humans, Multivariate Analysis, White Matter diagnostic imaging, Connectome methods, Models, Statistical, Brain diagnostic imaging, Corpus Callosum diagnostic imaging, Genome-Wide Association Study methods, Polymorphism, Single Nucleotide, Computer Simulation, Algorithms
- Abstract
The joint analysis of imaging-genetics data facilitates the systematic investigation of genetic effects on brain structures and functions with spatial specificity. We focus on voxel-wise genome-wide association analysis, which may involve trillions of single nucleotide polymorphism (SNP)-voxel pairs. We attempt to identify underlying organized association patterns of SNP-voxel pairs and understand the polygenic and pleiotropic networks on brain imaging traits. We propose a bi-clique graph structure (ie, a set of SNPs highly correlated with a cluster of voxels) for the systematic association pattern. Next, we develop computational strategies to detect latent SNP-voxel bi-cliques and an inference model for statistical testing. We further provide theoretical results to guarantee the accuracy of our computational algorithms and statistical inference. We validate our method by extensive simulation studies, and then apply it to the whole genome genetic and voxel-level white matter integrity data collected from 1052 participants of the human connectome project. The results demonstrate multiple genetic loci influencing white matter integrity measures on splenium and genu of the corpus callosum., (© 2024 John Wiley & Sons Ltd.)
- Published
- 2024
19. Propensity score weighted multi-source exchangeability models for incorporating external control data in randomized clinical trials.
- Author
- Wei W, Zhang Y, and Roychoudhury S
- Subjects
- Humans, Data Interpretation, Statistical, Bias, Propensity Score, Randomized Controlled Trials as Topic methods, Models, Statistical, Computer Simulation
- Abstract
Among clinical trialists, there has been a growing interest in using external data to improve decision-making and accelerate drug development in randomized clinical trials (RCTs). Here we propose a novel approach that combines the propensity score weighting (PW) and the multi-source exchangeability modelling (MEM) approaches to augment the control arm of an RCT in the rare disease setting. First, propensity score weighting is used to construct weighted external controls that have similar observed pre-treatment characteristics to the current trial population. Next, the MEM approach evaluates the similarity in outcome distributions between the weighted external controls and the concurrent control arm. The amount of external data we borrow is determined by the similarities in pre-treatment characteristics and outcome distributions. The proposed approach can be applied to binary, continuous and count data. We evaluate the performance of the proposed PW-MEM method and several competing approaches based on simulation and re-sampling studies. Our results show that the PW-MEM approach improves the precision of treatment effect estimates while reducing the biases associated with borrowing data from external sources., (© 2024 John Wiley & Sons Ltd.)
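The first (PW) stage can be shown in miniature, with all data and variable names invented: fit a propensity model for membership in the current trial versus the external source, then give each external control an odds weight so the weighted externals resemble the trial population.

```r
# Sketch of the propensity-weighting stage; 'pooled' (stacked trial +
# external rows), 'in_trial', and the covariates are invented names.
ps_fit <- glm(in_trial ~ age + sex + baseline_sev,
              family = binomial, data = pooled)
ps <- predict(ps_fit, type = "response")

# ATT-style odds weights: trial patients keep weight 1; external controls
# are reweighted toward the trial population's covariate distribution.
w <- ifelse(pooled$in_trial == 1, 1, ps / (1 - ps))
```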
- Published
- 2024
20. Renewable risk assessment of heterogeneous streaming time-to-event cohorts.
- Author
- Ding J, Li J, and Wang X
- Subjects
- Humans, Risk Assessment methods, Cohort Studies, Models, Statistical, Time Factors, Lung Neoplasms, Computer Simulation
- Abstract
The analysis of streaming time-to-event cohorts has garnered significant research attention. Most existing methods require observed cohorts from a study sequence to be independent and identically sampled from a common model. This assumption may be easily violated in practice. Our methodology operates within the framework of online data updating, where risk estimates for each cohort of interest are continuously refreshed using the latest observations and historical summary statistics. At each streaming stage, we introduce parameters to quantify the potential discrepancy between batch-specific effects from adjacent cohorts. We then employ penalized estimation techniques to identify nonzero discrepancy parameters, allowing us to adaptively adjust risk estimates based on current data and historical trends. We illustrate our proposed method through extensive empirical simulations and a lung cancer data analysis., (© 2024 John Wiley & Sons Ltd.)
- Published
- 2024
21. A Bayesian non-stationary heteroskedastic time series model for multivariate critical care data.
- Author
- Omar Z, Stephens DA, Schmidt AM, and Buckeridge DL
- Subjects
- Humans, Multivariate Analysis, Algorithms, Computer Simulation, Quebec, Bayes Theorem, Markov Chains, Monte Carlo Method, Critical Care statistics & numerical data, Critical Care methods, Models, Statistical, Intensive Care Units
- Abstract
We propose a multivariate GARCH model for non-stationary health time series by modifying the observation-level variance of the standard state space model. The proposed model provides an intuitive and novel way of dealing with heteroskedastic data using the conditional nature of state-space models. We follow the Bayesian paradigm to perform the inference procedure. In particular, we use Markov chain Monte Carlo methods to obtain samples from the resultant posterior distribution. We use the forward filtering backward sampling algorithm to efficiently obtain samples from the posterior distribution of the latent state. The proposed model also handles missing data in a fully Bayesian fashion. We validate our model on synthetic data and analyze a dataset obtained from an intensive care unit in a Montreal hospital and the MIMIC dataset. We further show that our proposed models offer better performance, in terms of WAIC, than standard state space models. The proposed model provides a new way to model multivariate heteroskedastic non-stationary time series data. Model comparison can then be easily performed using the WAIC., (© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.)
- Published
- 2024
22. Simultaneous multi-transient linear-combination modeling of MRS data improves uncertainty estimation.
- Author
- Zöllner HJ, Davies-Jenkins C, Simicic D, Tal A, Sulam J, and Oeltzschner G
- Subjects
- Humans, Reproducibility of Results, Linear Models, Sensitivity and Specificity, Signal-To-Noise Ratio, gamma-Aminobutyric Acid metabolism, Models, Statistical, Magnetic Resonance Spectroscopy methods, Computer Simulation, Monte Carlo Method, Algorithms
- Abstract
Purpose: The interest in applying and modeling dynamic MRS has recently grown. Two-dimensional modeling yields advantages for the precision of metabolite estimation in interrelated MRS data. However, it is unknown whether including all transients simultaneously in a 2D model without averaging (presuming a stable signal) performs similarly to one-dimensional (1D) modeling of the averaged spectrum. Therefore, we systematically investigated the accuracy, precision, and uncertainty estimation of both described model approaches.
Methods: Monte Carlo simulations of synthetic MRS data were used to compare the accuracy and uncertainty estimation of simultaneous 2D multitransient linear-combination modeling (LCM) with 1D-LCM of the average. A total of 2,500 data sets per condition with different noise representations of a 64-transient MRS experiment at six signal-to-noise levels for two separate spin systems (scyllo-inositol and gamma-aminobutyric acid) were analyzed. Additional data sets with different levels of noise correlation were also analyzed. Modeling accuracy was assessed by determining the relative bias of the estimated amplitudes against the ground truth, and modeling precision was determined by SDs and Cramér-Rao lower bounds (CRLBs).
Results: Amplitude estimates for 1D- and 2D-LCM agreed well and showed a similar level of bias compared with the ground truth. Estimated CRLBs agreed well between both models and with ground-truth CRLBs. For correlated noise, the estimated CRLBs increased with the correlation strength for the 1D-LCM but remained stable for the 2D-LCM.
Conclusion: Our results indicate that the model performance of 2D multitransient LCM is similar to averaged 1D-LCM. This validation on a simplified scenario serves as a necessary basis for further applications of 2D modeling., (© 2024 International Society for Magnetic Resonance in Medicine.)
- Published
- 2024
23. Generalized single index modeling of longitudinal data with multiple binary responses.
- Author
- Tian Z and Qiu P
- Subjects
- Longitudinal Studies, Humans, Linear Models, Computer Simulation, Data Interpretation, Statistical, Models, Statistical
- Abstract
In health and clinical research, medical indices (eg, BMI) are commonly used for monitoring and/or predicting health outcomes of interest. While single-index modeling can be used to construct such indices, methods to use single-index models for analyzing longitudinal data with multiple correlated binary responses are underdeveloped, although there are abundant applications with such data (eg, prediction of multiple medical conditions based on longitudinally observed disease risk factors). This article aims to fill the gap by proposing a generalized single-index model that can incorporate multiple single indices and mixed effects for describing observed longitudinal data of multiple binary responses. Compared to the existing methods focusing on constructing marginal models for each response, the proposed method can make use of the correlation information in the observed data about different responses when estimating different single indices for predicting response variables. Estimation of the proposed model is achieved by using a local linear kernel smoothing procedure, together with methods designed specifically for estimating single-index models and traditional methods for estimating generalized linear mixed models. Numerical studies show that the proposed method is effective in various cases considered. It is also demonstrated using a dataset from the English Longitudinal Study of Aging project., (© 2024 John Wiley & Sons Ltd.)
- Published
- 2024
24. deepAFT: A nonlinear accelerated failure time model with artificial neural network.
- Author
- Norman PA, Li W, Jiang W, and Chen BE
- Subjects
- Humans, Survival Analysis, Deep Learning, Models, Statistical, Neural Networks, Computer, Proportional Hazards Models, Algorithms, Computer Simulation, Nonlinear Dynamics
- Abstract
The Cox regression model or accelerated failure time regression models are often used for describing the relationship between survival outcomes and potential explanatory variables. These models assume the studied covariates are connected to the survival time, its distribution, or a transformation thereof through a function of linear regression form. In this article, we propose nonparametric, nonlinear algorithms (deepAFT methods) based on deep artificial neural networks to model survival outcome data in the broad distribution family of accelerated failure time models. The proposed methods predict survival outcomes directly and tackle the problem of censoring via an imputation algorithm as well as re-weighting and transformation techniques based on the inverse probabilities of censoring. Through extensive simulation studies, we confirm that the proposed deepAFT methods achieve accurate predictions. They outperform the existing regression models in prediction accuracy, while being flexible and robust in modeling covariate effects of various nonlinear forms. Their prediction performance is comparable to other established deep learning methods such as deepSurv and random survival forest methods. Even though the direct output is the expected survival time, the proposed AFT methods also provide predictions for distributional functions such as the cumulative hazard and survival functions without additional learning efforts. For situations where the popular Cox regression model may not be appropriate, the deepAFT methods provide useful and effective alternatives, as shown in simulations and demonstrated in applications to a lymphoma clinical trial study., (© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.)
- Published
- 2024
25. Assurance methods for designing a clinical trial with a delayed treatment effect.
- Author
- Salsbury JA, Oakley JE, Julious SA, and Hampson LV
- Subjects
- Humans, Sample Size, Models, Statistical, Neoplasms drug therapy, Neoplasms therapy, Clinical Trials, Phase III as Topic methods, Clinical Trials, Phase III as Topic statistics & numerical data, Clinical Trials as Topic methods, Computer Simulation, Antineoplastic Agents therapeutic use, Time Factors, Survival Analysis, Treatment Delay, Bayes Theorem, Research Design
- Abstract
An assurance calculation is a Bayesian alternative to a power calculation. One may be performed to aid the planning of a clinical trial, specifically to set the sample size, or to support decisions about whether or not to perform a study. Immuno-oncology is a rapidly evolving area in the development of anticancer drugs. A common phenomenon that arises in trials of such drugs is one of delayed treatment effects, that is, there is a delay in the separation of the survival curves. To calculate assurance for a trial in which a delayed treatment effect is likely to be present, uncertainty about key parameters needs to be considered. If uncertainty is not considered, the number of patients recruited may not be enough to ensure adequate statistical power to detect a clinically relevant treatment effect, and the risk of an unsuccessful trial is increased. We present a new elicitation technique for when a delayed treatment effect is likely and show how to compute assurance using these elicited prior distributions. We provide an example to illustrate how this can be used in practice and develop open-source software to implement our methods. Our methodology has the potential to improve the success rate and efficiency of Phase III trials in immuno-oncology and for other treatments where a delayed treatment effect is expected to occur., (© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.)
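The mechanics of an assurance calculation are easy to sketch: draw the uncertain quantities (here the delay and the post-delay hazard ratio) from priors, simulate the trial, and average the success indicator. Every prior and design number below is illustrative rather than elicited, and censoring and accrual are ignored for brevity.

```r
# Assurance by simulation for a delayed-treatment-effect survival trial.
# All priors/design values are invented; no censoring or staggered accrual.
library(survival)
set.seed(2024)
n_per_arm <- 300

success <- replicate(1000, {
  delay <- rgamma(1, shape = 4, rate = 1)   # prior on the delay (months)
  hr    <- rlnorm(1, log(0.7), 0.2)         # prior on post-delay hazard ratio
  ctrl  <- rexp(n_per_arm, rate = 0.05)
  trt   <- rexp(n_per_arm, rate = 0.05)     # pre-delay: same hazard as control
  late  <- trt > delay                      # subjects surviving past the delay
  trt[late] <- delay + rexp(sum(late), rate = 0.05 * hr)  # reduced hazard after
  d  <- data.frame(time = c(ctrl, trt), status = 1,
                   arm  = rep(0:1, each = n_per_arm))
  lr <- survdiff(Surv(time, status) ~ arm, data = d)      # log-rank test
  pchisq(lr$chisq, df = 1, lower.tail = FALSE) < 0.05     # trial "success"
})
mean(success)  # assurance: prior-averaged probability of a significant trial
```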
- Published
- 2024
26. Bayesian mixture modelling with ranked set samples.
- Author
- Alvandi A, Omidvar S, Hatefi A, Jafari Jozani M, Ozturk O, and Nematollahi N
- Subjects
- Humans, Female, Middle Aged, Aged, Computer Simulation, Monte Carlo Method, Likelihood Functions, Markov Chains, Bayes Theorem, Models, Statistical, Algorithms
- Abstract
We consider the Bayesian estimation of the parameters of a finite mixture model from independent order statistics arising from imperfect ranked set sampling (RSS) designs. As a cost-effective method, ranked set sampling enables us to incorporate easily attainable characteristics as ranking information into data collection and Bayesian estimation. To handle the special structure of the ranked set samples, we develop a Bayesian estimation approach that exploits the expectation-maximization (EM) algorithm to estimate the ranking parameters and Metropolis-within-Gibbs sampling to estimate the parameters of the underlying mixture model. Our findings show that the proposed RSS-based Bayesian estimation method outperforms the commonly used Bayesian counterpart based on simple random sampling. The developed method is finally applied to estimate the bone disorder status of women aged 50 and older., (© 2024 John Wiley & Sons Ltd.)
- Published
- 2024
27. A multi-arm multi-stage platform design that allows preplanned addition of arms while still controlling the family-wise error.
- Author
- Greenstreet P, Jaki T, Bedding A, Harbron C, and Mozgunov P
- Subjects
- Humans, Sample Size, Computer Simulation, Models, Statistical, Clinical Trials as Topic methods, Randomized Controlled Trials as Topic methods, Randomized Controlled Trials as Topic statistics & numerical data, Research Design
- Abstract
There is growing interest in platform trials that allow new treatment arms to be added as the trial progresses, as well as allowing treatments to be stopped partway through the trial for either lack of benefit (futility) or superiority. In some situations, platform trials need to guarantee that error rates are controlled. This paper presents a multi-stage design that allows additional arms to be added to a platform trial in a preplanned fashion, while still controlling the family-wise error rate, under the assumptions that the number and timing of the treatments to be added are known and that there are no time trends. A method is given to compute the sample size required to achieve a desired level of power, and we show how the distribution of the sample size and the expected sample size can be found. We focus on power under the least favorable configuration, which is the power of finding the treatment with a clinically relevant effect out of a set of treatments while the rest have an uninteresting treatment effect. A motivating trial is presented which focuses on two settings: the first has a set number of stages per active treatment arm, and the second has a set total number of stages, with treatments that are added later getting fewer stages. Compared to Bonferroni, the savings in the total maximum sample size are modest in a trial with three arms (<1% of the total sample size). However, the savings are more substantial in trials with more arms., (© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.)
- Published
- 2024
28. Nonparametric empirical Bayes biomarker imputation and estimation.
- Author
- Barbehenn A and Zhao SD
- Subjects
- Humans, Models, Statistical, Statistics, Nonparametric, Data Interpretation, Statistical, Bayes Theorem, Biomarkers analysis, Computer Simulation
- Abstract
Biomarkers are often measured in bulk to diagnose patients, monitor patient conditions, and research novel drug pathways. The measurement of these biomarkers often suffers from detection limits that result in missing and untrustworthy measurements. Frequently, missing biomarkers are imputed so that downstream analysis can be conducted with modern statistical methods that cannot normally handle data subject to informative censoring. This work develops an empirical Bayes g-modeling method for imputing and denoising biomarker measurements. We establish superior estimation properties compared to popular methods in simulations and with real data, providing useful biomarker estimates for downstream analysis., (© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.)
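A toy version of g-modeling with a detection limit fits in a few lines and may clarify the idea (this is a generic Kiefer-Wolfowitz/NPMLE-style EM on a grid, not the authors' estimator): estimate the prior g nonparametrically, then impute and denoise via posterior means.

```r
# Toy empirical Bayes g-modeling with a left detection limit L; unit-variance
# normal measurement error assumed. Illustration, not the paper's method.
g_model_em <- function(z, detected, L, grid = seq(-3, 6, length.out = 60),
                       iters = 200) {
  # likelihood of each observation under each grid value of the true level:
  # normal density if detected, P(Z < L | level) if below the limit
  lik <- outer(z, grid, function(zi, u) dnorm(zi - u))
  lik[!detected, ] <- matrix(pnorm(L - grid), nrow = sum(!detected),
                             ncol = length(grid), byrow = TRUE)
  g <- rep(1 / length(grid), length(grid))        # uniform starting prior
  for (it in seq_len(iters)) {
    r <- t(t(lik) * g); r <- r / rowSums(r)       # E-step: responsibilities
    g <- colMeans(r)                              # M-step: update prior mass
  }
  r <- t(t(lik) * g); r <- r / rowSums(r)
  list(g = g, denoised = as.vector(r %*% grid))   # posterior-mean estimates
}

set.seed(3)
mu  <- rnorm(400, mean = 2)            # true biomarker levels
z   <- mu + rnorm(400)                 # noisy measurements
det <- z > 0.5                         # detection limit at 0.5
fit <- g_model_em(z, det, L = 0.5)
head(fit$denoised)                     # imputed/denoised biomarker values
```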
- Published
- 2024
29. Improved mortality analysis in early-phase dose-ranging clinical trials for emergency medical diseases using Bayesian time-to-event models with active comparators.
- Author
- Shi X, Wick JA, Martin RL, Beall J, Silbergleit R, Rockswold GL, Barsan WG, Korley FK, Rockswold S, and Gajewski BJ
- Subjects
- Humans, Computer Simulation, Randomized Controlled Trials as Topic, Brain Injuries, Traumatic mortality, Brain Injuries, Traumatic therapy, Brain Injuries, Traumatic drug therapy, Time Factors, Bayes Theorem, Models, Statistical, Clinical Trials, Phase II as Topic methods, Clinical Trials, Phase II as Topic statistics & numerical data
- Abstract
Emergency medical diseases (EMDs) are the leading cause of death worldwide. A time-to-death analysis is needed to accurately identify the risks and describe the pattern of an EMD because the mortality rate can peak early and then decline. Dose-ranging Phase II clinical trials are essential for developing new therapies for EMDs. However, most dose-finding trials do not analyze mortality as a time-to-event endpoint. We propose three Bayesian dose-response time-to-event models for a secondary mortality analysis of a clinical trial: a two-group (active treatment vs control) model, a three-parameter sigmoid EMAX model, and a hierarchical EMAX model. The study also incorporates one specific active treatment as an active comparator in constructing three new models. We evaluated the performance of these six models and a very popular independent model using simulated data motivated by a randomized Phase II clinical trial focused on identifying the most effective hyperbaric oxygen dose to achieve favorable functional outcomes in patients with severe traumatic brain injury. The results show that the three-group, EMAX, and EMAX model with an active comparator produce the smallest averaged mean squared errors and smallest mean absolute biases. We provide a new approach for time-to-event analysis in early-phase dose-ranging clinical trials for EMDs. The EMAX model with an active comparator can provide valuable insights into the mortality analysis of new EMDs or other conditions that have changing risks over time. The restricted mean survival time, a function of the model's hazards, is recommended for displaying treatment effects for EMD research., (© 2024 John Wiley & Sons Ltd.)
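For readers unfamiliar with the EMAX form, a three-parameter version on the log-hazard scale looks like the sketch below; the parameter values are invented, and the paper's models add priors, hierarchy, and the active-comparator structure on top.

```r
# Three-parameter EMAX dose-response on the log-hazard scale (toy values).
emax_loghaz <- function(dose, e0, emax, ed50) {
  e0 + emax * dose / (ed50 + dose)   # monotone, saturating dose effect
}
doses <- c(0, 1.0, 1.5, 2.0, 2.5)    # candidate dose levels
exp(emax_loghaz(doses, e0 = log(0.10), emax = -0.8, ed50 = 1))  # hazard rates
```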
- Published
- 2024
30. A sparse factor model for clustering high-dimensional longitudinal data.
- Author
- Lu Z and Chandra NK
- Subjects
- Longitudinal Studies, Cluster Analysis, Humans, Computer Simulation, Bayes Theorem, Models, Statistical
- Abstract
Recent advances in engineering technologies have enabled the collection of a large number of longitudinal features. This wealth of information presents unique opportunities for researchers to investigate the complex nature of diseases and uncover underlying disease mechanisms. However, analyzing such data can be difficult due to its high dimensionality and heterogeneity and the computational challenges involved. In this article, we propose a Bayesian nonparametric mixture model for clustering high-dimensional mixed-type (eg, continuous, discrete and categorical) longitudinal features. We employ a sparse factor model on the joint distribution of random effects, and the key idea is to induce clustering at the latent factor level instead of the original data to escape the curse of dimensionality. The number of clusters is estimated through a Dirichlet process prior. An efficient Gibbs sampler is developed to estimate the posterior distribution of the model parameters. Analysis of real and simulated data is presented and discussed. Our study demonstrates that the proposed model serves as a useful analytical tool for clustering high-dimensional longitudinal data., (© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.)
- Published
- 2024
31. Sensitivity analysis for principal ignorability violation in estimating complier and noncomplier average causal effects.
- Author
- Nguyen TQ, Stuart EA, Scharfstein DO, and Ogburn EL
- Subjects
- Humans, Models, Statistical, Data Interpretation, Statistical, Odds Ratio, Computer Simulation, Patient Compliance statistics & numerical data, Causality
- Abstract
An important strategy for identifying principal causal effects (popular estimands in settings with noncompliance) is to invoke the principal ignorability (PI) assumption. As PI is untestable, it is important to gauge how sensitive effect estimates are to its violation. We focus on this task for the common one-sided noncompliance setting where there are two principal strata, compliers and noncompliers. Under PI, compliers and noncompliers share the same outcome-mean-given-covariates function under the control condition. For sensitivity analysis, we allow this function to differ between compliers and noncompliers in several ways, indexed by an odds ratio, a generalized odds ratio, a mean ratio, or a standardized mean difference sensitivity parameter. We tailor sensitivity analysis techniques (with any sensitivity parameter choice) to several types of PI-based main analysis methods, including outcome regression, influence function (IF) based, and weighting methods. We discuss range selection for the sensitivity parameter. We illustrate the sensitivity analyses with several outcome types from the JOBS II study. This application estimates nuisance functions parametrically, for simplicity and accessibility. In addition, we establish rate conditions on nonparametric nuisance estimation for IF-based estimators to be asymptotically normal, with a view to informing nonparametric inference., (© 2024 John Wiley & Sons Ltd.)
- Published
- 2024
32. Data fusion for predicting long-term program impacts.
- Author
- Robbins MW, Bauhoff S, and Burgette L
- Subjects
- Humans, Oregon, Computer Simulation, Mortality, Longitudinal Studies, United States, Models, Statistical, Insurance, Health statistics & numerical data
- Abstract
Policymakers often require information on programs' long-term impacts that is not available when decisions are made. For example, while rigorous evidence from the Oregon Health Insurance Experiment (OHIE) shows that having health insurance influences short-term health and financial measures, the impact on long-term outcomes, such as mortality, will not be known for many years following the program's implementation. We demonstrate how data fusion methods may be used to address the problem of missing final outcomes and predict long-run impacts of interventions before the requisite data are available. We implement this method by concatenating data on an intervention (such as the OHIE) with auxiliary long-term data and then imputing missing long-term outcomes using short-term surrogate outcomes while approximating uncertainty with replication methods. We use simulations to examine the performance of the methodology and apply the method in a case study. Specifically, we fuse data on the OHIE with data from the National Longitudinal Mortality Study and estimate that being eligible to apply for subsidized health insurance will lead to a statistically significant improvement in long-term mortality., (© 2024 John Wiley & Sons Ltd.)
- Published
- 2024
33. Two-stage randomized clinical trials with a right-censored endpoint: Comparison of frequentist and Bayesian adaptive designs.
- Author
- Boumendil L, Chevret S, Lévy V, and Biard L
- Subjects
- Humans, Sample Size, Research Design, Endpoint Determination, Leukemia, Lymphocytic, Chronic, B-Cell drug therapy, Models, Statistical, Bayes Theorem, Randomized Controlled Trials as Topic statistics & numerical data, Computer Simulation
- Abstract
Adaptive randomized clinical trials are of major interest when dealing with a time-to-event outcome in a prolonged observation window. No consensus exists either to define stopping boundaries or to combine p values or test statistics in the terminal analysis in the case of a frequentist design and sample size adaptation. In a one-sided setting, we compared three frequentist approaches using stopping boundaries relying on α-spending functions and a Bayesian monitoring setting with boundaries based on the posterior distribution of the log-hazard ratio. All designs comprised a single interim analysis with an efficacy stopping rule and the possibility of sample size adaptation at this interim step. Three frequentist approaches were defined based on the terminal analysis: combination of stagewise statistics (Wassmer) or of p values (Desseaux), or on patientwise splitting (Jörgens), and we compared the results with those of the Bayesian monitoring approach (Freedman). These different approaches were evaluated in a simulation study and then illustrated on a real dataset from a randomized clinical trial conducted in elderly patients with chronic lymphocytic leukemia. All approaches controlled for the type I error rate, except for the Bayesian monitoring approach, and yielded satisfactory power. It appears that the frequentist approaches are the best in underpowered trials. The power of all the approaches was affected by the violation of the proportional hazards (PH) assumption. For adaptive designs with a survival endpoint and a one-sided alternative hypothesis, the Wassmer and Jörgens approaches after sample size adaptation should be preferred, unless violation of PH is suspected., (© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.)
- Published
- 2024
- Full Text
- View/download PDF
34. Non-parametric inference on calibration of predicted risks.
- Author
-
Sadatsafavi M and Petkau J
- Subjects
- Humans, Risk Assessment methods, Myocardial Infarction mortality, Statistics, Nonparametric, Calibration, Probability, Models, Statistical, Computer Simulation
- Abstract
Moderate calibration, the property that among observations with predicted probability $z$ the expected event probability equals $z$, is a desired property of risk prediction models. Current graphical and numerical techniques for evaluating moderate calibration of risk prediction models are mostly based on smoothing or grouping the data. Moreover, there is no widely accepted inferential method for the null hypothesis that a model is moderately calibrated. In this work, we discuss recently developed methods, and propose novel ones, for the assessment of moderate calibration for binary responses. The methods are based on the limiting distributions of functions of standardized partial sums of prediction errors, which converge to the corresponding laws of Brownian motion. The novel method relies on well-known properties of the Brownian bridge, which enables joint inference on the mean and moderate calibration, leading to a unified "bridge" test for detecting miscalibration. Simulation studies indicate that the bridge test is more powerful, often substantially, than the alternative test. As a case study we consider a prediction model for short-term mortality after a heart attack, where we provide suggestions on graphical presentation and the interpretation of results. Moderate calibration can thus be assessed without arbitrary grouping of the data or methods that require tuning of parameters., (© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.)
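To illustrate the underlying construction (our own toy example, not the authors' exact statistic): order observations by predicted risk and track the standardized partial sums of prediction errors; for a moderately calibrated model this process behaves approximately like Brownian motion, so supremum-type functionals of it yield test statistics.

```r
# Toy illustration: standardized partial sums of prediction errors (y - p),
# ordered by predicted risk, for a well-calibrated model.
set.seed(2)
p <- runif(1000)
y <- rbinom(1000, 1, p)                           # outcomes consistent with p
o <- order(p)
w <- cumsum((y - p)[o]) / sqrt(sum(p * (1 - p)))  # standardized partial sums
max(abs(w))                                       # supremum-type statistic
```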
- Published
- 2024
- Full Text
- View/download PDF
35. Conditional score approaches to errors-in-variables competing risks data in discrete time.
- Author
-
Wen CC and Chen YH
- Subjects
- Humans, Survival Analysis, Algorithms, Models, Statistical, Regression Analysis, Risk Assessment methods, Scleroderma, Systemic, Computer Simulation, Proportional Hazards Models
- Abstract
Analysis of competing risks data has been an important topic in survival analysis due to the need to account for the dependence among competing events. Event times are also often recorded on discrete time scales, making models tailored to this discrete-time nature useful in the practice of survival analysis. In this work, we focus on regression analysis with discrete-time competing risks data and consider the errors-in-variables issue, where the covariates are prone to measurement error. Viewing the true covariate value as a parameter, we develop conditional score methods for various discrete-time competing risks models, including the cause-specific and subdistribution hazards models that are popular in competing risks data analysis. The proposed estimators can be implemented with efficient computational algorithms, and the associated large-sample theory is readily obtained. Simulation results show satisfactory finite-sample performance, and an application to competing risks data from the scleroderma lung study demonstrates the utility of the proposed methods., (© 2024 John Wiley & Sons Ltd.)
- Published
- 2024
- Full Text
- View/download PDF
36. REDOMA: Bayesian random-effects dose-optimization meta-analysis using spike-and-slab priors.
- Author
-
Yang CH, Kwiatkowski E, Lee JJ, and Lin R
- Subjects
- Humans, Neoplasms drug therapy, Meta-Analysis as Topic, Computer Simulation, Clinical Trials, Phase I as Topic methods, Antineoplastic Agents therapeutic use, Antineoplastic Agents administration & dosage, Clinical Trials, Phase II as Topic methods, Models, Statistical, Bayes Theorem, Dose-Response Relationship, Drug
- Abstract
The rise of cutting-edge precision cancer treatments has given growing significance to the optimal biological dose (OBD) in modern oncology trials. These trials now prioritize the simultaneous consideration of both toxicity and efficacy when determining the most desirable dosage for treatment. Traditional approaches in early-phase oncology trials have relied on the assumption of a monotone relationship between treatment efficacy and dosage. However, this assumption may not hold for novel oncology therapies. In reality, the dose-efficacy curve of such treatments may reach a plateau at a specific dose, posing challenges for conventional methods in accurately identifying the OBD. Furthermore, reliable identification of the OBD is typically not possible from a single small-sample trial. With data from multiple phase I and phase I/II trials, we propose a novel Bayesian random-effects dose-optimization meta-analysis (REDOMA) approach to identify the OBD by synthesizing toxicity and efficacy data from each trial. The REDOMA method can address trials with heterogeneous characteristics. We adopt a curve-free approach based on a Gamma process prior to model the average dose-toxicity relationship. In addition, we utilize a Bayesian model selection framework that uses the spike-and-slab prior as an automatic variable selection technique to eliminate monotonic constraints on the dose-efficacy curve. The good performance of the REDOMA method is confirmed by extensive simulation studies., (© 2024 John Wiley & Sons Ltd.)
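As a reference point, a generic spike-and-slab prior for a dose-efficacy coefficient $\beta_j$ takes the standard form (the abstract does not give the paper's exact parameterization):

$$ \beta_j \mid \gamma_j \sim \gamma_j \, \mathcal{N}(0, \tau^2) + (1 - \gamma_j) \, \delta_0, \qquad \gamma_j \sim \operatorname{Bernoulli}(\pi), $$

where $\delta_0$ is a point mass at zero (the spike) and the indicator $\gamma_j$ selects whether dose level $j$ contributes an efficacy increment, so the dose-efficacy curve is free of built-in monotonicity constraints.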
- Published
- 2024
- Full Text
- View/download PDF
37. Bayesian transition models for ordinal longitudinal outcomes.
- Author
-
Rohde MD, French B, Stewart TG, and Harrell FE Jr
- Subjects
- Humans, Longitudinal Studies, COVID-19 Drug Treatment, SARS-CoV-2, Bayes Theorem, Models, Statistical, COVID-19
- Abstract
Ordinal longitudinal outcomes are becoming common in clinical research, particularly in the context of COVID-19 clinical trials. These outcomes are information-rich and can increase the statistical efficiency of a study when analyzed in a principled manner. We present Bayesian ordinal transition models as a flexible modeling framework to analyze ordinal longitudinal outcomes. We develop the theory from first principles and provide an application using data from the Adaptive COVID-19 Treatment Trial (ACTT-1) with code examples in R. We advocate that researchers use ordinal transition models to analyze ordinal longitudinal outcomes when appropriate alongside standard methods such as time-to-event modeling., (© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.)
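To show the core idea of a first-order transition model, the current ordinal state is regressed on treatment and the previous state. The R sketch below uses simulated data and a frequentist proportional-odds fit via MASS::polr as a simple analogue; the paper's Bayesian implementation is not reproduced here.

```r
# First-order ordinal transition model: current state ~ treatment + previous state.
library(MASS)
set.seed(3)
n <- 400
d <- data.frame(trt  = rbinom(n, 1, 0.5),
                prev = factor(sample(1:4, n, replace = TRUE)))
# simulate the next state as a small random move from the previous state
d$y <- factor(pmin(4, pmax(1, as.integer(d$prev) + sample(-1:1, n, replace = TRUE))),
              levels = 1:4, ordered = TRUE)
fit <- polr(y ~ trt + prev, data = d)  # proportional-odds transition model
summary(fit)
```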
- Published
- 2024
- Full Text
- View/download PDF
38. Categorical linkage-data analysis.
- Author
-
Zhang LC and Tuoto T
- Subjects
- Humans, Data Interpretation, Statistical, Probability, Medical Record Linkage methods, Models, Statistical, Computer Simulation
- Abstract
Analysis of integrated data often requires record linkage in order to join together data residing in separate sources. When linkage errors cannot be avoided, owing to the lack of a unique identity key that can link the records unequivocally, standard statistical techniques may produce misleading inference if the linked data are treated as if they were true observations. In this paper, we propose methods for categorical data analysis based on linked data that are not prepared by the analyst, such that neither the match-key variables nor the unlinked records are available. The adjustment is based on the proportion of false links in the linked file, and our approach allows the probability of correct linkage to vary across records without requiring that this probability be estimated for each individual record. It also accommodates the general situation where unmatched records, which cannot possibly be correctly linked, exist in all the sources. The proposed methods are studied by simulation and applied to real data., (© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.)
- Published
- 2024
- Full Text
- View/download PDF
39. Adding experimental treatment arms to multi-arm multi-stage platform trials in progress.
- Author
-
Burnett T, König F, and Jaki T
- Subjects
- Humans, Computer Simulation, Clinical Trials as Topic methods, Models, Statistical, Research Design
- Abstract
Multi-arm multi-stage (MAMS) platform trials efficiently compare several treatments with a common control arm. Crucially, MAMS designs allow adjustment for multiplicity where required. If, for example, the active treatment arms in a clinical trial relate to different dose levels or different routes of administration of a drug, strict control of the family-wise error rate (FWER) is paramount. If a further treatment becomes available, it is desirable to add it to the trial already in progress, to access both the practical and statistical benefits of the MAMS design. In any setting where control of the error rate is required, we must add the corresponding hypotheses without compromising the validity of the testing procedure. To strongly control the FWER, MAMS designs use pre-planned decision rules that determine the recruitment of the next stage of the trial based on the available data. The addition of a treatment arm presents an unplanned change to the design that must be accounted for in the testing procedure. We demonstrate the use of the conditional error approach to add hypotheses to any testing procedure that strongly controls the FWER, and we use this framework to add treatments to a MAMS trial in progress. Simulations illustrate the possible characteristics of such procedures., (© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.)
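For orientation, the conditional error principle underlying this kind of design modification can be stated in a generic form (not the paper's specific derivation): given interim data $x_1$, compute the conditional probability, under the null, that the pre-planned test would reject,

$$ A(x_1) = P_{H_0}\left(\text{reject } H_0 \mid x_1\right). $$

Any modified continuation of the trial, including one with added hypotheses, preserves the overall error rate provided its conditional rejection probability given $x_1$ does not exceed $A(x_1)$.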
- Published
- 2024
- Full Text
- View/download PDF
40. Bayesian hierarchical profile regression for binary covariates.
- Author
-
Beall J, Li H, Martin-Harris B, Neelon B, Elm J, Graboyes E, and Hill E
- Subjects
- Humans, Regression Analysis, Female, Models, Statistical, Male, Cluster Analysis, Bayes Theorem, Deglutition Disorders
- Abstract
Dysphagia, a common result of other medical conditions, is caused by malfunctions in swallowing physiology resulting in difficulty eating and drinking. The Modified Barium Swallow Study (MBSS), the most commonly used diagnostic tool for evaluating dysphagia, can be assessed using the Modified Barium Swallow Impairment Profile (MBSImP™). The MBSImP assessment tool consists of a hierarchical grouped data structure with multiple domains, a set of components within each domain which characterize specific swallowing physiologies, and a set of tasks scored on a discrete scale within each component. We lack sophisticated approaches to extract patterns of physiologic swallowing impairment from the MBSImP task scores within a component while still recognizing the nested structure of components within a domain. We propose a Bayesian hierarchical profile regression model, which uses a Bayesian profile regression model in conjunction with a hierarchical Dirichlet process mixture model to (1) cluster subjects into impairment profile patterns while respecting the hierarchical grouped data structure of the MBSImP, and (2) simultaneously determine associations between latent profile cluster membership for all components and the outcome of dysphagia severity. We apply our approach to a cohort of patients referred for an MBSS and assessed using the MBSImP. Our research results can be used to inform appropriate intervention strategies, and provide tools for clinicians to make better multidimensional management and treatment decisions for patients with dysphagia., (© 2024 John Wiley & Sons Ltd.)
- Published
- 2024
- Full Text
- View/download PDF
41. Familywise error for multiple time-to-event endpoints in a group sequential design.
- Author
-
Thomsen HF, Lausvig NL, Pipper CB, Andersen S, Damgaard LH, Emerson SS, and Ravn H
- Subjects
- Humans, Research Design, Models, Statistical, Proportional Hazards Models, Data Interpretation, Statistical, Computer Simulation, Endpoint Determination methods
- Abstract
We investigate the familywise error rate (FWER) for time-to-event endpoints evaluated using a group sequential design with a hierarchical testing procedure for secondary endpoints. We show that, in this setup, the correlation between the log-rank test statistics at interim and at the end of the study does not agree with the canonical correlation derived for normally distributed endpoints. We show, both theoretically and by simulation, that the correlation also depends on the level of censoring, the hazard rates of the endpoints, and the hazard ratio. To optimize operating characteristics in this complex scenario, we propose a simulation-based method for assessing the FWER which can inform the choice of critical values for testing secondary endpoints better than the alpha-spending approach., (© 2024 John Wiley & Sons Ltd.)
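A toy sketch of the simulation-based idea in R: simulate many trials under the global null, apply the interim and final decision rules, and record whether any rejection occurred. The correlation (0.5) and boundaries (2.80, 1.98) here are illustrative placeholders; for survival endpoints the joint distribution of the statistics should be generated from the censoring, hazard, and hazard-ratio structure rather than assumed canonical.

```r
# Simulation-based assessment of the overall type I error of a
# two-look testing rule under the global null (toy example).
set.seed(4)
rejected <- replicate(1e5, {
  z1 <- rnorm(1)                               # interim test statistic
  z2 <- 0.5 * z1 + sqrt(1 - 0.5^2) * rnorm(1)  # final statistic, corr 0.5
  (z1 > 2.80) || (z2 > 1.98)                   # example stopping boundaries
})
mean(rejected)  # empirical overall error rate
```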
- Published
- 2024
- Full Text
- View/download PDF
42. Model-based bioequivalence approach for sparse pharmacokinetic bioequivalence studies: Model selection or model averaging?
- Author
-
Philipp M, Tessier A, Donnelly M, Fang L, Feng K, Zhao L, Grosser S, Sun G, Sun W, Mentré F, and Bertrand J
- Subjects
- Humans, Pharmacokinetics, Therapeutic Equivalency, Computer Simulation, Cross-Over Studies, Models, Statistical
- Abstract
Conventional pharmacokinetic (PK) bioequivalence (BE) studies aim to compare the rate and extent of drug absorption from a test (T) and reference (R) product using non-compartmental analysis (NCA) and the two one-sided test (TOST). Recently published regulatory guidance recommends alternative model-based (MB) approaches for BE assessment when NCA is challenging, as for long-acting injectables and products which require sparse PK sampling. However, our previous research on MB-TOST approaches showed that model misspecification can lead to inflated type I error. The objective of this research was to compare the performance of model selection (MS) on R product arm data and model averaging (MA) from a pool of candidate structural PK models in MBBE studies with sparse sampling. Our simulation study was inspired by a real case BE study using a two-way crossover design. PK data were simulated using three structural models under the null hypothesis and one model under the alternative hypothesis. MB-TOST was applied either using each of the five candidate models or following MS and MA with or without the simulated model in the pool. Assuming T and R have the same PK model, our simulation shows that following MS and MA, MB-TOST controls type I error rates at or below 0.05 and attains similar or even higher power than when using the simulated model. Thus, we propose to use MS prior to MB-TOST for BE studies with sparse PK sampling and to consider MA when candidate models have similar Akaike information criterion., (© 2024 Servier and The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd. This article has been contributed to by U.S. Government employees and their work is in the public domain in the USA.)
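For orientation, the classical TOST decision on the log scale, with the standard 80%-125% bioequivalence limits, can be sketched in a few lines of R. This is the conventional test that the model-based variants extend, not the MB-TOST procedure itself.

```r
# TOST for bioequivalence on log-transformed exposure:
# reject both one-sided nulls (ratio <= 0.80, ratio >= 1.25) to conclude BE.
tost_be <- function(est_logratio, se, df, limits = log(c(0.80, 1.25))) {
  t_lo <- (est_logratio - limits[1]) / se  # H0: ratio <= 0.80
  t_hi <- (est_logratio - limits[2]) / se  # H0: ratio >= 1.25
  max(pt(t_lo, df, lower.tail = FALSE), pt(t_hi, df))  # overall TOST p-value
}
tost_be(est_logratio = log(1.05), se = 0.08, df = 22)
```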
- Published
- 2024
- Full Text
- View/download PDF
43. A Bayesian method to detect drug-drug interaction using external information for spontaneous reporting system.
- Author
-
Tada K, Maruo K, and Gosho M
- Subjects
- Humans, Drug-Related Side Effects and Adverse Reactions, Databases, Factual, Models, Statistical, United States, Bayes Theorem, Drug Interactions, Adverse Drug Reaction Reporting Systems statistics & numerical data, Computer Simulation
- Abstract
Because safety assessment in pre-marketing clinical trials is insufficient, further assessment of drugs is required after marketing. In addition to adverse drug reactions (ADRs) induced by a single drug, ADRs induced by drug-drug interactions (DDIs) should also be investigated. The spontaneous reporting system (SRS) is a powerful tool for continually evaluating the safety of drugs. In this study, we propose a novel Bayesian method for detecting potential DDIs in a database collected by an SRS. By applying a power prior, the proposed method can borrow information from drugs similar to the drug under DDI assessment, increasing the sensitivity of detection. The method can also adjust the amount of information borrowed by tuning the parameters of the power prior. In a simulation study, we demonstrate this increase in sensitivity: depending on the scenario, the sensitivity of the proposed method exceeds that of an existing method by up to approximately 20 percentage points. We also illustrate the possibility of early detection of potential DDIs through an analysis of the database shared by the Food and Drug Administration. In conclusion, the proposed method has higher sensitivity and provides a novel criterion for detecting potential DDIs early, provided similar drugs have observed-to-expected ratios similar to that of the drug under assessment., (© 2024 John Wiley & Sons Ltd.)
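For reference, the generic power prior has the form

$$ \pi(\theta \mid D_0, a_0) \propto L(\theta \mid D_0)^{a_0} \, \pi_0(\theta), \qquad 0 \le a_0 \le 1, $$

where $D_0$ is the external data (here, reports on similar drugs), $\pi_0$ is the initial prior, and $a_0$ tunes the borrowing: $a_0 = 0$ ignores the external data and $a_0 = 1$ pools it fully. The paper's specific likelihood for SRS report counts is not reproduced here.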
- Published
- 2024
- Full Text
- View/download PDF
44. A Dynamic Prognostic Model for Identifying Vulnerable COVID-19 Patients at High Risk of Rapid Deterioration.
- Author
-
Anand P, D'Andrea E, Feldman W, Wang SV, Liu J, Brill G, DiCesare E, and Lin KJ
- Subjects
- Humans, Female, Male, Middle Aged, Prognosis, Aged, Massachusetts epidemiology, Electronic Health Records statistics & numerical data, Clinical Deterioration, Cohort Studies, Hospitalization statistics & numerical data, Severity of Illness Index, COVID-19 Vaccines administration & dosage, Models, Statistical, Adult, Risk Assessment, COVID-19 epidemiology, COVID-19 prevention & control, COVID-19 diagnosis
- Abstract
Purpose: We aimed to validate and, if performance was unsatisfactory, update the previously published prognostic model to predict clinical deterioration in patients hospitalized for COVID-19, using data following vaccine availability., Methods: Using electronic health records of patients ≥18 years with laboratory-confirmed COVID-19 from a large care-delivery network in Massachusetts, USA, from March 2020 to November 2021, we tested the performance of the previously developed prediction model and updated it by incorporating data collected after COVID-19 vaccines became available. We randomly divided the data into development (70%) and validation (30%) cohorts. We built a model predicting worsening on a published severity scale within 24 hours by LASSO regression and evaluated performance by the c-statistic and Brier score., Results: Our study cohort consisted of 8185 patients (development: 5730 patients [mean age: 62; 44% female]; validation: 2455 patients [mean age: 62; 45% female]). The previously published model had suboptimal performance using data after November 2020 (N = 4973, c-statistic = 0.60, Brier score = 0.11). After retraining with the new data, the updated model included 38 predictors, including 18 changing biomarkers. Patients hospitalized after June 1, 2021 (when COVID-19 vaccines became widely available in Massachusetts) were younger and had fewer comorbidities than those hospitalized before. The c-statistic and Brier score were 0.77 and 0.13 in the development cohort, and 0.73 and 0.14 in the validation cohort., Conclusion: The characteristics of patients hospitalized for COVID-19 differed substantially over time. We developed a new dynamic model for rapid progression with satisfactory performance in the validation set., (© 2024 John Wiley & Sons Ltd.)
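A minimal R sketch of the analysis pattern described above, on toy data: LASSO-penalized logistic regression fit on a 70% development split, with the Brier score and c-statistic computed on the 30% validation split. Variable names and data are ours, not the study's.

```r
# LASSO risk model with development/validation split and standard metrics.
library(glmnet)
set.seed(5)
x <- matrix(rnorm(2000 * 10), 2000, 10)
y <- rbinom(2000, 1, plogis(x[, 1] - 0.5 * x[, 2]))
dev <- sample(2000, 1400)                          # 70% development cohort
cv  <- cv.glmnet(x[dev, ], y[dev], family = "binomial")
p   <- predict(cv, x[-dev, ], s = "lambda.min", type = "response")
mean((p - y[-dev])^2)                              # Brier score
# c-statistic (ignoring ties): probability a case outranks a control
mean(outer(p[y[-dev] == 1], p[y[-dev] == 0], ">"))
```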
- Published
- 2024
- Full Text
- View/download PDF
45. Full random effects models (FREM): A practical usage guide.
- Author
-
Jonsson EN and Nyberg J
- Subjects
- Humans, Drug Development methods, Data Interpretation, Statistical, Models, Statistical
- Abstract
The full random-effects model (FREM) is a relatively novel covariate modeling technique. It differs from other covariate modeling approaches in that it treats covariates as observations and captures their impact on model parameters through their covariances. These characteristics make FREM insensitive to correlations between covariates and allow it to handle missing covariate data implicitly. In practice, this means covariates are less likely to be excluded from the modeling scope in light of the observed data. FREM has been shown to be a useful modeling method for small datasets, and its pre-specification properties make it a compelling choice for late-stage drug development. The present tutorial explains what FREM models are and how they can be used in practice., (© 2024 Pharmetheus AB. CPT: Pharmacometrics & Systems Pharmacology published by Wiley Periodicals LLC on behalf of American Society for Clinical Pharmacology and Therapeutics.)
- Published
- 2024
- Full Text
- View/download PDF
46. An enhanced cross-sectional HIV incidence estimator that incorporates prior HIV test results.
- Author
-
Bannick M, Donnell D, Hayes R, Laeyendecker O, and Gao F
- Subjects
- Humans, Incidence, Cross-Sectional Studies, Computer Simulation, Models, Statistical, Male, Randomized Controlled Trials as Topic, HIV Testing statistics & numerical data, Female, Sensitivity and Specificity, HIV Infections epidemiology, Algorithms
- Abstract
Incidence estimation of HIV infection can be performed using recent infection testing algorithm (RITA) results from a cross-sectional sample. This allows practitioners to understand population trends in the HIV epidemic without having to perform longitudinal follow-up on a cohort of individuals. The utility of the approach is limited by its precision, driven by the (low) sensitivity of the RITA at identifying recent infection. By utilizing results of previous HIV tests that individuals may have taken, we consider an enhanced RITA with increased sensitivity (and specificity). We use it to propose an enhanced estimator for incidence estimation. We prove the theoretical properties of the enhanced estimator and illustrate its numerical performance in simulation studies. We apply the estimator to data from a cluster-randomized trial to study the effect of community-level HIV interventions on HIV incidence. We demonstrate that the enhanced estimator provides a more precise estimate of HIV incidence compared to the standard estimator., (© 2024 John Wiley & Sons Ltd.)
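For context, a widely used cross-sectional incidence estimator with adjustment for imperfect recency classification (the standard estimator that the enhanced version improves upon; the notation is the conventional one, not taken from the paper) is

$$ \hat{\lambda} = \frac{N_{\text{recent}} - \beta_T N_{\text{pos}}}{N_{\text{neg}} \left(\Omega_T - \beta_T T\right)}, $$

where $N_{\text{recent}}$, $N_{\text{pos}}$, and $N_{\text{neg}}$ are the numbers classified RITA-recent, HIV-positive, and HIV-negative in the cross-sectional sample, $\Omega_T$ is the mean duration of recent infection, $\beta_T$ the false-recency rate, and $T$ the cut-off time. Incorporating prior HIV test results effectively sharpens the recency classification entering this formula, improving the sensitivity (and specificity) that drive the estimator's precision.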
- Published
- 2024
- Full Text
- View/download PDF
47. Detecting responsible nodes in differential Bayesian networks.
- Author
-
Huang X and Zhang H
- Subjects
- Humans, Computer Simulation, Bayes Theorem, Models, Statistical
- Abstract
To study the roles that different nodes play in differentiating Bayesian networks under two states, such as control versus disease, we formulate two node-specific scores to facilitate such assessment. The first score is motivated by the prediction invariance property of a causal model. The second score results from modifying an existing score constructed for differential analysis of undirected networks. We develop strategies based on these scores to identify nodes responsible for topological differences between two Bayesian networks. Synthetic data and real-life data from designed experiments are used to demonstrate the efficacy of the proposed methods in detecting responsible nodes., (© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.)
- Published
- 2024
- Full Text
- View/download PDF
48. Balancing versus modelling in weighted analysis of non-randomised studies with survival outcomes: A simulation study.
- Author
-
Filla T, Schwender H, and Kuss O
- Subjects
- Humans, Survival Analysis, Computer Simulation, Propensity Score, Models, Statistical
- Abstract
Weighting methods are widely used for causal effect estimation in non-randomised studies. In general, these methods use the propensity score (PS), the probability of receiving the treatment given the covariates, to construct the weights. All of these "modelling" methods actually optimize prediction of the respective outcome, which, for the PS model, is treatment assignment. However, this does not match the actual aim of weighting, which is to eliminate the association between covariates and treatment assignment. In the "balancing" approach, covariates are instead balanced directly by solving systems of numerical equations, explicitly without fitting a PS model. To compare modelling, balancing, and hybrid approaches to weighting, we performed a large simulation study for a binary treatment and a survival outcome. For maximal practical relevance, all simulation parameters were selected after a systematic review of medical studies that used PS methods for analysis. We also introduce a new hybrid method that combines the idea of the covariate balancing propensity score with matching weights, thus avoiding extreme weights. In addition, we present a corrected robust variance estimator for some of the methods. Overall, our simulation results indicate that balancing methods work worse than expected. However, among the balancing methods considered, entropy balancing consistently outperforms the variance balancing approach. All methods estimating the average treatment effect in the overlap population perform well, with very little bias and small standard errors, even in settings with misspecified propensity score models. Finally, coverage using the standard robust variance estimator was too high for all methods, with the proposed corrected robust variance estimator improving coverage in a variety of settings., (© 2024 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.)
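For reference, the matching weights mentioned above have a simple closed form, $w_i = \min(e_i, 1 - e_i) / \{Z_i e_i + (1 - Z_i)(1 - e_i)\}$ for propensity score $e_i$ and treatment indicator $Z_i$. A short R sketch on simulated data (variable names are ours):

```r
# Matching weights downweight units with extreme propensity scores,
# targeting an overlap-type population and avoiding extreme IPW weights.
set.seed(6)
n  <- 1000
x  <- rnorm(n)
z  <- rbinom(n, 1, plogis(0.8 * x))
e  <- fitted(glm(z ~ x, family = binomial))      # modelled propensity score
mw <- pmin(e, 1 - e) / ifelse(z == 1, e, 1 - e)  # matching weights
# balance check: weighted covariate means should be similar across arms
c(treated = weighted.mean(x[z == 1], mw[z == 1]),
  control = weighted.mean(x[z == 0], mw[z == 0]))
```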
- Published
- 2024
- Full Text
- View/download PDF
49. Modeling intra-individual inter-trial EEG response variability in autism.
- Author
-
Dong M, Telesca D, Guindani M, Sugar C, Webb SJ, Jeste S, Dickinson A, Levin AR, Shic F, Naples A, Faja S, Dawson G, McPartland JC, and Şentürk D
- Subjects
- Humans, Autistic Disorder physiopathology, Models, Statistical, Computer Simulation, Nonlinear Dynamics, Brain physiopathology, Electroencephalography, Autism Spectrum Disorder physiopathology
- Abstract
Autism spectrum disorder (autism) is a prevalent neurodevelopmental condition characterized by early-emerging impairments in social behavior and communication. EEG represents a powerful and non-invasive tool for examining functional brain differences in autism. Recent EEG evidence suggests that greater intra-individual trial-to-trial variability across EEG responses in stimulus-related tasks may characterize brain differences in autism. Traditional analysis of EEG data largely focuses on mean trends in the trial-averaged data; trial-level analysis is rarely performed due to the low neural signal-to-noise ratio. We propose to use nonlinear (shape-invariant) mixed effects (NLME) models to study intra-individual inter-trial EEG response variability using trial-level EEG data. By providing more precise metrics of response variability, this approach could enrich our understanding of neural disparities in autism and potentially aid the identification of objective markers. The proposed multilevel NLME models quantify variability in the signal's interpretable and widely recognized features (e.g., latency and amplitude) while also regularizing estimation based on noisy trial-level data. Even though NLME models have been studied for more than three decades, existing methods cannot scale to large datasets. We propose computationally feasible estimation and inference methods via a novel minorization-maximization (MM) algorithm. Extensive simulations demonstrate the efficacy of the proposed procedures. Applications to data from a large national consortium find that children with autism have larger intra-individual inter-trial variability in P1 latency in a visual evoked potential (VEP) task, compared to their neurotypical peers., (© 2024 John Wiley & Sons Ltd.)
- Published
- 2024
- Full Text
- View/download PDF
50. Fractional accumulative calibration-free odds (f-aCFO) design for delayed toxicity in phase I clinical trials.
- Author
-
Fang J and Yin G
- Subjects
- Humans, Research Design, Dose-Response Relationship, Drug, Calibration, Drug-Related Side Effects and Adverse Reactions, Models, Statistical, Time Factors, Clinical Trials, Phase I as Topic methods, Computer Simulation
- Abstract
The calibration-free odds (CFO) design has been demonstrated to be robust, model-free, and practically useful, but it faces challenges when dealing with late-onset toxicity. The time-to-event (TITE) method and the fractional method led to the development of the TITE-CFO and fractional CFO (fCFO) designs to accommodate delayed toxicity. Nevertheless, existing CFO-type designs have untapped potential because they primarily consider dose information from the current position and its two neighboring positions. To incorporate information from all doses, we propose the accumulative CFO (aCFO) design, which utilizes data at all dose levels, similar to a tug-of-war game in which players distant from the center also contribute their strength. This approach enhances full information utilization while preserving the model-free and calibration-free characteristics. Extensive simulation studies demonstrate performance improvement over the original CFO design, emphasizing the advantages of incorporating information from a broader range of dose levels. Furthermore, we incorporate late-onset outcomes into the TITE-aCFO and f-aCFO designs, with f-aCFO displaying superior performance over existing methods in both fixed and random simulation scenarios. In conclusion, the aCFO and f-aCFO designs are robust, efficient, and user-friendly approaches for conducting phase I trials with or without late-onset toxicity., (© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.)
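For intuition about the fractional approach: a patient still in follow-up at time $t_i$ with no toxicity yet (assessment window $\tau$) has the pending binary outcome replaced by its conditional expectation. Under an assumed or estimated time-to-toxicity distribution $F$, a generic form (the abstract does not spell out the exact estimator) is

$$ \tilde{y}_i = \Pr(T_i \le \tau \mid T_i > t_i) = \frac{F(\tau) - F(t_i)}{1 - F(t_i)}, $$

so patients followed longer without toxicity contribute smaller fractional toxicity outcomes to the dose-assignment statistics.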
- Published
- 2024
- Full Text
- View/download PDF