Search Results (4,230 results)
2. Four papers on child growth modelling
- Author
- Louise Ryan
- Subjects
- Statistics and Probability, Medical education, Special Issue Paper, Epidemiology, Global health, MEDLINE, Child growth, Psychology, Child health, Introductory Journal Article
- Published
- 2019
3. Reflections on the 1962 Paper “The Statistician in Medicine” by Sir Austin Bradford Hill
- Author
- Lin, Xihong
- Published
- 2020
- Full Text
- View/download PDF
4. Four papers on child growth modelling.
- Author
- Ryan L
- Subjects
- Child, Child Health, Global Health, Humans, Child Development physiology, Growth physiology
- Published
- 2019
- Full Text
- View/download PDF
5. On the paper 'Notes on the overlap measure as an alternative to the Youden index'
- Author
- Pablo Martinez Camblor and SMABSS RG
- Subjects
- Statistics and Probability, Epidemiology, Statistics, Youden's J statistic, Measure (physics), Mathematics
- Published
- 2018
- Full Text
- View/download PDF
6. Comments on the three papers by the FDA/CDER research team on the regulatory perspective of the missing data problem
- Author
- Weichung Joe Shih
- Subjects
- Statistics and Probability, Research Design, United States Food and Drug Administration, Epidemiology, Research, Statistics as Topic, Perspective (graphical), Missing data problem, Data science, United States, Estimand, Data Interpretation, Statistical, Humans, Psychology
- Abstract
This communication comments on the three papers by the FDA/CDER research team on the regulatory perspective of the missing data problem. The focus is on two topics: causal estimand and sensitivity analysis. Copyright © 2016 John Wiley & Sons, Ltd.
- Published
- 2016
- Full Text
- View/download PDF
7. On the paper “Notes on the overlap measure as an alternative to the Youden index”
- Author
- Martínez-Camblor, P.
- Published
- 2018
- Full Text
- View/download PDF
8. Comments on the three papers by the FDA/CDER research team on the regulatory perspective of the missing data problem
- Author
- Shih, Weichung Joe
- Published
- 2016
- Full Text
- View/download PDF
9. Reflections on the 1962 Paper "The Statistician in Medicine" by Sir Austin Bradford Hill.
- Author
- Lin, Xihong
- Subjects
- CAUSAL inference, STATISTICIANS, DATA science, BIG data
- Abstract
This article provides reflections on the 1962 paper by Sir Austin Bradford Hill, entitled "The Statistician in Medicine." It discusses several key takeaways of this paper, including causal inference for big data, reproducibility and replicability in science, and integration of statistics and data science with domain science. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
10. Design and other methodological considerations for the construction of human fetal and neonatal size and growth charts
- Author
- Douglas G. Altman and Eric O Ohuma
- Subjects
- Male, Statistics and Probability, Epidemiology, Computer science, design, Gestational Age, Fetal Development, neonatal, Chart, Pregnancy, Special Issue Paper, Humans, Longitudinal Studies, growth charts, Estimation, Management science, Infant, Newborn, Quality control, Anthropometry, fetal, Identification (information), Cross-Sectional Studies, Systematic review, Research Design, methodological considerations, Sample size determination, Sample Size, Inclusion and exclusion criteria, Female
- Abstract
This paper discusses the features of study design and methodological considerations for constructing reference centile charts for attained size, growth, and velocity charts with a focus on human growth charts used during pregnancy. Recent systematic reviews of pregnancy dating, fetal size, and newborn size charts showed that many studies aimed at constructing charts are still conducted poorly. Important design features such as inclusion and exclusion criteria, ultrasound quality control measures, sample size determination, anthropometric evaluation, gestational age estimation, assessment of outliers, and chart presentation are seldom well addressed, considered, or reported. Many of these charts are in clinical use today and directly affect the identification of at‐risk newborns that require treatment and nutritional strategies. This paper therefore reiterates some of the concepts previously identified as important for growth studies, focusing on considerations and concepts related to study design, sample size, and methodological considerations with an aim of obtaining valid reference or standard centile charts. We discuss some of the key issues and provide more details and practical examples based on our experiences from the INTERGROWTH‐21st Project. We discuss the statistical methodology and analyses for cross‐sectional studies and longitudinal studies in a separate article in this issue.
- Published
- 2018
- Full Text
- View/download PDF
11. Statistical methodology for constructing gestational age‐related charts using cross‐sectional and longitudinal data: The INTERGROWTH‐21st project as a case study
- Author
- Eric O Ohuma and Douglas G. Altman
- Subjects
- Statistics and Probability, Special Issue Paper, longitudinal, Epidemiology, Computer science, Longitudinal data, Gestational age, Statistical model, Regression, Statistics, statistical methodology, Intergrowth 21st, Fetal head, Model choice, cross-sectional, Raw data, human growth
- Abstract
Most studies aiming to construct reference or standard charts use a cross‐sectional design, collecting one measurement per participant. Reference or standard charts can also be constructed using a longitudinal design, collecting multiple measurements per participant. The choice of appropriate statistical methodology is important as inaccurate centiles resulting from inferior methods can lead to incorrect judgements about fetal or newborn size, resulting in suboptimal clinical care. Reference or standard centiles should ideally provide the best fit to the data, change smoothly with age (eg, gestational age), use as simple a statistical model as possible without compromising model fit, and allow the computation of Z‐scores from centiles to simplify assessment of individuals and enable comparison with different populations. Significance testing and goodness‐of‐fit statistics are usually used to discriminate between models. However, these methods tend not to be useful when examining large data sets as very small differences are statistically significant even if the models are indistinguishable on actual centile plots. Choosing the best model from amongst many is therefore not trivial. Model choice should not be based on statistical considerations (or tests) alone as sometimes the best model may not necessarily offer the best fit to the raw data across gestational age. In this paper, we describe the most commonly applied methodologies available for the construction of age‐specific reference or standard centiles for cross‐sectional and longitudinal data: Fractional polynomial regression, LMS, LMST, LMSP, and multilevel regression methods. For illustration, we used data from the INTERGROWTH‐21st Project, ie, newborn weight (cross‐sectional) and fetal head circumference (longitudinal) data as examples.
- Published
- 2018
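The entry above surveys methods for gestational-age-specific centiles (fractional polynomials, the LMS family, multilevel regression). As a rough illustration of the simplest underlying idea only, and not of the LMS or fractional-polynomial machinery the paper reviews, the sketch below fits polynomial models for the age-specific mean and SD of a measurement and converts them to centiles; the simulated data and function names are hypothetical.

```python
# Minimal mean-and-SD centile sketch (illustrative only; the paper reviews
# fractional polynomials, LMS/LMST/LMSP and multilevel models instead).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical cross-sectional data: gestational age (weeks) and a size measure.
ga = rng.uniform(24, 42, size=2000)
size = 100 + 8 * (ga - 24) + rng.normal(0, 2 + 0.2 * (ga - 24), size=ga.size)

# Model the mean as a quadratic in gestational age.
mean_coef = np.polyfit(ga, size, deg=2)
resid = size - np.polyval(mean_coef, ga)

# Model the residual SD as a linear function of age via |residual| regression
# (E|Z| = sigma * sqrt(2/pi) for a normal residual).
sd_coef = np.polyfit(ga, np.abs(resid) * np.sqrt(np.pi / 2), deg=1)

def centile(ga_weeks, p):
    """Estimated p-th centile at a given gestational age, assuming normal residuals."""
    mu = np.polyval(mean_coef, ga_weeks)
    sigma = np.polyval(sd_coef, ga_weeks)
    return mu + stats.norm.ppf(p) * sigma

for p in (0.03, 0.50, 0.97):
    print(f"{100*p:.0f}th centile at 36 weeks: {centile(36.0, p):.1f}")
```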
12. Estimation of non‐monotonic transition rates in a semi‐Markov process with covariates adjustments and application to caregivers' stress data.
- Author
- Ngan, Esther, Chan, Wenyaw, Leon‐Novelo, Luis, and Pavlik, Valory
- Subjects
- CAREGIVERS, DISTRIBUTION (Probability theory), ALZHEIMER'S disease, OLDER people, HAZARD function (Statistics)
- Abstract
With the large ongoing number of aged people and Alzheimer's disease (AD) patients worldwide, unpaid caregivers have become the primary sources of their daily caregiving. Alzheimer's family caregivers often suffer from physical and mental morbidities owing to various reasons. The aims of this paper were to develop alternate methods to understand the transition properties, the dynamic change, and the long‐run behavior of AD caregivers' stress levels, by assuming their transition to the next level only depends on the duration of the current stress level. In this paper, we modeled the transition rates in the semi‐Markov Process with log‐logistic hazard functions. We assumed the transition rates were non‐monotonic over time and the scale of transition rates depended on covariates. We also extended the uniform accelerated expansion to calculate the long‐run probability distribution of stress levels while adjusting for multiple covariates. The proposed methods were evaluated through an empirical study. The application results showed that all the transition rates of caregivers' stress levels were right skewed. Care recipients' baseline age was significantly associated with the transitions. The long‐run probability of severe state was slightly higher, implying a prolonged recovery time for severe stress patients. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
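The abstract above models semi-Markov transition rates with log-logistic hazards, which can rise and then fall with time in the current state. A minimal sketch of that non-monotonic hazard shape (not the authors' covariate-adjusted estimator) is given below; the parameter values are made up.

```python
# Log-logistic hazard h(t) = (b/a) * (t/a)**(b-1) / (1 + (t/a)**b),
# which is non-monotonic (rises then falls) whenever the shape b > 1.
import numpy as np

def loglogistic_hazard(t, scale, shape):
    """Hazard of a log-logistic distribution with scale a and shape b."""
    z = (t / scale) ** shape
    return (shape / scale) * (t / scale) ** (shape - 1) / (1.0 + z)

t = np.linspace(0.1, 10, 50)
h = loglogistic_hazard(t, scale=2.0, shape=2.5)   # hypothetical parameters
print("hazard peaks near t =", t[np.argmax(h)])
```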
13. Inference on tree‐structured subgroups with subgroup size and subgroup effect relationship in clinical trials.
- Author
- Luo, Yuanhui and Guo, Xinzhou
- Subjects
- CLINICAL trials, PANITUMUMAB, INFERENTIAL statistics
- Abstract
When multiple candidate subgroups are considered in clinical trials, we often need to make statistical inference on the subgroups simultaneously. Classical multiple testing procedures might not lead to an interpretable and efficient inference on the subgroups as they often fail to take subgroup size and subgroup effect relationship into account. In this paper, built on the selective traversed accumulation rules (STAR), we propose a data‐adaptive and interactive multiple testing procedure for subgroups which can take subgroup size and subgroup effect relationship into account under prespecified tree structure. The proposed method is easy‐to‐implement and can lead to a more interpretable and efficient inference on prespecified tree‐structured subgroups. Possible accommodations to post hoc identified tree‐structure subgroups are also discussed in the paper. We demonstrate the merit of our proposed method by re‐analyzing the panitumumab trial with the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
14. The person‐time ratio distribution for the exact monitoring of adverse events: Historical vs surveillance Poisson data.
- Author
- Silva, Ivair R. and Montalban, Joselito
- Subjects
- DISTRIBUTION (Probability theory), VACCINE safety, POISSON distribution, STATISTICAL hypothesis testing, RANDOM variables
- Abstract
In the postmarket drug and vaccine safety surveillance, when the number of adverse events follows a Poisson distribution, the ratio between the exposed and the unexposed person‐time information is the random variable that governs the decision rule about the safety of the drug or vaccine. The probability distribution function of such a ratio is derived in this paper. Exact point and interval estimators for the relative risk are discussed as well as statistical hypothesis testing. To the best of our knowledge, this is the first paper that provides an unbiased estimator for the relative risk based on the person‐time ratio. The applicability of this new distribution is illustrated through a real data analysis aimed to detect increased risk of occurrence of Myocarditis/Pericarditis following mRNA COVID‐19 vaccination in Manitoba, Canada. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
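For Poisson-distributed event counts, conditioning on the total number of events turns the exposed count into a binomial whose success probability depends only on the relative risk and the person-time split. The sketch below uses that standard conditional-binomial route, not the paper's person-time-ratio distribution or its unbiased estimator, and the counts are made up.

```python
# Conditional-binomial sketch for a relative risk from Poisson person-time data:
# X1 | X1 + X0 = n  ~  Binomial(n, p) with p = RR*T1 / (RR*T1 + T0),
# so RR = (p / (1 - p)) * (T0 / T1).
from scipy.stats import binomtest

x1, t1 = 30, 12000.0   # hypothetical events and person-time, exposed
x0, t0 = 18, 15000.0   # hypothetical events and person-time, unexposed

n = x1 + x0
test = binomtest(x1, n, p=t1 / (t1 + t0))          # null hypothesis: RR = 1
ci_p = test.proportion_ci(confidence_level=0.95, method="exact")

def rr_from_p(p):
    return (p / (1 - p)) * (t0 / t1)

print("RR estimate:", round(rr_from_p(x1 / n), 3))
print("95% CI:", (round(rr_from_p(ci_p.low), 3), round(rr_from_p(ci_p.high), 3)))
print("two-sided p-value for RR = 1:", round(test.pvalue, 4))
```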
15. Joint modelling of longitudinal and survival data: incorporating delayed entry and an assessment of model misspecification
- Author
- Keith R. Abrams, Therese M.-L. Andersson, Keith Humphreys, Michael J. Crowther, and Paul C. Lambert
- Subjects
- Statistics and Probability, Mixed model, delayed entry, Epidemiology, Computer science, Computation, Breast Neoplasms, Biostatistics, adaptive Gauss–Hermite quadrature, Special Issue Paper, mixed effects, Econometrics, Humans, Computer Simulation, Longitudinal Studies, Breast Density, Proportional Hazards Models, Parametric statistics, Likelihood Functions, Models, Statistical, joint modelling, Random effects model, Survival Analysis, Quadrature (mathematics), Numerical integration, left truncation, Female, Adaptive quadrature, Algorithm
- Abstract
A now common goal in medical research is to investigate the inter‐relationships between a repeatedly measured biomarker, measured with error, and the time to an event of interest. This form of question can be tackled with a joint longitudinal‐survival model, with the most common approach combining a longitudinal mixed effects model with a proportional hazards survival model, where the models are linked through shared random effects. In this article, we look at incorporating delayed entry (left truncation), which has received relatively little attention. The extension to delayed entry requires a second set of numerical integration, beyond that required in a standard joint model. We therefore implement two sets of fully adaptive Gauss–Hermite quadrature with nested Gauss–Kronrod quadrature (to allow time‐dependent association structures), conducted simultaneously, to evaluate the likelihood. We evaluate fully adaptive quadrature compared with previously proposed non‐adaptive quadrature through a simulation study, showing substantial improvements, both in terms of minimising bias and reducing computation time. We further investigate, through simulation, the consequences of misspecifying the longitudinal trajectory and its impact on estimates of association. Our scenarios showed the current value association structure to be very robust, compared with the rate of change that we found to be highly sensitive showing that assuming a simpler trend when the truth is more complex can lead to substantial bias. With emphasis on flexible parametric approaches, we generalise previous models by proposing the use of polynomials or splines to capture the longitudinal trend and restricted cubic splines to model the baseline log hazard function. The methods are illustrated on a dataset of breast cancer patients, modelling mammographic density jointly with survival, where we show how to incorporate density measurements prior to the at‐risk period, to make use of all the available information. User‐friendly Stata software is provided. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
- Published
- 2015
- Full Text
- View/download PDF
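The joint model above evaluates its likelihood with adaptive Gauss–Hermite quadrature over the random effects. The sketch below shows only the basic (non-adaptive) Gauss–Hermite rule for integrating a function against a normal random-effect density, as a reminder of the building block; it is not the authors' nested adaptive scheme or their Stata implementation.

```python
# Gauss-Hermite quadrature for E[f(b)] with b ~ N(0, sigma^2):
# integral f(b) phi(b; 0, sigma^2) db ~= sum_i w_i/sqrt(pi) * f(sqrt(2)*sigma*x_i),
# where (x_i, w_i) are nodes/weights for the weight function exp(-x^2).
import numpy as np

def gh_expectation(f, sigma, n_nodes=15):
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    return np.sum(weights / np.sqrt(np.pi) * f(np.sqrt(2.0) * sigma * nodes))

# Check against a case with a known answer: E[exp(b)] = exp(sigma^2 / 2).
sigma = 0.8
approx = gh_expectation(np.exp, sigma)
exact = np.exp(sigma**2 / 2)
print(f"quadrature: {approx:.6f}  exact: {exact:.6f}")
```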
16. Review and evaluation of penalised regression methods for risk prediction in low‐dimensional data with few events
- Author
- Shaun R. Seaman, Menelaos Pavlou, Maria De Iorio, Gareth Ambler, and Rumana Z Omar
- Subjects
- Male, Statistics and Probability, Elastic net regularization, Overfitting, Epidemiology, Computer science, Biostatistics, Bayesian regularisation, Machine learning, Logistic regression, Bias, Lasso (statistics), Risk Factors, Special Issue Paper, Statistics, Linear regression, Prior probability, Rare events, Humans, Computer Simulation, Penile Neoplasms, Likelihood Functions, Models, Statistical, Bayes Theorem, Regression analysis, Prognosis, shrinkage, Logistic Models, Data Interpretation, Statistical, Artificial intelligence
- Abstract
Risk prediction models are used to predict a clinical outcome for patients using a set of predictors. We focus on predicting low‐dimensional binary outcomes typically arising in epidemiology, health services and public health research where logistic regression is commonly used. When the number of events is small compared with the number of regression coefficients, model overfitting can be a serious problem. An overfitted model tends to demonstrate poor predictive accuracy when applied to new data. We review frequentist and Bayesian shrinkage methods that may alleviate overfitting by shrinking the regression coefficients towards zero (some methods can also provide more parsimonious models by omitting some predictors). We evaluated their predictive performance in comparison with maximum likelihood estimation using real and simulated data. The simulation study showed that maximum likelihood estimation tends to produce overfitted models with poor predictive performance in scenarios with few events, and penalised methods can offer improvement. Ridge regression performed well, except in scenarios with many noise predictors. Lasso performed better than ridge in scenarios with many noise predictors and worse in the presence of correlated predictors. Elastic net, a hybrid of the two, performed well in all scenarios. Adaptive lasso and smoothly clipped absolute deviation performed best in scenarios with many noise predictors; in other scenarios, their performance was inferior to that of ridge and lasso. Bayesian approaches performed well when the hyperparameters for the priors were chosen carefully. Their use may aid variable selection, and they can be easily extended to clustered‐data settings and to incorporate external information. © 2015 The Authors. Statistics in Medicine Published by JohnWiley & Sons Ltd.
- Published
- 2015
- Full Text
- View/download PDF
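The review above compares ridge, lasso, elastic net and Bayesian shrinkage against maximum likelihood for logistic risk models with few events. A minimal scikit-learn sketch of the three frequentist penalties on simulated low-dimensional data with an imbalanced outcome is shown below; it mirrors the general idea only, not the authors' simulation design, and the penalty strengths would normally be chosen by cross-validation.

```python
# Unpenalised vs ridge (L2), lasso (L1) and elastic-net logistic regression.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           weights=[0.9, 0.1], random_state=1)  # few events
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)

models = {
    "mle":   LogisticRegression(penalty=None, max_iter=5000),
    "ridge": LogisticRegression(penalty="l2", C=0.5, max_iter=5000),
    "lasso": LogisticRegression(penalty="l1", C=0.5, solver="saga", max_iter=5000),
    "enet":  LogisticRegression(penalty="elasticnet", C=0.5, l1_ratio=0.5,
                                solver="saga", max_iter=5000),
}
for name, m in models.items():
    m.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, m.predict_proba(X_te)[:, 1])
    print(f"{name:6s} AUC = {auc:.3f}  non-zero coefs = {np.sum(m.coef_ != 0)}")
```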
17. Mastering variation: variance components and personalised medicine
- Author
- Stephen Senn
- Subjects
- Statistics and Probability, Epidemiology, cross-over trials, Biostatistics, Forced Expiratory Volume, n-of-1 trials, Special Issue Paper, Computer Graphics, Humans, Medicine, Computer Simulation, random effects, Precision Medicine, Randomized Controlled Trials as Topic, Tonsillectomy, Analysis of Variance, Clinical Trials as Topic, Cross-Over Studies, personalised medicine, Response to treatment, Asthma, Clinical trial, components of variation, Variance components, Medical literature
- Abstract
Various sources of variation in observed response in clinical trials and clinical practice are considered, and ways in which the corresponding components of variation might be estimated are discussed. Although the issues have been generally well‐covered in the statistical literature, they seem to be poorly understood in the medical literature and even the statistical literature occasionally shows some confusion. To increase understanding and communication, some simple graphical approaches to illustrating issues are proposed. It is also suggested that reducing variation in medical practice might make as big a contribution to improving health outcome as personalising its delivery according to the patient. It is concluded that the common belief that there is a strong personal element in response to treatment is not based on sound statistical evidence. © 2015 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
- Published
- 2015
- Full Text
- View/download PDF
18. Confidence distributions for treatment effects in clinical trials: Posteriors without priors.
- Author
- Marschner, Ian C.
- Subjects
- CLINICAL trials, TREATMENT effectiveness, DISTRIBUTION (Probability theory), FREQUENTIST statistics, BAYESIAN analysis
- Abstract
An attractive feature of using a Bayesian analysis for a clinical trial is that knowledge and uncertainty about the treatment effect is summarized in a posterior probability distribution. Researchers often find probability statements about treatment effects highly intuitive and the fact that this is not accommodated in frequentist inference is a disadvantage. At the same time, the requirement to specify a prior distribution in order to obtain a posterior distribution is sometimes an artificial process that may introduce subjectivity or complexity into the analysis. This paper considers a compromise involving confidence distributions, which are probability distributions that summarize uncertainty about the treatment effect without the need for a prior distribution and in a way that is fully compatible with frequentist inference. The concept of a confidence distribution provides a posterior–like probability distribution that is distinct from, but exists in tandem with, the relative frequency interpretation of probability used in frequentist inference. Although they have been discussed for decades, confidence distributions are not well known among clinical trial statisticians and the goal of this paper is to discuss their use in analyzing treatment effects from randomized trials. As well as providing an introduction to confidence distributions, some illustrative examples relevant to clinical trials are presented, along with various case studies based on real clinical trials. It is recommended that trial statisticians consider presenting confidence distributions for treatment effects when reporting analyses of clinical trials. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
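For a treatment effect estimate that is approximately normal, a confidence distribution of the kind discussed above can be taken as the normal CDF centred at the estimate with the standard error as spread, so that its quantiles reproduce confidence limits and probability-style statements about the effect can be read off without a prior. A minimal sketch with a made-up estimate follows; it is not tied to any of the paper's case studies.

```python
# Normal-approximation confidence distribution for a treatment effect theta:
# H(theta) = Phi((theta - theta_hat) / se). Its quantiles are confidence limits,
# and H(0) summarises the confidence assigned to effects below zero.
from scipy.stats import norm

theta_hat, se = -0.35, 0.15   # hypothetical log hazard ratio and standard error

def confidence_cdf(theta):
    return norm.cdf((theta - theta_hat) / se)

lower95 = theta_hat + norm.ppf(0.025) * se
upper95 = theta_hat + norm.ppf(0.975) * se
print("95% interval from the confidence distribution:",
      (round(lower95, 3), round(upper95, 3)))
print("confidence that the effect is below 0 (benefit):",
      round(confidence_cdf(0.0), 3))
```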
19. Model misspecification and robust analysis for outcome‐dependent sampling designs under generalized linear models.
- Author
- Maronge, Jacob M., Schildcrout, Jonathan S., and Rathouz, Paul J.
- Subjects
- PARAMETRIC modeling
- Abstract
Outcome‐dependent sampling (ODS) is a commonly used class of sampling designs to increase estimation efficiency in settings where response information (and possibly adjuster covariates) is available, but the exposure is expensive and/or cumbersome to collect. We focus on ODS within the context of a two‐phase study, where in Phase One the response and adjuster covariate information is collected on a large cohort that is representative of the target population, but the expensive exposure variable is not yet measured. In Phase Two, using response information from Phase One, we selectively oversample a subset of informative subjects in whom we collect expensive exposure information. Importantly, the Phase Two sample is no longer representative, and we must use ascertainment‐correcting analysis procedures for valid inferences. In this paper, we focus on likelihood‐based analysis procedures, particularly a conditional‐likelihood approach and a full‐likelihood approach. Whereas the full‐likelihood retains incomplete Phase One data for subjects not selected into Phase Two, the conditional‐likelihood explicitly conditions on Phase Two sample selection (ie, it is a "complete case" analysis procedure). These designs and analysis procedures are typically implemented assuming a known, parametric model for the response distribution. However, in this paper, we approach analyses implementing a novel semi‐parametric extension to generalized linear models (SPGLM) to develop likelihood‐based procedures with improved robustness to misspecification of distributional assumptions. We specifically focus on the common setting where standard GLM distributional assumptions are not satisfied (eg, misspecified mean/variance relationship). We aim to provide practical design guidance and flexible tools for practitioners in these settings. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
20. Response‐Adaptive Randomization Procedure in Clinical Trials with Surrogate Endpoints.
- Author
- Gao, Jingya, Hu, Feifang, and Ma, Wei
- Subjects
- FALSE positive error, ASYMPTOTIC normality, INFERENTIAL statistics, ERROR rates, CLINICAL trials
- Abstract
In clinical trials, subjects are usually recruited sequentially. According to the outcomes amassed thus far in a trial, the response‐adaptive randomization (RAR) design has been shown to be an advantageous treatment assignment procedure that skews the treatment allocation proportion to pre‐specified objectives, such as sending more patients to a more promising treatment. Unfortunately, there are circumstances under which very few data of the primary endpoints are collected in the recruitment period, such as circumstances relating to public health emergencies and chronic diseases, and RAR is thus difficult to apply in allocating treatments using available outcomes. To overcome this problem, if an informative surrogate endpoint can be acquired much earlier than the primary endpoint, the surrogate endpoint can be used as a substitute for the primary endpoint in the RAR procedure. In this paper, we propose an RAR procedure that relies only on surrogate endpoints. The validity of the statistical inference on the primary endpoint and the patient benefit of this approach are justified by both theory and simulation. Furthermore, different types of surrogate endpoint and primary endpoint are considered. The results reassure that RAR with surrogate endpoints can be a viable option in some cases for clinical trials when primary endpoints are unavailable for adaptation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
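The entry above adapts treatment allocation using surrogate outcomes when the primary endpoint is slow to observe. As an illustration of the general idea only (a generic square-root-type allocation rule fed with surrogate response rates, not the authors' specific RAR procedure or its inference), consider the sketch below; all counts are invented.

```python
# Generic response-adaptive randomization sketch driven by surrogate responses:
# allocate the next patient to arm A with probability
#   sqrt(pA_hat) / (sqrt(pA_hat) + sqrt(pB_hat)),
# a common allocation target for binary outcomes, here computed from surrogates.
import numpy as np

rng = np.random.default_rng(42)

def next_allocation_prob(successes_a, n_a, successes_b, n_b):
    # Add-one smoothing so early, data-poor stages do not give 0/1 probabilities.
    p_a = (successes_a + 1) / (n_a + 2)
    p_b = (successes_b + 1) / (n_b + 2)
    return np.sqrt(p_a) / (np.sqrt(p_a) + np.sqrt(p_b))

# Hypothetical interim surrogate data: arm A looks more promising than arm B.
prob_a = next_allocation_prob(successes_a=12, n_a=20, successes_b=7, n_b=20)
assignment = "A" if rng.random() < prob_a else "B"
print(f"P(assign next patient to A) = {prob_a:.3f}; assigned arm {assignment}")
```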
21. A Bayesian Approach to Modeling Variance of Intensive Longitudinal Biomarker Data as a Predictor of Health Outcomes.
- Author
- Yu, Mingyan, Wu, Zhenke, Hicken, Margaret, and Elliott, Michael R.
- Subjects
- HEART beat, BIOMARKERS, INFORMATION sharing, LONGITUDINAL method, NUISANCES
- Abstract
Intensive longitudinal biomarker data are increasingly common in scientific studies that seek temporally granular understanding of the role of behavioral and physiological factors in relation to outcomes of interest. Intensive longitudinal biomarker data, such as those obtained from wearable devices, are often obtained at a high frequency typically resulting in several hundred to thousand observations per individual measured over minutes, hours, or days. Often in longitudinal studies, the primary focus is on relating the means of biomarker trajectories to an outcome, and the variances are treated as nuisance parameters, although they may also be informative for the outcomes. In this paper, we propose a Bayesian hierarchical model to jointly model a cross‐sectional outcome and the intensive longitudinal biomarkers. To model the variability of biomarkers and deal with the high intensity of data, we develop subject‐level cubic B‐splines and allow the sharing of information across individuals for both the residual variability and the random effects variability. Then different levels of variability are extracted and incorporated into an outcome submodel for inferential and predictive purposes. We demonstrate the utility of the proposed model via an application involving bio‐monitoring of hertz‐level heart rate information from a study on social stress. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. A Partially Randomized Patient Preference, Sequential, Multiple‐Assignment, Randomized Trial Design Analyzed via Weighted and Replicated Frequentist and Bayesian Methods.
- Author
- Wank, Marianthie, Medley, Sarah, Tamura, Roy N., Braun, Thomas M., and Kidwell, Kelley M.
- Subjects
- EXPERIMENTAL design, PATIENT preferences, REGRESSION analysis, MARKOV chain Monte Carlo, CLINICAL trials
- Abstract
Results from randomized control trials (RCTs) may not be representative when individuals refuse to be randomized or are excluded for having a preference for which treatment they receive. If trial designs do not allow for participant treatment preferences, trials can suffer in accrual, adherence, retention, and external validity of results. Thus, there is interest surrounding clinical trial designs that incorporate participant treatment preferences. We propose a Partially Randomized, Patient Preference, Sequential, Multiple Assignment, Randomized Trial (PRPP‐SMART) which combines a Partially Randomized, Patient Preference (PRPP) design with a Sequential, Multiple Assignment, Randomized Trial (SMART) design. This novel PRPP‐SMART design is a multi‐stage clinical trial design where, at each stage, participants either receive their preferred treatment, or if they do not have a preferred treatment, they are randomized. This paper focuses on the clinical trial design for PRPP‐SMARTs and the development of Bayesian and frequentist weighted and replicated regression models (WRRMs) to analyze data from such trials. We propose a two‐stage PRPP‐SMART with binary end of stage outcomes and estimate the embedded dynamic treatment regimes (DTRs). Our WRRMs use data from both randomized and non‐randomized participants for efficient estimation of the DTR effects. We compare our method to a more traditional PRPP analysis which only considers participants randomized to treatment. Our Bayesian and frequentist methods produce more efficient DTR estimates with negligible bias despite the inclusion of non‐randomized participants in the analysis. The proposed PRPP‐SMART design and analytic method is a promising approach to incorporate participant treatment preferences into clinical trial design. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
23. Statistical Inference for Box–Cox based Receiver Operating Characteristic Curves.
- Author
- Bantis, Leonidas E., Brewer, Benjamin, Nakas, Christos T., and Reiser, Benjamin
- Subjects
- RECEIVER operating characteristic curves, INFERENTIAL statistics, ACCOUNTING methods, SENSITIVITY & specificity (Statistics), DIAGNOSIS methods
- Abstract
Receiver operating characteristic (ROC) curve analysis is widely used in evaluating the effectiveness of a diagnostic test/biomarker or classifier score. A parametric approach for statistical inference on ROC curves based on a Box–Cox transformation to normality has frequently been discussed in the literature. Many investigators have highlighted the difficulty of taking into account the variability of the estimated transformation parameter when carrying out such an analysis. This variability is often ignored and inferences are made by considering the estimated transformation parameter as fixed and known. In this paper, we will review the literature discussing the use of the Box–Cox transformation for ROC curves and the methodology for accounting for the estimation of the Box–Cox transformation parameter in the context of ROC analysis, and detail its application to a number of problems. We present a general framework for inference on any functional of interest, including common measures such as the AUC, the Youden index, and the sensitivity at a given specificity (and vice versa). We further developed a new R package (named 'rocbc') that carries out all discussed approaches and is available in CRAN. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
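The entry above studies ROC inference after a Box–Cox transformation to normality and stresses that the variability of the estimated transformation parameter is usually ignored. The sketch below shows only that commonly criticised plug-in step, crudely estimating a single lambda from the pooled scores with SciPy and computing the binormal AUC and Youden index as if lambda were known; the corrections for estimating lambda (and the 'rocbc' R package) are what the paper itself provides, and the data are simulated.

```python
# Plug-in Box-Cox ROC sketch: transform both groups with one estimated lambda,
# then use binormal formulas for AUC and the Youden index. This ignores the
# variability of lambda-hat, which is exactly the issue the paper addresses.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
healthy = rng.lognormal(mean=0.0, sigma=0.5, size=200)   # hypothetical biomarker
diseased = rng.lognormal(mean=0.7, sigma=0.5, size=150)

# Crude simplification: estimate a single lambda from the pooled scores.
_, lam = stats.boxcox(np.concatenate([healthy, diseased]))
h = stats.boxcox(healthy, lmbda=lam)
d = stats.boxcox(diseased, lmbda=lam)

mu0, s0 = h.mean(), h.std(ddof=1)
mu1, s1 = d.mean(), d.std(ddof=1)

auc = stats.norm.cdf((mu1 - mu0) / np.hypot(s0, s1))      # binormal AUC

# Youden index J = max_c {sensitivity(c) + specificity(c) - 1} over a cutoff grid.
cuts = np.linspace(min(mu0 - 4 * s0, mu1 - 4 * s1),
                   max(mu0 + 4 * s0, mu1 + 4 * s1), 2001)
j = np.max(stats.norm.sf(cuts, mu1, s1) + stats.norm.cdf(cuts, mu0, s0) - 1)
print(f"lambda = {lam:.2f}, binormal AUC = {auc:.3f}, Youden J = {j:.3f}")
```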
24. Improving Survey Inference Using Administrative Records Without Releasing Individual‐Level Continuous Data.
- Author
- Williams, Sharifa Z., Zou, Jungang, Liu, Yutao, Si, Yajuan, Galea, Sandro, and Chen, Qixuan
- Subjects
- INFERENTIAL statistics, ESTIMATION bias, STATISTICAL sampling, NONRESPONSE (Statistics), MILITARY reserve forces
- Abstract
Probability surveys are challenged by increasing nonresponse rates, resulting in biased statistical inference. Auxiliary information about populations can be used to reduce bias in estimation. Often continuous auxiliary variables in administrative records are first discretized before releasing to the public to avoid confidentiality breaches. This may weaken the utility of the administrative records in improving survey estimates, particularly when there is a strong relationship between continuous auxiliary information and the survey outcome. In this paper, we propose a two‐step strategy, where the confidential continuous auxiliary data in the population are first utilized to estimate the response propensity score of the survey sample by statistical agencies, which is then included in a modified population data for data users. In the second step, data users who do not have access to confidential continuous auxiliary data conduct predictive survey inference by including discretized continuous variables and the propensity score as predictors using splines in a Bayesian model. We show by simulation that the proposed method performs well, yielding more efficient estimates of population means with 95% credible intervals providing better coverage than alternative approaches. We illustrate the proposed method using the Ohio Army National Guard Mental Health Initiative (OHARNG‐MHI). The methods developed in this work are readily available in the R package AuxSurvey. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
25. Nonparametric Estimation for Propensity Scores With Misclassified Treatments.
- Author
- Chen, Li‐Pang
- Subjects
- MEASUREMENT errors, NONPARAMETRIC estimation, CAUSAL inference, RANDOM forest algorithms, TREATMENT effectiveness
- Abstract
ABSTRACT In the framework of causal inference, average treatment effect (ATE) is one of crucial concerns. To estimate it, the propensity score based estimation method and its variants have been widely adopted. However, most existing methods were developed by assuming that binary treatments are precisely measured. In addition, propensity scores are usually formulated as parametric models with respect to confounders. However, in the presence of measurement error in binary treatments and nonlinear relationship between treatments and confounders, existing methods are no longer valid and may yield biased inference results if these features are ignored. In this paper, we first analytically examine the impact of estimation of ATE and derive biases for the estimator of ATE when treatments are contaminated with measurement error. After that, we develop a valid method to address binary treatments with misclassification. Given the corrected treatments, we adopt the random forest method to estimate the propensity score with nonlinear confounders accommodated and then derive the estimator of ATE. Asymptotic properties of the error‐eliminated estimator are established. Numerical studies are also conducted to assess the finite sample performance of the proposed estimator, and numerical results verify the importance of correcting for measurement error effects. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
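The entry above corrects misclassified treatments before fitting a random-forest propensity model and estimating the ATE. The sketch below shows only the last two steps on error-free simulated data (random-forest propensity scores plugged into a standard inverse-probability-weighted ATE); the misclassification correction, which is the paper's contribution, is not reproduced.

```python
# Random-forest propensity scores + inverse-probability-weighted ATE
# (no measurement-error correction; treatments are assumed correctly observed).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(3)
n = 4000
x = rng.normal(size=(n, 3))                                        # confounders
p_treat = 1 / (1 + np.exp(-(0.8 * x[:, 0] - 0.5 * x[:, 1] ** 2)))  # nonlinear
t = rng.binomial(1, p_treat)
y = 1.0 * t + x[:, 0] + 0.5 * x[:, 2] + rng.normal(size=n)         # true ATE = 1

rf = RandomForestClassifier(n_estimators=300, min_samples_leaf=25, random_state=0)
e = rf.fit(x, t).predict_proba(x)[:, 1]
e = np.clip(e, 0.01, 0.99)                                         # avoid extreme weights

ate_ipw = np.mean(t * y / e - (1 - t) * y / (1 - e))
print(f"IPW ATE estimate: {ate_ipw:.3f} (truth 1.0)")
```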
26. New Quadratic Discriminant Analysis Algorithms for Correlated Audiometric Data.
- Author
- Guo, Fuyu, Zucker, David M., Vaden, Kenneth I., Curhan, Sharon, Dubno, Judy R., and Wang, Molin
- Subjects
- DISCRIMINANT analysis, HEARING impaired, HEARING disorders, PHENOTYPES, LUNGS
- Abstract
Paired organs like eyes, ears, and lungs in humans exhibit similarities, and data from these organs often display remarkable correlations. Accounting for these correlations could enhance classification models used in predicting disease phenotypes. To our knowledge, there is limited, if any, literature addressing this topic, and existing methods do not exploit such correlations. For example, the conventional approach treats each ear as an independent observation when predicting audiometric phenotypes and is agnostic about the correlation of data from the two ears of the same person. This approach may lead to information loss and reduce the model performance. In response to this gap, particularly in the context of audiometric phenotype prediction, this paper proposes new quadratic discriminant analysis (QDA) algorithms that appropriately deal with the dependence between ears. We propose two‐stage analysis strategies: (1) conducting data transformations to reduce data dimensionality before applying QDA; and (2) developing new QDA algorithms to partially utilize the dependence between phenotypes of two ears. We conducted simulation studies to compare different transformation methods and to assess the performance of different QDA algorithms. The empirical results suggested that the transformation may only be beneficial when the sample size is relatively small. Moreover, our proposed new QDA algorithms performed better than the conventional approach in both person‐level and ear‐level accuracy. As an illustration, we applied them to audiometric data from the Medical University of South Carolina Longitudinal Cohort Study of Age‐related Hearing Loss. In addition, we developed an R package, PairQDA, to implement the proposed algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. Advancing Interpretable Regression Analysis for Binary Data: A Novel Distributed Algorithm Approach.
- Author
- Tong, Jiayi, Li, Lu, Reps, Jenna Marie, Lorman, Vitaly, Jing, Naimin, Edmondson, Mackenzie, Lou, Xiwei, Jhaveri, Ravi, Kelleher, Kelly J., Pajor, Nathan M., Forrest, Christopher B., Bian, Jiang, Chu, Haitao, and Chen, Yong
- Subjects
- MACHINE learning, POISSON regression, DISTRIBUTED algorithms, ACADEMIC medical centers, REGRESSION analysis
- Abstract
Sparse data bias, where there is a lack of sufficient cases, is a common problem in data analysis, particularly when studying rare binary outcomes. Although a two‐step meta‐analysis approach may be used to lessen the bias by combining the summary statistics to increase the number of cases from multiple studies, this method does not completely eliminate bias in effect estimation. In this paper, we propose a one‐shot distributed algorithm for estimating relative risk using a modified Poisson regression for binary data, named ODAP‐B. We evaluate the performance of our method through both simulation studies and real‐world case analyses of postacute sequelae of SARS‐CoV‐2 infection in children using data from 184 501 children across eight national academic medical centers. Compared with the meta‐analysis method, our method provides closer estimates of the relative risk for all outcomes considered including syndromic and systemic outcomes. Our method is communication‐efficient and privacy‐preserving, requiring only aggregated data to obtain relatively unbiased effect estimates compared with two‐step meta‐analysis methods. Overall, ODAP‐B is an effective distributed learning algorithm for Poisson regression to study rare binary outcomes. The method provides inference on adjusted relative risk with a robust variance estimator. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
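ODAP-B builds on modified Poisson regression, i.e. a Poisson working model for a binary outcome whose exponentiated coefficients are relative risks, paired with a robust (sandwich) variance. A single-site statsmodels sketch of that building block is below; the one-shot distributed aggregation across sites is the paper's own algorithm and is not shown, and the data are simulated.

```python
# Modified Poisson regression for a binary outcome: log link + Poisson working
# likelihood gives relative risks; a robust (HC) sandwich variance repairs the
# standard errors that a plain Poisson model would get wrong for 0/1 data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(11)
n = 5000
exposure = rng.binomial(1, 0.4, size=n)
age = rng.normal(0, 1, size=n)
risk = np.minimum(0.05 * np.exp(0.6 * exposure + 0.2 * age), 0.95)  # true RR = exp(0.6)
y = rng.binomial(1, risk)

X = sm.add_constant(np.column_stack([exposure, age]))
fit = sm.GLM(y, X, family=sm.families.Poisson()).fit(cov_type="HC0")
print("estimated RR for exposure:", np.exp(fit.params[1]).round(3))
print("robust 95% CI:", np.exp(fit.conf_int()[1]).round(3))
```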
28. Asymptotic Properties of Matthews Correlation Coefficient.
- Author
- Itaya, Yuki, Tamura, Jun, Hayashi, Kenichi, and Yamamoto, Kouji
- Subjects
- STATISTICAL reliability, STATISTICAL significance, INFERENTIAL statistics, STATISTICAL correlation, EVIDENCE gaps
- Abstract
ABSTRACT Evaluating classifications is crucial in statistics and machine learning, as it influences decision‐making across various fields, such as patient prognosis and therapy in critical conditions. The Matthews correlation coefficient (MCC), also known as the phi coefficient, is recognized as a performance metric with high reliability, offering a balanced measurement even in the presence of class imbalances. Despite its importance, there remains a notable lack of comprehensive research on the statistical inference of MCC. This deficiency often leads to studies merely validating and comparing MCC point estimates—a practice that, while common, overlooks the statistical significance and reliability of results. Addressing this research gap, our paper introduces and evaluates several methods to construct asymptotic confidence intervals for the single MCC and the differences between MCCs in paired designs. Through simulations across various scenarios, we evaluate the finite‐sample behavior of these methods and compare their performances. Furthermore, through real data analysis, we illustrate the potential utility of our findings in comparing binary classifiers, highlighting the possible contributions of our research in this field. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
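The entry above derives asymptotic confidence intervals for the Matthews correlation coefficient. Those derivations are not reproduced here; the sketch below just computes the MCC with scikit-learn and attaches a simple percentile-bootstrap interval as a generic stand-in, using invented labels.

```python
# MCC point estimate plus a percentile-bootstrap interval (a generic alternative
# to the asymptotic intervals derived in the paper).
import numpy as np
from sklearn.metrics import matthews_corrcoef

rng = np.random.default_rng(2024)
y_true = rng.binomial(1, 0.3, size=500)                       # hypothetical labels
y_pred = np.where(rng.random(500) < 0.8, y_true, 1 - y_true)  # noisy classifier

mcc = matthews_corrcoef(y_true, y_pred)

boot = []
for _ in range(2000):
    idx = rng.integers(0, len(y_true), len(y_true))   # resample pairs with replacement
    boot.append(matthews_corrcoef(y_true[idx], y_pred[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"MCC = {mcc:.3f}, bootstrap 95% CI = ({lo:.3f}, {hi:.3f})")
```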
29. A Novel Bayesian Spatio‐Temporal Surveillance Metric to Predict Emerging Infectious Disease Areas of High Disease Risk.
- Author
- Kim, Joanne, Lawson, Andrew B., Neelon, Brian, Korte, Jeffrey E., Eberth, Jan M., and Chowell, Gerardo
- Subjects
- PUBLIC health surveillance, DISEASE outbreaks, COMMUNICABLE diseases, RESOURCE allocation, EMERGING infectious diseases, NEIGHBORHOODS
- Abstract
Identification of areas of high disease risk has been one of the top goals for infectious disease public health surveillance. Accurate prediction of these regions leads to effective resource allocation and faster intervention. This paper proposes a novel prediction surveillance metric based on a Bayesian spatio‐temporal model for infectious disease outbreaks. Exceedance probability, which has been commonly used for cluster detection in statistical epidemiology, was extended to predict areas of high risk. The proposed metric consists of three components: the area's risk profile, temporal risk trend, and spatial neighborhood influence. We also introduce a weighting scheme to balance these three components, which accommodates the characteristics of the infectious disease outbreak, spatial properties, and disease trends. Thorough simulation studies were conducted to identify the optimal weighting scheme and evaluate the performance of the proposed prediction surveillance metric. Results indicate that the area's own risk and the neighborhood influence play an important role in making a highly sensitive metric, and the risk trend term is important for the specificity and accuracy of prediction. The proposed prediction metric was applied to the COVID‐19 case data of South Carolina from March 12, 2020, and the subsequent 30 weeks of data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
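The surveillance metric above is built around exceedance probabilities, the posterior probability that an area's risk exceeds a threshold, combined with trend and neighbourhood terms. The sketch below shows only the exceedance-probability piece computed from posterior samples (here faked with a Gamma draw rather than a fitted spatio-temporal model); the three-component weighting scheme is the paper's own and is not shown.

```python
# Exceedance probability from posterior samples: P(theta_i > c | data) is
# approximated by the fraction of MCMC draws above the threshold c.
import numpy as np

rng = np.random.default_rng(5)
threshold = 1.0                       # flag areas whose relative risk exceeds 1

# Stand-in posterior draws for 4 areas (rows: draws, cols: areas); a real analysis
# would take these from the fitted Bayesian spatio-temporal model.
posterior = rng.gamma(shape=[2.0, 4.0, 1.5, 6.0], scale=[0.5, 0.5, 0.5, 0.5],
                      size=(5000, 4))

exceedance = (posterior > threshold).mean(axis=0)
for area, p in enumerate(exceedance, start=1):
    print(f"area {area}: P(RR > {threshold}) = {p:.3f}")
```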
30. Two-tailed significance tests for 2 × 2 contingency tables: What is the alternative?
- Author
- Prescott, Robin J
- Subjects
- STATISTICAL hypothesis testing, CONTINGENCY tables, FISHER exact test, CHI-squared test, NULL hypothesis, COMPUTER simulation, PROBABILITY theory, STATISTICS, DATA analysis
- Abstract
Two-tailed significance testing for 2 × 2 contingency tables has remained controversial. Within the medical literature, different tests are used in different papers and that choice may decide whether findings are adjudged to be significant or nonsignificant; a state of affairs that is clearly undesirable. In this paper, it is argued that a part of the controversy is due to a failure to recognise that there are two possible alternative hypotheses to the Null. It is further argued that, while one alternative hypothesis can lead to tests with greater power, the other choice is more applicable in medical research. That leads to the recommendation that, within medical research, 2 × 2 tables should be tested using double the one-tailed exact probability from Fisher's exact test or, as an approximation, the chi-squared test with Yates' correction for continuity. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
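The recommendation above (double the one-tailed Fisher exact probability, or use the chi-squared test with Yates' correction as an approximation) can be carried out directly with SciPy, as in the sketch below; the 2 × 2 counts are invented.

```python
# Doubled one-tailed Fisher exact test and Yates-corrected chi-squared for a 2x2 table.
import numpy as np
from scipy.stats import fisher_exact, chi2_contingency

table = np.array([[12, 28],    # hypothetical counts: treatment  (event, no event)
                  [25, 19]])   #                      control    (event, no event)

_, p_less = fisher_exact(table, alternative="less")
_, p_greater = fisher_exact(table, alternative="greater")
p_doubled = min(1.0, 2 * min(p_less, p_greater))   # the recommended two-tailed p

chi2, p_yates, _, _ = chi2_contingency(table, correction=True)
_, p_default = fisher_exact(table)                 # usual two-tailed Fisher, for contrast

print(f"doubled one-tailed Fisher p = {p_doubled:.4f}")
print(f"Yates-corrected chi-squared p = {p_yates:.4f}")
print(f"default two-tailed Fisher p = {p_default:.4f}")
```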
31. Commentary: Methods for calculating growth trajectories and constructing growth centiles.
- Author
- Cole, T. J.
- Subjects
- PERCENTILES
- Abstract
This commentary rounds off a collection of papers focusing on statistical methods for analysing growth data. In two papers, Anderson and colleagues discuss growth trajectory models in early life, using data on height and weight from the HBGDki initiative, while two papers from Ohuma and Altman review methods for centile construction, with data from the INTERGROWTH-21st project used to provide worked examples of centiles for birthweight and fetal head circumference. Anderson et al focus on four growth trajectory models: quadratic Laird-Ware, SITAR, brokenstick, and FACE, where the latter two fit better than the former two applied to length data in individuals. On this basis, they recommend brokenstick and FACE for future work. However, they do not discuss the timescale on which the growth models assess growth faltering nor the relevance of this timescale to later health outcome. Models that best detect short-term fluctuations in growth (brokenstick and FACE) may not necessarily be best at predicting later outcome. It is premature to exclude the quadratic Laird-Ware or SITAR models, which give a parsimonious summary of growth in individuals over a longer timescale. Ohuma and Altman highlight the poor quality of reporting in fetal centile studies, and they provide recommendations for good practice. Their birthweight centiles example illustrates both the power of the GAMLSS software and its capacity for misuse. The longitudinal fetal head circumference centiles are biased such that 5% of infants are below the 3rd centile and 5% above the 97th . [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
32. Three‐phase generalized raking and multiple imputation estimators to address error‐prone data.
- Author
- Amorim, Gustavo, Tao, Ran, Lotspeich, Sarah, Shaw, Pamela A., Lumley, Thomas, Patel, Rena C., and Shepherd, Bryan E.
- Subjects
- ELECTRONIC health records, HIV-positive women, TELEPHONE interviewing
- Abstract
Validation studies are often used to obtain more reliable information in settings with error‐prone data. Validated data on a subsample of subjects can be used together with error‐prone data on all subjects to improve estimation. In practice, more than one round of data validation may be required, and direct application of standard approaches for combining validation data into analyses may lead to inefficient estimators since the information available from intermediate validation steps is only partially considered or even completely ignored. In this paper, we present two novel extensions of multiple imputation and generalized raking estimators that make full use of all available data. We show through simulations that incorporating information from intermediate steps can lead to substantial gains in efficiency. This work is motivated by and illustrated in a study of contraceptive effectiveness among 83 671 women living with HIV, whose data were originally extracted from electronic medical records, of whom 4732 had their charts reviewed, and a subsequent 1210 also had a telephone interview to validate key study variables. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. A frequentist design for basket trials using adaptive lasso.
- Author
- Kanapka, Lauren and Ivanova, Anastasia
- Subjects
- FALSE positive error, BASKETS, ERROR rates, DRUG development
- Abstract
A basket trial aims to expedite the drug development process by evaluating a new therapy in multiple populations within the same clinical trial. Each population, referred to as a "basket", can be defined by disease type, biomarkers, or other patient characteristics. The objective of a basket trial is to identify the subset of baskets for which the new therapy shows promise. The conventional approach would be to analyze each of the baskets independently. Alternatively, several Bayesian dynamic borrowing methods have been proposed that share data across baskets when responses appear similar. These methods can achieve higher power than independent testing in exchange for a risk of some inflation in the type 1 error rate. In this paper we propose a frequentist approach to dynamic borrowing for basket trials using adaptive lasso. Through simulation studies we demonstrate adaptive lasso can achieve similar power and type 1 error to the existing Bayesian methods. The proposed approach has the benefit of being easier to implement and faster than existing methods. In addition, the adaptive lasso approach is very flexible: it can be extended to basket trials with any number of treatment arms and any type of endpoint. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. Testing latent classes in gut microbiome data using generalized Poisson regression models.
- Author
- Qiao, Xinhui, He, Hua, Sun, Liuquan, Bai, Shuo, and Ye, Peng
- Subjects
- POISSON regression, GUT microbiome, REGRESSION analysis, HUMAN microbiota, ASYMPTOTIC distribution
- Abstract
Human microbiome research has gained increasing importance due to its critical roles in comprehending human health and disease. Within the realm of microbiome research, the data generated often involves operational taxonomic unit counts, which can frequently present challenges such as over‐dispersion and zero‐inflation. To address dispersion‐related concerns, the generalized Poisson model offers a flexible solution, effectively handling data characterized by over‐dispersion, equi‐dispersion, and under‐dispersion. Furthermore, the realm of zero‐inflated generalized Poisson models provides a strategic avenue to simultaneously tackle both over‐dispersion and zero‐inflation. The phenomenon of zero‐inflation frequently stems from the heterogeneous nature of study populations. It emerges when specific microbial taxa fail to thrive in the microbial community of certain subjects, consequently resulting in a consistent count of zeros for these individuals. This subset of subjects represents a latent class, where their zeros originate from the genuine absence of the microbial taxa. In this paper, we introduce a novel testing methodology designed to uncover such latent classes within generalized Poisson regression models. We establish a closed‐form test statistic and deduce its asymptotic distribution based on estimating equations. To assess its efficacy, we conduct an extensive array of simulation studies, and further apply the test to detect latent classes in human gut microbiome data from the Bogalusa Heart Study. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. Local false discovery rate estimation with competition‐based procedures for variable selection.
- Author
- Sun, Xiaoya and Fu, Yan
- Subjects
- FALSE discovery rate, COVID-19, ERROR rates, DRUG resistance
- Abstract
Multiple hypothesis testing has been widely applied to problems dealing with high‐dimensional data, for example, the selection of important variables or features from a large number of candidates while controlling the error rate. The most prevailing measure of error rate used in multiple hypothesis testing is the false discovery rate (FDR). In recent years, the local false discovery rate (fdr) has drawn much attention, due to its advantage of accessing the confidence of individual hypotheses. However, most methods estimate fdr through P‐values or statistics with known null distributions, which are sometimes unavailable or unreliable. Adopting the innovative methodology of competition‐based procedures, for example, the knockoff filter, this paper proposes a new approach, named TDfdr, to fdr estimation, which is free of P‐values or known null distributions. Extensive simulation studies demonstrate that TDfdr can accurately estimate the fdr with two competition‐based procedures. We applied the TDfdr method to two real biomedical tasks. One is to identify significantly differentially expressed proteins related to the COVID‐19 disease, and the other is to detect mutations in the genotypes of HIV‐1 that are associated with drug resistance. Higher discovery power was observed compared to existing popular methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
36. On semiparametric accelerated failure time models with time‐varying covariates: A maximum penalised likelihood estimation.
- Author
- Ma, Ding, Ma, Jun, and Graham, Petra L.
- Subjects
- MAXIMUM likelihood statistics, PROPORTIONAL hazards models, MOTOR neuron diseases, CONSTRAINED optimization, LOG-linear models
- Abstract
The accelerated failure time (AFT) model offers an important and useful alternative to the conventional Cox proportional hazards model, particularly when the proportional hazards assumption for a Cox model is violated. Since an AFT model is basically a log‐linear model, meaningful interpretations of covariate effects on failure times can be made directly. However, estimation of a semiparametric AFT model imposes computational challenges even when it only has time‐fixed covariates, and the situation becomes much more complicated when time‐varying covariates are included. In this paper, we propose a penalised likelihood approach to estimate the semiparametric AFT model with right‐censored failure time, where both time‐fixed and time‐varying covariates are permitted. We adopt the Gaussian basis functions to construct a smooth approximation to the nonparametric baseline hazard. This model fitting method requires a constrained optimisation approach. A comprehensive simulation study is conducted to demonstrate the performance of the proposed method. An application of our method to a motor neuron disease data set is provided. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
37. Joint clustering multiple longitudinal features: A comparison of methods and software packages with practical guidance.
- Author
- Lu, Zihang, Ahmadiankalati, Mojtaba, and Tan, Zhiwen
- Subjects
- INTEGRATED software, CLUSTER analysis (Statistics), RESEARCH personnel, K-means clustering
- Abstract
Clustering longitudinal features is a common goal in medical studies to identify distinct disease developmental trajectories. Compared to clustering a single longitudinal feature, integrating multiple longitudinal features allows additional information to be incorporated into the clustering process, which may reveal co‐existing longitudinal patterns and generate deeper biological insight. Despite its increasing importance and popularity, there is limited practical guidance for implementing cluster analysis approaches for multiple longitudinal features and evaluating their comparative performance in medical datasets. In this paper, we provide an overview of several commonly used approaches to clustering multiple longitudinal features, with an emphasis on application and implementation through R software. These methods can be broadly categorized into two categories, namely model‐based (including frequentist and Bayesian) approaches and algorithm‐based approaches. To evaluate their performance, we compare these approaches using real‐life and simulated datasets. These results provide practical guidance to applied researchers who are interested in applying these approaches for clustering multiple longitudinal features. Recommendations for applied researchers and suggestions for future research in this area are also discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
38. Assessable and interpretable sensitivity analysis in the pattern graph framework for nonignorable missingness mechanisms.
- Author
- Zamanian, Alireza, Ahmidi, Narges, and Drton, Mathias
- Subjects
- SENSITIVITY analysis, MISSING data (Statistics), MEDICAL research
- Abstract
The pattern graph framework solves a wide range of missing data problems with nonignorable mechanisms. However, it faces two challenges of assessability and interpretability, particularly important in safety‐critical problems such as clinical diagnosis: (i) How can one assess the validity of the framework's a priori assumption and make necessary adjustments to accommodate known information about the problem? (ii) How can one interpret the process of exponential tilting used for sensitivity analysis in the pattern graph framework and choose the tilt perturbations based on meaningful real‐world quantities? In this paper, we introduce Informed Sensitivity Analysis, an extension of the pattern graph framework that enables us to incorporate substantive knowledge about the missingness mechanism into the pattern graph framework. Our extension allows us to examine the validity of assumptions underlying pattern graphs and interpret sensitivity analysis results in terms of realistic problem characteristics. We apply our method to a prevalent nonignorable missing data scenario in clinical research. We validate and compare the results of our method with a number of widely‐used missing data methods, including Unweighted CCA, KNN Imputer, MICE, and MissForest. The validation is done using both bootstrapped simulated experiments and real‐world clinical observations in the MIMIC‐III public dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
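The exponential tilting referred to in the preceding abstract has, in its generic form (notation assumed here rather than taken from the paper), the following shape: the distribution of an incompletely observed variable among non-respondents is tied to that among respondents through a sensitivity parameter \gamma,

    p(y \mid R = 0) \;\propto\; p(y \mid R = 1)\, e^{\gamma y},

so that \gamma = 0 corresponds to no tilt and larger |\gamma| encodes stronger nonignorable departures. The paper's contribution is to choose and interpret such tilts using substantive knowledge about the missingness mechanism.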
39. Semiparametric normal transformation joint model of multivariate longitudinal and bivariate time‐to‐event data.
- Author
-
Tang, An‐Ming, Peng, Cheng, and Tang, Niansheng
- Subjects
GIBBS sampling ,HAZARD function (Statistics) ,RANDOM variables ,BREAST cancer ,CLINICAL trials - Abstract
Joint models for longitudinal and survival data (JMLSs) have been widely used in recent years to investigate the relationship between longitudinal and survival outcomes in clinical trials. However, existing studies mainly focus on independent survival data, whereas in many clinical trials the survival data may be bivariately correlated. To address this, this paper proposes a novel JMLS accommodating multivariate longitudinal and bivariate correlated time-to-event data. Nonparametric marginal survival hazard functions are transformed to bivariate normal random variables. Bayesian penalized splines are employed to approximate unknown baseline hazard functions. Incorporating the Metropolis-Hastings algorithm into the Gibbs sampler, we develop a Bayesian adaptive Lasso method to simultaneously estimate parameters and baseline hazard functions, and to select important predictors in the considered JMLS. Simulation studies and an example taken from the International Breast Cancer Study Group are used to illustrate the proposed methodologies. [ABSTRACT FROM AUTHOR] (A generic joint-model skeleton is sketched after this record.)
- Published
- 2023
- Full Text
- View/download PDF
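As background for the preceding entry, a generic shared-random-effects joint model skeleton (a simplification assumed here, not the authors' normal-transformation construction) links a longitudinal submodel and a proportional-hazards survival submodel through subject-level random effects b_i:

    y_i(t) = \mathbf{x}_i(t)^\top \boldsymbol{\beta} + \mathbf{z}_i(t)^\top \mathbf{b}_i + \epsilon_i(t),
    \qquad
    h_i(t) = h_0(t) \exp\{ \mathbf{w}_i^\top \boldsymbol{\gamma} + \alpha\, m_i(t) \},

where m_i(t) denotes the current value of the longitudinal process. The proposed JMLS extends this idea to multivariate longitudinal outcomes and two correlated event times, with the dependence between the event times induced by a semiparametric normal transformation.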
40. Ball divergence for the equality test of crossing survival curves.
- Author
-
You, Na, He, Xueyi, Dai, Hongsheng, and Wang, Xueqin
- Subjects
LOG-rank test ,HAZARD function (Statistics) ,CENSORING (Statistics) - Abstract
Testing the equality of survival distributions using right-censored time-to-event data is a very common problem in clinical research. Although the log-rank test is popular across a wide range of studies, it may become insensitive when the proportional hazards assumption is violated. Consequently, a variety of statistical methods have been proposed to identify discrepancies between crossing survival curves or hazard functions. Omnibus tests against general alternatives are usually preferred because of their wide applicability to the complicated scenarios that arise in real applications. In this paper, we propose two novel statistics to estimate the ball divergence from right-censored survival data, and then apply them to test the equality of survival times between two independent groups. A simulation analysis demonstrates their efficiency in identifying survival discrepancies. Compared to existing methods, the proposed methods show higher power in situations with complex distributions, especially when there is a scale shift between groups. Real examples illustrate their advantages in practical applications. [ABSTRACT FROM AUTHOR] (A log-rank illustration in R follows this record.)
- Published
- 2023
- Full Text
- View/download PDF
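As a point of comparison for the preceding entry, the sketch below applies the standard log-rank test, and a weighted G-rho variant, to hypothetical right-censored two-group data with crossing hazards using the survival package. It illustrates the conventional tests whose insensitivity motivates the paper, not the proposed ball-divergence statistics.

    library(survival)
    # Hypothetical two-group data with crossing hazards (illustrative only)
    set.seed(2)
    n <- 200
    group  <- rep(0:1, each = n / 2)
    event  <- c(rexp(n / 2, rate = 1),                    # group 0: constant hazard
                rweibull(n / 2, shape = 3, scale = 1))    # group 1: increasing hazard
    cens   <- runif(n, 0, 2)                              # independent censoring times
    status <- as.numeric(event <= cens)                   # 1 = event observed, 0 = censored
    obs    <- pmin(event, cens)                           # observed follow-up time
    survdiff(Surv(obs, status) ~ group)                   # standard log-rank test (rho = 0)
    survdiff(Surv(obs, status) ~ group, rho = 1)          # Peto-Peto-type weighted variant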
41. Joint semiparametric kernel network regression.
- Author
-
Kim, Byung‐Jun and Kim, Inyoung
- Subjects
SPARSE matrices ,PARAMETRIC modeling ,SAMPLE size (Statistics) ,DATA visualization ,DATA analysis - Abstract
Variable selection and graphical modeling play essential roles in highly correlated and high-dimensional (HCHD) data analysis. Variable selection methods have been developed under both parametric and nonparametric model settings. However, variable selection for nonadditive, nonparametric regression with high-dimensional variables is challenging due to complications in modeling unknown dependence structures among HCHD variables. Gaussian graphical models are a popular and useful tool for investigating the conditional dependence between variables via estimation of sparse precision matrices. For a given class of interest, the estimated precision matrices can be mapped onto networks for visualization. However, Gaussian graphical models are limited to discretized response variables and to the case where p log(p) ≪ n, with p the number of variables and n the sample size. It is therefore necessary to develop a joint method for variable selection and graphical modeling. To the best of our knowledge, methods that simultaneously perform variable selection and estimate networks among variables in semiparametric regression settings are quite limited. Hence, in this paper, we develop a joint semiparametric kernel network regression method to address this limitation and to provide a connection between the two tasks. Our approach is a unified and integrated method that can simultaneously identify important variables and build a network among those variables. We develop our approach under a semiparametric kernel machine regression framework, which can allow for nonlinear or nonadditive associations and complicated interactions among the variables. The advantages of our approach are that it can (1) simultaneously select variables and build a network among HCHD variables under a regression setting; (2) model unknown and complicated interactions among the variables and estimate the network among these variables; (3) allow for any form of semiparametric model, including nonadditive, nonparametric models; and (4) provide an interpretable network that considers important variables and a response variable. We demonstrate our approach using a simulation study and a real application to genetic pathway-based analysis. [ABSTRACT FROM AUTHOR] (The conditional-independence relation behind Gaussian graphical models is sketched after this record.)
- Published
- 2023
- Full Text
- View/download PDF
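For context on the Gaussian graphical models discussed in the preceding abstract, the defining relation (standard notation, assumed here) is that zero entries of the precision matrix encode conditional independence, which is what allows a sparse precision estimate to be mapped onto a network:

    X_j \perp\!\!\!\perp X_k \mid X_{-\{j,k\}} \;\Longleftrightarrow\; \Theta_{jk} = 0,
    \qquad \Theta = \Sigma^{-1},

for a multivariate Gaussian vector with covariance \Sigma. The paper's joint method embeds this network estimation within a semiparametric kernel regression rather than treating it as a separate step.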
42. Generalized additive models to analyze nonlinear trends in biomedical longitudinal data using R: Beyond repeated measures ANOVA and linear mixed models.
- Author
-
Mundo, Ariel I., Tipton, John R., and Muldoon, Timothy J.
- Abstract
In biomedical research, the outcomes of longitudinal studies have traditionally been analyzed using repeated measures analysis of variance (rm-ANOVA) or, more recently, linear mixed models (LMEMs). Although LMEMs are less restrictive than rm-ANOVA, as they can accommodate unbalanced data and non-constant correlation between observations, both methodologies assume a linear trend in the measured response. It is common in biomedical research that the true response trend is nonlinear, and in these cases the linearity assumption of rm-ANOVA and LMEMs can lead to biased estimates and unreliable inference. In contrast, generalized additive models (GAMs) relax the linearity assumption of rm-ANOVA and LMEMs and allow the data to determine the fit of the model, while also permitting incomplete observations and different correlation structures. Therefore, GAMs present an excellent choice for analyzing longitudinal data with nonlinear trends in the context of biomedical research. This paper summarizes the limitations of rm-ANOVA and LMEMs and uses simulated data to show visually how both methods produce biased estimates when used on data with nonlinear trends. We present the basic theory of GAMs and, using reported trends of oxygen saturation in tumors, simulate example longitudinal data (2 treatment groups, 10 subjects per group, 5 repeated measures per group) to demonstrate their implementation in R. We also show that GAMs can produce estimates with nonlinear trends even when incomplete observations exist (with 40% of the simulated observations missing). To make this work reproducible, the code and data used in this paper are available at: https://github.com/aimundo/GAMs-biomedical-research. [ABSTRACT FROM AUTHOR] (A minimal mgcv sketch follows this record.)
- Published
- 2022
- Full Text
- View/download PDF
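A minimal mgcv sketch mirroring the design described in the preceding abstract (2 treatment groups, 10 subjects per group, 5 repeated measures); the simulated trends, basis dimension, and random-intercept term are assumptions of this sketch, and the authors' actual code is available at the linked repository.

    library(mgcv)
    # Hypothetical data: 2 groups, 10 subjects per group, 5 visits per subject
    set.seed(3)
    d <- expand.grid(subject = 1:20, time = 1:5)
    d$group   <- factor(ifelse(d$subject <= 10, "treatment", "control"))
    d$subject <- factor(d$subject)                       # factor needed for the random effect
    d$y <- ifelse(d$group == "treatment", sin(d$time / 2), 0.1 * d$time) +
           rnorm(nrow(d), sd = 0.2)
    # One smooth trend per group plus a subject-level random intercept, REML smoothing
    fit <- gam(y ~ group + s(time, by = group, k = 4) + s(subject, bs = "re"),
               data = d, method = "REML")
    summary(fit)
    plot(fit, pages = 1)                                 # fitted trends by group

Because the smooths are estimated from the data, the fitted trends need not be linear, which is the point of the GAM approach described above.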
43. Comments on "Tutorial on statistical considerations on subgroup analysis in confirmatory clinical trials".
- Author
-
Zhang, Chuanwu, Mayo, Matthew S., and Gajewski, Byron J.
- Abstract
This paper is the letter to the editor regarding several comments on 'Tutorial on statistical considerations on subgroup analysis in confirmatory clinical trials.' [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
44. Dynamic path analysis for exploring treatment effect mediation processes in clinical trials with time‐to‐event endpoints.
- Author
-
Kormaksson, Matthias, Lange, Markus Reiner, Demanse, David, Strohmaier, Susanne, Duan, Jiawei, Xie, Qing, Carbini, Mariana, Bossen, Claudia, Guettner, Achim, and Maniero, Antonella
- Subjects
- *
PATH analysis (Statistics) , *TREATMENT effectiveness , *SURVIVAL rate , *PROGNOSIS , *DRUG development - Abstract
Why does a beneficial treatment effect on a longitudinal biomarker not translate into overall treatment benefit on survival, when the biomarker is in fact a prognostic factor of survival? In a recent exploratory data analysis in oncology, we were faced with this seemingly paradoxical result. To address this problem, we applied a theoretically principled methodology called dynamic path analysis, which allows us to perform mediation analysis with a longitudinal mediator and a survival outcome. The aim of the analysis is to decompose the total treatment effect into a direct treatment effect and an indirect treatment effect mediated through a carefully constructed mediation path. The dynamic nature of the underlying methodology enables us to describe how these effects evolve over time, which can add to the mechanistic understanding of the underlying processes. In this paper, we present a detailed description of the dynamic path analysis framework and illustrate its application to survival mediation analysis using simulated and real data. The use case analysis provides clarity on the specific exploratory question of interest, while the methodology generalizes to a wide range of applications in drug development where time-to-event is the primary clinical outcome of interest. [ABSTRACT FROM AUTHOR] (The generic path-analysis decomposition is sketched after this record.)
- Published
- 2024
- Full Text
- View/download PDF
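The decomposition at the heart of the preceding entry can be summarised, in simplified generic path-analysis form (notation assumed here, not the authors' full time-dynamic formulation), as a split of the total treatment effect into a direct path and a path mediated through the longitudinal biomarker M:

    \text{total}(t) \;=\; \underbrace{\theta_{A \to Y}(t)}_{\text{direct}}
    \;+\; \underbrace{\theta_{A \to M}(t)\, \theta_{M \to Y}(t)}_{\text{indirect, mediated}},

with every component allowed to change over follow-up time t, which is what lets the analysis describe how the direct and mediated effects evolve.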
45. Evaluating individualized treatment effect predictions: A model‐based perspective on discrimination and calibration assessment.
- Author
-
Hoogland, J., Efthimiou, O., Nguyen, T. L., and Debray, T. P. A.
- Subjects
- *
ISCHEMIC stroke , *TREATMENT effectiveness , *PREDICTION models , *CALIBRATION , *MODEL validation - Abstract
In recent years, there has been a growing interest in the prediction of individualized treatment effects. While there is a rapidly growing literature on the development of such models, there is little literature on the evaluation of their performance. In this paper, we aim to facilitate the validation of prediction models for individualized treatment effects. The estimands of interest are defined based on the potential outcomes framework, which facilitates a comparison of existing and novel measures. In particular, we examine existing measures of discrimination for benefit (variations of the c-for-benefit), and propose model-based extensions to the treatment effect setting for discrimination and calibration metrics that have a strong basis in outcome risk prediction. The main focus is on randomized trial data with binary endpoints and on models that provide individualized treatment effect predictions and potential outcome predictions. We use simulated data to provide insight into the characteristics of the discrimination and calibration statistics under consideration, and further illustrate all methods in a trial of acute ischemic stroke treatment. The results show that the proposed model-based statistics had the best characteristics in terms of bias and accuracy. While resampling methods adjusted for the optimism of performance estimates in the development data, they had a high variance across replications that limited their accuracy. Therefore, individualized treatment effect models are best validated in independent data. To support application of these methods, a software implementation was made available in R. [ABSTRACT FROM AUTHOR] (The underlying estimand is sketched after this record.)
- Published
- 2024
- Full Text
- View/download PDF
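The estimand at the centre of the preceding entry is, in standard potential-outcomes notation (not specific to the paper), the individualized (conditional average) treatment effect that the discussed discrimination and calibration measures aim to validate:

    \tau(\mathbf{x}) \;=\; \mathbb{E}\left[ Y(1) - Y(0) \mid \mathbf{X} = \mathbf{x} \right],

where Y(1) and Y(0) denote the potential outcomes under treatment and control. Loosely, discrimination asks whether predicted \tau(\mathbf{x}) ranks patients by their actual benefit, and calibration asks whether its magnitude is correct on average.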
46. Modern approaches for evaluating treatment effect heterogeneity from clinical trials and observational data.
- Author
-
Lipkovich, Ilya, Svensson, David, Ratitch, Bohdana, and Dmitrienko, Alex
- Subjects
- *
TREATMENT effect heterogeneity , *CLINICAL trials , *EVALUATION methodology , *SCIENTIFIC observation - Abstract
In this paper, we review recent advances in statistical methods for the evaluation of the heterogeneity of treatment effects (HTE), including subgroup identification and estimation of individualized treatment regimens, from randomized clinical trials and observational studies. We identify several types of approaches using the features introduced in Lipkovich et al. (Stat Med 2017;36:136-196) that distinguish the recommended principled methods from basic methods for HTE evaluation that typically rely on rules of thumb and general guidelines (the latter are often referred to as common practices). We discuss the advantages and disadvantages of various principled methods, as well as common measures for evaluating their performance. We use simulated data and a case study based on a historical clinical trial to illustrate several new approaches to HTE evaluation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
47. Validation of predicted individual treatment effects in out of sample respondents.
- Author
-
Kuhlemeier, Alena, Jaki, Thomas, Witkiewitz, Katie, Stuart, Elizabeth A., and Van Horn, M. Lee
- Subjects
- *
ALCOHOLISM , *INDIVIDUALIZED medicine , *CLINICAL trials , *DRUG therapy , *NUISANCES - Abstract
Personalized medicine promises the ability to improve patient outcomes by tailoring treatment recommendations to the likelihood that any given patient will respond well to a given treatment. It is important that predictions of treatment response be validated and replicated in independent data to support their use in clinical practice. In this paper, we propose and test an approach for validating predictions of individual treatment effects with continuous outcomes across samples, which uses matching in a test (validation) sample to pair individuals in the treatment and control arms based on their predicted treatment response and their predicted response under control. To examine the proposed validation approach, we conducted simulations where the test data are generated from either an identical, a similar, or an unrelated process to the training data. We also examined the impact of nuisance variables. To demonstrate the use of this validation procedure in the context of predicting individual treatment effects in the treatment of alcohol use disorder, we apply it to data from a clinical trial of combined behavioral and pharmacotherapy treatments. We find that the validation algorithm accurately confirms validation and lack of validation, and also provides insights into cases where test data were generated under similar, but not identical, conditions. We also show that the presence of nuisance variables detrimentally impacts algorithm performance, which can be partially mitigated through the use of variable selection methods. An advantage of the approach is that it can be widely applied to different predictive methods. [ABSTRACT FROM AUTHOR] (A base-R matching sketch follows this record.)
- Published
- 2024
- Full Text
- View/download PDF
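A minimal base-R sketch of the kind of test-sample matching described in the preceding abstract: treated and control respondents are paired on their predicted treatment effect and predicted outcome under control, and the observed within-pair contrast is compared with the predicted benefit. All object names are hypothetical, and the authors' algorithm may differ in its matching and summary steps.

    # Hypothetical inputs for the test (validation) sample:
    #   y        observed continuous outcome
    #   trt      1 = treatment arm, 0 = control arm
    #   pred_tau predicted individual treatment effect
    #   pred_y0  predicted outcome under control
    match_validate <- function(y, trt, pred_tau, pred_y0) {
      treated <- which(trt == 1)
      control <- which(trt == 0)
      used    <- logical(length(control))
      pairs   <- matrix(NA_integer_, nrow = 0, ncol = 2)
      for (i in treated) {                      # greedy 1:1 nearest-neighbour matching
        d <- (pred_tau[control] - pred_tau[i])^2 + (pred_y0[control] - pred_y0[i])^2
        d[used] <- Inf                          # each control is used at most once
        j <- which.min(d)
        if (is.finite(d[j])) {
          used[j] <- TRUE
          pairs   <- rbind(pairs, c(i, control[j]))
        }
      }
      observed_benefit  <- y[pairs[, 1]] - y[pairs[, 2]]
      predicted_benefit <- (pred_tau[pairs[, 1]] + pred_tau[pairs[, 2]]) / 2
      cor(predicted_benefit, observed_benefit)  # crude agreement between predicted and observed
    }

A high correlation between predicted and observed within-pair benefit would support validation in the test sample; the paper's procedure formalises this comparison.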
48. Identification and estimation of causal effects in the presence of confounded principal strata.
- Author
-
Luo, Shanshan, Li, Wei, Miao, Wang, and He, Yangbo
- Subjects
- *
CAUSAL inference , *SCIENTIFIC observation , *LEUKEMIA - Abstract
Principal stratification has become a popular tool to address a broad class of causal inference questions, particularly in dealing with non-compliance and truncation-by-death problems. The causal effects within principal strata, which are determined by the joint potential values of the intermediate variable and are also known as principal causal effects, are often of interest in these studies. The analysis of principal causal effects from observational studies mostly relies on the ignorability assumption of treatment assignment, which requires practitioners to accurately measure as many covariates as possible so that all potential sources of confounding are captured. In practice, however, collecting all potential confounding factors can be challenging and costly, rendering the ignorability assumption questionable. In this paper, we consider the identification and estimation of causal effects when treatment and principal stratification are subject to unmeasured confounding. Specifically, we establish nonparametric identification of principal causal effects using a pair of negative controls to mitigate the unmeasured confounding, requiring that they have no direct effect on the outcome variable. We also provide an estimation method for principal causal effects. Extensive simulations and a leukemia study are employed for illustration. [ABSTRACT FROM AUTHOR] (The principal-strata notation is sketched after this record.)
- Published
- 2024
- Full Text
- View/download PDF
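For reference on the preceding entry, principal strata are defined (in standard notation assumed here) by the joint potential values of the post-treatment intermediate variable D, and the principal causal effects are treatment contrasts within those strata:

    \text{stratum } (d_1, d_0) = \{ i : D_i(1) = d_1,\; D_i(0) = d_0 \},
    \qquad
    \mathrm{PCE}_{d_1 d_0} = \mathbb{E}\left[ Y(1) - Y(0) \mid D(1) = d_1,\, D(0) = d_0 \right].

The paper's contribution is to identify such effects when treatment and stratum membership share unmeasured confounders, using a pair of negative controls that have no direct effect on the outcome.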
49. Using instruments for selection to adjust for selection bias in Mendelian randomization.
- Author
-
Gkatzionis, Apostolos, Tchetgen Tchetgen, Eric J., Heron, Jon, Northstone, Kate, and Tilling, Kate
- Subjects
- *
MISSING data (Statistics) , *LEAST squares , *BODY mass index , *GENETIC variation , *SCIENTIFIC observation - Abstract
Selection bias is a common concern in epidemiologic studies. In the literature, selection bias is often viewed as a missing data problem. Popular approaches to adjust for bias due to missing data, such as inverse probability weighting, rely on the assumption that data are missing at random and can yield biased results if this assumption is violated. In observational studies with outcome data missing not at random, Heckman's sample selection model can be used to adjust for bias due to missing data. In this paper, we review Heckman's method and a similar approach proposed by Tchetgen Tchetgen and Wirth (2017). We then discuss how to apply these methods to Mendelian randomization analyses using individual-level data, with missing data for the exposure, the outcome, or both. We explore whether genetic variants associated with participation can be used as instruments for selection. We then describe how to obtain missingness-adjusted Wald ratio, two-stage least squares, and inverse variance weighted estimates. The two methods are evaluated and compared in simulations, with results suggesting that they can both mitigate selection bias but may yield parameter estimates with large standard errors in some settings. In an illustrative real-data application, we investigate the effects of body mass index on smoking using data from the Avon Longitudinal Study of Parents and Children. [ABSTRACT FROM AUTHOR] (The standard Wald ratio and IVW estimators are sketched after this record.)
- Published
- 2024
- Full Text
- View/download PDF
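As a reminder of the unadjusted estimators that the preceding entry adapts for selection, the Wald ratio for a single genetic variant and the inverse-variance weighted (IVW) combination across variants j take the standard forms below (these are not the missingness-adjusted versions proposed in the paper):

    \hat\beta_{\mathrm{Wald}} = \frac{\hat\beta_{GY}}{\hat\beta_{GX}},
    \qquad
    \hat\beta_{\mathrm{IVW}} = \frac{\sum_j \hat\beta_{GX,j}\, \hat\beta_{GY,j}\, \sigma_{GY,j}^{-2}}
                                    {\sum_j \hat\beta_{GX,j}^{2}\, \sigma_{GY,j}^{-2}},

where \hat\beta_{GX,j} and \hat\beta_{GY,j} are the variant-exposure and variant-outcome association estimates and \sigma_{GY,j} is the standard error of the latter.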
50. BHAFT: Bayesian heredity‐constrained accelerated failure time models for detecting gene‐environment interactions in survival analysis.
- Author
-
Sun, Na, Chu, Jiadong, He, Qida, Wang, Yu, Han, Qiang, Yi, Nengjun, Zhang, Ruyang, and Shen, Yueping
- Subjects
- *
SURVIVAL rate , *SURVIVAL analysis (Biometry) , *ETIOLOGY of diseases , *HORSESHOES , *LUNGS - Abstract
In addition to considering the main effects, understanding gene-environment (G × E) interactions is imperative for determining the etiology of diseases and the factors that affect their prognosis. In the existing statistical framework for censored survival outcomes, there are several challenges in detecting G × E interactions, such as handling high-dimensional omics data, diverse environmental factors, and algorithmic complications in survival analysis. The effect heredity principle has been widely used in studies involving interaction identification because it incorporates the dependence of the main and interaction effects. However, Bayesian survival models that incorporate the assumption of this principle have not been developed. Therefore, we propose Bayesian heredity-constrained accelerated failure time (BHAFT) models for identifying main and interaction (M-I) effects with novel spike-and-slab or regularized horseshoe priors that incorporate the effect heredity assumption. The R package rstan was used to fit the proposed models. Extensive simulations demonstrated that BHAFT models outperformed existing models in terms of signal identification, coefficient estimation, and prognosis prediction. Biologically plausible G × E interactions associated with the prognosis of lung adenocarcinoma were identified using our proposed model. Notably, BHAFT models incorporating the effect heredity principle could identify both main and interaction effects, which is highly useful in exploring G × E interactions in high-dimensional survival analysis. The code and data used in our paper are available at https://github.com/SunNa-bayesian/BHAFT. [ABSTRACT FROM AUTHOR] (A generic AFT formulation with G × E interactions is sketched after this record.)
- Published
- 2024
- Full Text
- View/download PDF
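The model class in the preceding entry can be summarised generically (notation assumed here) as an accelerated failure time regression with genetic main effects, environmental main effects, and G × E interaction terms, with the heredity principle constraining which interactions may enter:

    \log T_i = \beta_0 + \sum_j \beta_j G_{ij} + \sum_k \alpha_k E_{ik}
               + \sum_{j,k} \gamma_{jk}\, G_{ij} E_{ik} + \varepsilon_i,

where strong heredity allows \gamma_{jk} \neq 0 only if both corresponding main effects are nonzero, and weak heredity requires at least one of them to be. The spike-and-slab and regularized horseshoe priors described in the abstract encode this dependence between main and interaction effects.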