631 results
Search Results
2. Discussion of papers of session 2: Incomplete, multivariate, non-linear
- Author
-
Ralph B. D'Agostino
- Subjects
Statistics and Probability, Multivariate statistics, Epidemiology, Mathematics - Published
- 1992
- Full Text
- View/download PDF
3. Multivariate analysis of variance and repeated measures, a practical approach for behavioural scientists. D. J. Hand and C. C. Taylor, Chapman & Hall, London, 1987. No. of pages: xiv + 262. Price: £25.00 (hard), £13.95 (paper)
- Author
-
Thomas A. Louis
- Subjects
Statistics and Probability ,Multivariate analysis of variance ,Epidemiology ,Statistics ,Econometrics ,Repeated measures design ,Mathematics - Published
- 1989
- Full Text
- View/download PDF
4. Two‐stage meta‐analysis of survival data from individual participants using percentile ratios
- Author
-
Fotios Siannis, Jessica K. Barrett, Jayne F. Tierney, Julian P. T. Higgins, and Vern Farewell
- Subjects
Statistics and Probability, Percentile, Epidemiology, Kaplan-Meier Estimate, individual patient data, Meta-Analysis as Topic, Statistics, Odds Ratio, Humans, Computer Simulation, Survival analysis, Probability, Proportional Hazards Models, Mathematics, Postoperative Care, Analysis of Variance, Special Issue Papers, Hazard ratio, Glioma, Medical statistics, meta-analysis, survival data, Logistic Models, Treatment Outcome - Abstract
Methods for individual participant data meta-analysis of survival outcomes commonly focus on the hazard ratio as a measure of treatment effect. Recently, Siannis et al. (2010, Statistics in Medicine 29:3030–3045) proposed the use of percentile ratios as an alternative to hazard ratios. We describe a novel two-stage method for the meta-analysis of percentile ratios that avoids distributional assumptions at the study level. Copyright © 2012 John Wiley & Sons, Ltd.
- Published
- 2012
- Full Text
- View/download PDF
5. Estimate of standard deviation for a log-transformed variable using arithmetic means and standard deviations.
- Author
-
Quan H and Zhang J
- Subjects
- Confidence Intervals, Mathematics, Models, Statistical, Statistics as Topic methods
- Abstract
Analyses of study variables are frequently based on log transformations. To calculate the power for detecting the between-treatment difference in the log scale, we need an estimate of the standard deviation of the log-transformed variable. However, in many situations a literature search only provides the arithmetic means and the corresponding standard deviations. Without individual log-transformed data to directly calculate the sample standard deviation, we need alternative methods to estimate it. This paper presents methods for estimating and constructing confidence intervals for the standard deviation of a log-transformed variable given the mean and standard deviation of the untransformed variable. It also presents methods for estimating the standard deviation of change from baseline in the log scale given the means and standard deviations of the untransformed baseline value, on-treatment value and change from baseline. Simulations and examples are provided to assess the performance of these estimates. (Copyright 2003 John Wiley & Sons, Ltd.)
- Published
- 2003
- Full Text
- View/download PDF
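A note on the kind of conversion the Quan and Zhang entry above describes: under the additional assumption that the untransformed variable is lognormally distributed (an assumption made here purely for illustration; the paper derives estimators and confidence intervals more generally), the log-scale standard deviation follows directly from the arithmetic mean m and standard deviation s:

```latex
% Assuming X is lognormal with arithmetic mean m and standard deviation s
% (illustrative special case, not the paper's general estimator):
\sigma_{\log} = \sqrt{\ln\!\left(1 + \frac{s^2}{m^2}\right)}, \qquad
\mu_{\log} = \ln(m) - \tfrac{1}{2}\,\sigma_{\log}^2 .
```

For example, m = 10 and s = 5 give sigma_log = sqrt(ln(1.25)) ≈ 0.472, which is the quantity needed for a power calculation on the log scale.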
6. Bruno de Finetti: the mathematician, the statistician, the economist, the forerunner.
- Author
-
Rossi C
- Subjects
- Economics history, History, 20th Century, Italy, Mathematics history, Statistics as Topic history
- Abstract
Bruno de Finetti is possibly the best known Italian applied mathematician of the 20th century, but was he really just a mathematician? Looking at his papers it is always possible to find original and pioneering contributions to the various fields he was interested in, where he always put his mathematical forma mentis and skills at the service of the applications, often extending standard theories and models in order to achieve more general results. Many contributions are also devoted to educational issues, in mathematics in general and in probability and statistics in particular. He really thought that mathematics and, in particular, those topics related to uncertainty, should enter everyday life as a useful support to everyone's decision making. He always imagined and lived mathematics as a basic tool both for better understanding and describing complex phenomena and for helping decision makers take coherent and feasible actions. His many important contributions to the theory of probability and to mathematical statistics are well known all over the world; in the following, therefore, minor but still pioneering aspects of his work, related both to the theory and applications of mathematical tools and to his work in the education and training of teachers, are presented. (Copyright 2001 John Wiley & Sons, Ltd.)
- Published
- 2001
- Full Text
- View/download PDF
7. Degrees of necessity and of sufficiency: Further results and extensions, with an application to covid‐19 mortality in Austria
- Author
-
Andreas Gleiss, Robin Henderson, and Michael Schemper
- Subjects
Statistics and Probability, sufficient condition, Epidemiology, Logistic regression, explained variation, Econometrics, Humans, Necessity and sufficiency, ordinal outcomes, Mathematics, SARS-CoV-2, necessary condition, COVID-19, Conditional probability, Regression, Austria, Research Article - Abstract
The purpose of this paper is to extend to ordinal and nominal outcomes the measures of degree of necessity and of sufficiency defined by the authors for dichotomous and survival outcomes in a previous paper. A cause, represented by certain values of prognostic factors, is considered necessary for an event if, without the cause, the event cannot develop. It is considered sufficient for an event if the event is unavoidable in the presence of the cause. The degrees of necessity and sufficiency, ranging from zero to one, are simple, intuitive functions of unconditional and conditional probabilities of an event such as disease or death. These probabilities often will be derived from logistic regression models; the measures, however, do not require any particular model. In addition, we study in detail the relationship between the proposed measures and the related explained variation summary for dichotomous outcomes, which are the common root for the developments for ordinal, nominal, and survival outcomes. We introduce and analyze the Austrian covid-19 data, with the aim of quantifying effects of age and other potentially prognostic factors on covid-19 mortality. This is achieved by standard regression methods but also in terms of the newly proposed measures. It is shown how they complement the toolbox of prognostic factor studies, in particular when comparing the importance of prognostic factors of different types. While the full model's degree of necessity is extremely high (0.933), its low degree of sufficiency (0.179) is responsible for the low proportion of explained variation (0.193).
- Published
- 2021
- Full Text
- View/download PDF
8. Analysis of time‐to‐event for observational studies: Guidance to the use of intensity models
- Author
-
Jeremy M. G. Taylor, Per Kragh Andersen, Pierre Joly, Maja Pohar Perme, Michal Abrahamowicz, Torben Martinussen, Hans C. van Houwelingen, Richard J. Cook, and Terry M. Therneau
- Subjects
Statistics and Probability, Epidemiology, STRATOS initiative, survival analysis, censoring, Bias, Goodness of fit, Covariate, Cox proportional hazards regression, Cox regression model, Humans, Proportional Hazards Models, immortal time bias, prediction, multistate model, Censoring (clinical trials), time-dependent covariates, Observational study, Mathematics, Software, hazard function - Abstract
This paper provides guidance for researchers with some mathematical background on the conduct of time-to-event analysis in observational studies based on intensity (hazard) models. Discussions of basic concepts like time axis, event definition and censoring are given. Hazard models are introduced, with special emphasis on the Cox proportional hazards regression model. We provide check lists that may be useful both when fitting the model and assessing its goodness of fit and when interpreting the results. Special attention is paid to how to avoid problems with immortal time bias by introducing time-dependent covariates. We discuss prediction based on hazard models and difficulties when attempting to draw proper causal conclusions from such models. Finally, we present a series of examples where the methods and check lists are exemplified. Computational details and implementation using the freely available R software are documented in Supplementary Material. The paper was prepared as part of the STRATOS initiative.
- Published
- 2020
- Full Text
- View/download PDF
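The immortal-time-bias point in the entry above is easiest to see in code. The sketch below is illustrative only (the column names and toy data are assumptions, not taken from the paper's supplementary R material): it splits each subject's follow-up at the time treatment starts, so exposure enters as a time-dependent covariate in counting-process (start, stop] format rather than being carried back to time zero.

```python
import pandas as pd

# Toy follow-up data (assumed for illustration): one row per subject.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "treat_time": [5.0, None, 2.0],   # time treatment started (None = never treated)
    "event_time": [12.0, 8.0, 3.5],   # end of follow-up
    "event": [1, 0, 1],               # 1 = event observed, 0 = censored
})

rows = []
for r in df.itertuples(index=False):
    if pd.isna(r.treat_time) or r.treat_time >= r.event_time:
        # Never treated during follow-up: one untreated interval.
        rows.append((r.id, 0.0, r.event_time, 0, r.event))
    else:
        # Untreated from 0 to treatment start (event indicator 0 in this interval) ...
        rows.append((r.id, 0.0, r.treat_time, 0, 0))
        # ... then treated from treatment start until event/censoring.
        rows.append((r.id, r.treat_time, r.event_time, 1, r.event))

long_df = pd.DataFrame(rows, columns=["id", "start", "stop", "treated", "event"])
print(long_df)
# These (start, stop] rows can be passed to any Cox routine that accepts
# counting-process input (e.g., a time-varying Cox fitter), so person-time
# before treatment is never misclassified as "treated" (immortal) time.
```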
9. Meta‐analysis of gene‐environment interaction exploiting gene‐environment independence across multiple case‐control studies
- Author
-
Heather M. Stringham, Michael Boehnke, Jason P. Estes, Shi Li, John D. Rice, and Bhramar Mukherjee
- Subjects
Statistics and Probability, Biometry, Epidemiology, Alpha-Ketoglutarate-Dependent Dioxygenase FTO, Polymorphism, Single Nucleotide, Article, Body Mass Index, Bayes Theorem, Bias, Meta-Analysis as Topic, Statistics, Econometrics, Humans, Computer Simulation, Gene-Environment Interaction, Retrospective Studies, Mathematics, Models, Statistical, Models, Genetic, Age Factors, Estimator, Covariance, Logistic Models, Diabetes Mellitus, Type 2, Case-Control Studies, Meta-analysis - Abstract
Multiple papers have studied the use of gene-environment (G-E) independence to enhance power for testing gene-environment interaction (GEI) in case-control studies. However, studies that evaluate the role of G-E independence in a meta-analysis framework are limited. In this paper, we extend the single-study empirical-Bayes (EB) type shrinkage estimators proposed by Mukherjee and Chatterjee (2008) to a meta-analysis setting that adjusts for uncertainty regarding the assumption of G-E independence across studies. We use the retrospective likelihood framework to derive an adaptive combination of estimators obtained under the constrained model (assuming G-E independence) and unconstrained model (without assumptions of G-E independence) with weights determined by measures of G-E association derived from multiple studies. Our simulation studies indicate that this newly proposed estimator has improved average performance across different simulation scenarios than the standard alternative of using inverse variance (covariance) weighted estimators that combines study-specific constrained, unconstrained or EB estimators. The results are illustrated by meta-analyzing six different studies of type 2 diabetes (T2D) investigating interactions between genetic markers on the obesity related FTO gene and environmental factors Body Mass Index (BMI) and age.
- Published
- 2017
- Full Text
- View/download PDF
10. Approaches to expanding the two-arm biased coin randomization to unequal allocation while preserving the unconditional allocation ratio
- Author
-
Victoria Plamadeala Johnson and Olga M. Kuznetsova
- Subjects
Statistics and Probability, Randomization, Epidemiology, Statistics, Probability mass function, Minimization, Block size, Mathematics - Abstract
The paper discusses three methods for expanding the biased coin randomization (BCR) to unequal allocation while preserving the unconditional allocation ratio at every step. The first method, originally proposed in the contexts of BCR and minimization, is based on mapping from an equal-allocation multi-arm BCR. Despite the improvement proposed in this paper to ensure tighter adherence to the targeted unequal allocation, this method still distributes the probability mass at least as widely as the permuted block randomization (PBR). This works for smaller block sizes, but for larger block sizes a tighter control of the imbalance in the treatment assignments is desired. The second method, which has two versions, allows the distribution of the imbalance to be tightened compared with that achieved with the PBR. However, the distribution of the imbalance remains considerably wider than that of the brick tunnel randomization - the unequal allocation procedure with the tightest possible imbalance distribution among all allocation ratio preserving procedures with the same allocation ratio. Finally, the third method, the BCR with a preset proportion of maximal forcing, mimics the properties of the equal-allocation BCR. With maximum forcing, it approaches the brick tunnel randomization, similar to how 1:1 BCR approaches 1:1 PBR with the permuted block size of 2 (the equal allocation procedure with the lowest possible imbalance) when the bias approaches 1. With minimum forcing, the BCR with a preset proportion of maximal forcing approaches complete randomization (similar to 1:1 BCR). Copyright © 2017 John Wiley & Sons, Ltd.
- Published
- 2017
- Full Text
- View/download PDF
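The defining property in the entry above, preservation of the unconditional allocation ratio at every allocation, can be checked by simulation. The sketch below is not the authors' procedure: it contrasts 2:1 permuted block randomization (which does preserve the marginal ratio at every step) with a naive, assumed generalization of Efron's biased coin, and estimates the marginal probability of treatment at each allocation step.

```python
import numpy as np

rng = np.random.default_rng(0)
TARGET = 2 / 3          # target unconditional P(treatment) for 2:1 allocation
N, REPS = 12, 50_000    # allocations per sequence, simulated sequences

def permuted_block(n, rng, block=(1, 1, 0)):
    # 2:1 permuted block randomization: preserves the marginal ratio at every step.
    seq = []
    while len(seq) < n:
        seq.extend(rng.permutation(block))
    return np.array(seq[:n])

def naive_biased_coin(n, rng, p=0.8):
    # Naive generalization of Efron's coin (assumed rule, shown only for contrast):
    # favour treatment with probability p whenever its observed proportion is
    # below 2/3, and with probability 1 - p otherwise.
    seq = np.empty(n, dtype=int)
    n_t = 0
    for i in range(n):
        prob = TARGET if i == 0 else (p if n_t / i < TARGET else 1 - p)
        seq[i] = rng.random() < prob
        n_t += seq[i]
    return seq

for rule in (permuted_block, naive_biased_coin):
    marginal = np.mean([rule(N, rng) for _ in range(REPS)], axis=0)
    print(rule.__name__, np.round(marginal, 3))
# The permuted block marginals sit near 0.667 at every step; the naive coin's
# marginals drift away from 2/3, which is exactly the defect that the paper's
# allocation-ratio-preserving constructions are designed to avoid.
```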
11. Models for zero-inflated, correlated count data with extra heterogeneity: when is it too complex?
- Author
-
Christel Faes, Sammy Chebon, Frank Cools, and Helena Geys
- Subjects
Statistics and Probability, Operations research, Epidemiology, Model selection, Random effects model, Poisson distribution, Overdispersion, Covariate, Statistics, Zero-inflated model, Poisson regression, Mathematics, Count data - Abstract
Statistical analysis of count data typically starts with a Poisson regression. However, in many real-life applications, it is observed that the variation in the counts is larger than the mean, and one needs to deal with the problem of overdispersion in the counts. Several factors may contribute to overdispersion: (1) unobserved heterogeneity due to missing covariates, (2) correlation between observations (such as in longitudinal studies), and (3) the occurrence of many zeros (more than expected from the Poisson distribution). In this paper, we discuss a model that allows one to explicitly take each of these factors into consideration. The aim of this paper is twofold: (1) to investigate whether we can identify the cause of overdispersion via model selection, and (2) to investigate the impact of a misspecification of the model on the power of a covariate. The paper is motivated by a study of the occurrence of drug-induced arrhythmia in beagle dogs based on electrocardiogram recordings, with the objective of evaluating the effect of potential drugs on heartbeat irregularities. Copyright © 2016 John Wiley & Sons, Ltd.
- Published
- 2016
- Full Text
- View/download PDF
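As a quick companion to the overdispersion discussion above: a routine first check is to compare the Pearson statistic of a plain Poisson regression with its residual degrees of freedom, since values well above 1 signal overdispersion (from unobserved heterogeneity, correlation, or excess zeros, which the paper's models then try to disentangle). A minimal sketch, with simulated data standing in for the beagle ECG counts (the data-generating model below is an assumption for illustration):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
dose = rng.choice([0.0, 1.0, 2.0], size=n)

# Simulated zero-inflated, heterogeneous counts: lognormal frailty plus 30% structural zeros.
mu = np.exp(0.5 + 0.3 * dose + rng.normal(0, 0.8, n))
y = rng.poisson(mu) * (rng.random(n) > 0.3)

X = sm.add_constant(dose)
poisson_fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()

dispersion = poisson_fit.pearson_chi2 / poisson_fit.df_resid
print(f"Pearson dispersion: {dispersion:.2f}")  # values far above 1 indicate overdispersion
```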
12. Time-dependent summary receiver operating characteristics for meta-analysis of prognostic studies
- Author
-
Satoshi Hattori and Xiao-Hua Zhou
- Subjects
Statistics and Probability, Epidemiology, Bayesian probability, Inference, Multivariate normal distribution, Bivariate analysis, Binomial distribution, Bayes' theorem, Meta-analysis, Statistics, Econometrics, Mathematics - Abstract
Prognostic studies are widely conducted to examine whether biomarkers are associated with patients' prognoses and play important roles in medical decisions. Because findings from one prognostic study may be very limited, meta-analyses may be useful to obtain sound evidence. However, prognostic studies are often analyzed by relying on a study-specific cut-off value, which can lead to difficulty in applying the standard meta-analysis techniques. In this paper, we propose two methods to estimate a time-dependent version of the summary receiver operating characteristics curve for meta-analyses of prognostic studies with a right-censored time-to-event outcome. We introduce a bivariate normal model for the pair of time-dependent sensitivity and specificity and propose a method to form inferences based on summary statistics reported in published papers. This method provides a valid inference asymptotically. In addition, we consider a bivariate binomial model. To draw inferences from this bivariate binomial model, we introduce a multiple imputation method. The multiple imputation is found to be approximately proper multiple imputation, and thus the standard Rubin's variance formula is justified from a Bayesian viewpoint. Our simulation study and application to a real dataset revealed that both methods work well with a moderate or large number of studies and that the bivariate binomial model coupled with the multiple imputation outperforms the bivariate normal model with a small number of studies. Copyright © 2016 John Wiley & Sons, Ltd.
- Published
- 2016
- Full Text
- View/download PDF
13. Bayesian methods of confidence interval construction for the population attributable risk from cross-sectional studies
- Author
-
Geoffrey Jones, Cord Heuer, Sarah Pirikahu, and Martin L. Hazelton
- Subjects
Statistics and Probability, Epidemiology, Bayesian probability, Confidence interval, Bayes' theorem, Frequentist inference, Attributable risk, Statistics, Econometrics, Jackknife resampling, Mathematics - Abstract
Population attributable risk measures the public health impact of the removal of a risk factor. To apply this concept to epidemiological data, the calculation of a confidence interval to quantify the uncertainty in the estimate is desirable. However, perhaps because of the confusion surrounding the attributable risk measures, there is no standard confidence interval or variance formula given in the literature. In this paper, we implement a fully Bayesian approach to confidence interval construction of the population attributable risk for cross-sectional studies. We show that, in comparison with a number of standard frequentist methods for constructing confidence intervals (i.e. delta, jackknife and bootstrap methods), the Bayesian approach is superior in terms of percent coverage in all except a few cases. This paper also explores the effect of the chosen prior on the coverage and provides alternatives for particular situations. Copyright © 2016 John Wiley & Sons, Ltd.
- Published
- 2016
- Full Text
- View/download PDF
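To make the Bayesian construction above concrete, the sketch below (a minimal illustration, not the authors' implementation) puts a Dirichlet prior on the four cells of a cross-sectional exposure-by-disease table and reads off a credible interval for the population attributable risk PAR = [P(D) − P(D | unexposed)] / P(D). The counts and the Dirichlet(0.5) prior are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

# Cross-sectional 2x2 counts (assumed example data):
#                 disease, no disease
n_exp_d, n_exp_nd = 60, 240      # exposed
n_une_d, n_une_nd = 40, 660      # unexposed
counts = np.array([n_exp_d, n_exp_nd, n_une_d, n_une_nd])

# Posterior for the cell probabilities under a Jeffreys-type Dirichlet(0.5) prior.
draws = rng.dirichlet(counts + 0.5, size=20_000)
p_exp_d, p_exp_nd, p_une_d, p_une_nd = draws.T

p_d = p_exp_d + p_une_d                          # P(disease)
p_d_unexposed = p_une_d / (p_une_d + p_une_nd)   # P(disease | unexposed)
par = (p_d - p_d_unexposed) / p_d                # population attributable risk

lo, hi = np.percentile(par, [2.5, 97.5])
print(f"posterior median PAR = {np.median(par):.3f}, 95% CrI = ({lo:.3f}, {hi:.3f})")
```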
14. Meta-analysis of ratios of sample variances
- Author
-
Robert G. Staudte and Luke A. Prendergast
- Subjects
Statistics and Probability, Normalization (statistics), Epidemiology, Omnibus test, Sample (statistics), Data set, F-test, Sample size determination, Meta-analysis, Statistics, Mathematics - Abstract
When conducting a meta-analysis of standardized mean differences (SMDs), it is common to use Cohen's d, or its variants, that require equal variances in the two arms of each study. While interpretation of these SMDs is simple, this alone should not be used as a justification for assuming equal variances. Until now, researchers have either used an F-test for each individual study or perhaps even conveniently ignored such tools altogether. In this paper, we propose a meta-analysis of ratios of sample variances to assess whether the equal-variances assumption is justified prior to a meta-analysis of SMDs. Quantile-quantile plots, an omnibus test for equal variances or an overall meta-estimate of the ratio of variances can all be used to formally justify the use of less common methods when evidence of unequal variances is found. The methods in this paper are simple to implement, and the validity of the approaches is reinforced by simulation studies and an application to a real data set.
- Published
- 2016
- Full Text
- View/download PDF
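One simple way to operationalize the idea above is an inverse-variance meta-analysis of log variance ratios, using the standard large-sample approximation Var[ln s²] ≈ 2/(n − 1) for normal data. The sketch below illustrates that generic approach (the group sizes and SDs are invented); it is not a reimplementation of the authors' omnibus test or quantile-quantile plots.

```python
import numpy as np
from scipy import stats

# Per-study inputs (assumed): sample sizes and SDs in treatment and control arms.
n1 = np.array([25, 40, 32, 60]); s1 = np.array([1.30, 1.10, 1.45, 1.20])
n2 = np.array([25, 38, 30, 58]); s2 = np.array([1.00, 1.05, 1.00, 1.15])

log_ratio = np.log(s1**2 / s2**2)        # study-level log variance ratios
var_lr = 2 / (n1 - 1) + 2 / (n2 - 1)     # approximate variance of each log ratio

w = 1 / var_lr                           # fixed-effect (inverse-variance) weights
est = np.sum(w * log_ratio) / np.sum(w)
se = np.sqrt(1 / np.sum(w))
z = est / se

print(f"pooled variance ratio = {np.exp(est):.2f} "
      f"(95% CI {np.exp(est - 1.96 * se):.2f} to {np.exp(est + 1.96 * se):.2f}), "
      f"p = {2 * stats.norm.sf(abs(z)):.3f}")
# A confidence interval excluding 1 is evidence against the equal-variance
# assumption that underlies Cohen's d.
```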
15. Centile estimation for a proportion response variable
- Author
-
Robert A. Rigby, Mikis D. Stasinopoulos, Marco Enea, and Abu Hossain
- Subjects
Statistics and Probability, Estimation, Epidemiology, Logit, Skew, Statistics, Probability distribution, Tobit model, Mathematics - Abstract
This paper introduces two general models for computing centiles when the response variable Y can take values between 0 and 1, inclusive of 0 or 1. The models developed are more flexible alternatives to the beta inflated distribution. The first proposed model employs a flexible four parameter logit skew Student t (logitSST) distribution to model the response variable Y on the unit interval (0, 1), excluding 0 and 1. This model is then extended to the inflated logitSST distribution for Y on the unit interval, including 1. The second model developed in this paper is a generalised Tobit model for Y on the unit interval, including 1. Applying these two models to (1-Y) rather than Y enables modelling of Y on the unit interval including 0 rather than 1. An application of the new models to real data shows that they can provide superior fits.
- Published
- 2015
- Full Text
- View/download PDF
16. A note on bias of measures of explained variation for survival data
- Author
-
Nataša Kejžar, Janez Stare, and Delphine Maucort-Boulch
- Subjects
Statistics and Probability, Epidemiology, Explained variation, Repeated events, Survival data, Censoring (clinical trials), Covariate, Statistics, Econometrics, Mathematics - Abstract
Papers evaluating measures of explained variation, or similar indices, almost invariably use independence from censoring as the most important criterion. And they always end up suggesting that some measures meet this criterion, and some do not, most of the time leading to a conclusion that the first is better than the second. As a consequence, users are offered measures that cannot be used with time-dependent covariates and effects, not to mention extensions to repeated events or multi-state models. We explain in this paper that the aforementioned criterion is of no use in studying such measures, because it simply favors those that make an implicit assumption of a model being valid everywhere. Measures not making such an assumption are disqualified, even though they are better in every other respect. We show that if these, allegedly inferior, measures are allowed to make the same assumption, they are easily corrected to satisfy the 'independent-from-censoring' criterion. Even better, it is enough to make such an assumption only for the times greater than the last observed failure time τ, which, in contrast with the 'preferred' measures, makes it possible to use all the modeling flexibility up to τ and assume whatever one wants after τ. As a consequence, we claim that some of the measures being preferred as better in the existing reviews are in fact inferior.
- Published
- 2015
- Full Text
- View/download PDF
17. No solution yet for combining two independent studies in the presence of heterogeneity
- Author
-
Armin Koch, Theodor Framke, Andrea Gonnermann, and Anika Großhennig
- Subjects
Statistics and Probability, Epidemiology, Biostatistics, Meta-Analysis as Topic, Statistics, Econometrics, Test statistic, Humans, Computer Simulation, Drug Approval, Statistical hypothesis testing, Mathematics, Models, Statistical, Fixed effects model, Random effects model, Standard error, Sample size determination, Commentary, Type I and type II errors - Abstract
Meta-analysis plays an important role in the analysis and interpretation of clinical trials in medicine and of trials in the social sciences, and is of importance in other fields (e.g., particle physics [1]) as well. In 2001, Hartung and Knapp [2],[3] introduced a new approach to test for a nonzero treatment effect in a meta-analysis of k studies. Hartung and Knapp [2],[3] suggest using the random-effects estimate according to DerSimonian and Laird [4] and propose a variance estimator q such that the test statistic for the treatment effect is t-distributed with k − 1 degrees of freedom. In their paper on dichotomous endpoints, a simulation study with 6 and 12 studies illustrates, for risk differences, log relative risks and log odds ratios, the excellent properties of the approach regarding control of the type I error and the achieved power [2]. They investigate different sample sizes in each study and different amounts of heterogeneity between studies, and compare their new approach (the Hartung and Knapp approach, HK) with the fixed effects approach (FE) and the classical random effects approach of DerSimonian and Laird (DL). It can be clearly seen that, with increasing heterogeneity, neither the FE nor the DL controls the type I error rate, while the HK keeps the type I error rate in nearly every situation and on every scale. Advantages and disadvantages of the two standard approaches and their respective test statistics have been extensively discussed (e.g., [5–7]). While it is well known that the FE is too liberal in the presence of heterogeneity, the DL is often thought to be rather conservative because heterogeneity is incorporated into the standard error of the treatment-effect estimate, which should lead to larger confidence intervals and smaller test statistics ([8], chapter 9.4.4.3). This was disproved, among others, by Ziegler and Victor [7], who observed severe inflation of the type I error for the DerSimonian and Laird test statistic in situations with increasing heterogeneity. Notably, the asymptotic properties of this approach are valid only if both the number of studies and the number of patients per study are large enough ([8] chapter 9.54, [9,10]). Although power issues of meta-analysis tests have received some interest, comparisons between the approaches and the situation with two studies were not the main focus [11,12]. Borenstein et al. ([10], pp. 363/364) recommend the random effects approach for meta-analysis in general and do not recommend meta-analyses of small numbers of studies. However, meta-analyses of few, and even of only two, trials are of importance. In drug licensing, in many instances two successful phase III clinical trials have to be submitted as pivotal evidence [13], and summarizing the findings of these studies is required according to the International Conference on Harmonisation guidelines E9 and M4E [14,15]. It is stated that 'An overall summary and synthesis of the evidence on safety and efficacy from all the reported clinical trials is required for a marketing application [...]. This may be accompanied, when appropriate, by a statistical combination of results' ([14], p. 31). For the summary, 'The use of meta-analytic techniques to combine these estimates is often a useful addition, because it allows a more precise overall estimate of the size of the treatment effects to be generated, and provides a complete and concise summary of the results of the trials' ([14], p. 32).
While in standard drug development this summary will usually include more than two studies, in rare diseases barely ever more than two studies of the same intervention are available because of the limited number of patients. Likewise, decision making in the context of health technology assessment is based on systematic reviews and meta-analyses. Often in practice, only two studies are considered homogeneous enough on clinical grounds to be included in a meta-analysis, which then forms the basis for decision making about reimbursement [16]. Despite the fact that meta-analysis is non-experimental, observational (secondary) research [17] and p-values should be interpreted with caution, meta-analyses of randomized clinical trials are termed highest-level information in evidence-based medicine and are the recommended basis for decision making [18]. As statistical significance plays an important role in the assessment of a meta-analysis, it is mandatory to understand the statistical properties of the relevant methodology also in a situation where only two clinical trials are included. We found Cochrane reviews including meta-analyses with two studies only, which are considered for evidence-based decision making even in the presence of a large amount of heterogeneity (I² ≈ 75%) [19–21].
We repeated the simulation study for dichotomous endpoints of Hartung and Knapp [2] with programs written in R 3.1.0 [22] to compare the statistical properties of the FE, the DL, and the HK for testing the overall treatment effect θ (H0: θ = 0) in a situation with two to six clinical trials. We considered scenarios under the null and alternative hypotheses for the treatment effect, with and without underlying heterogeneity. We present the findings for the odds ratio with control-group success probability pC = 0.2 and varied the treatment-group success probability pT to investigate the type I error and power characteristics. The total sample size per meta-analysis was kept constant across scenarios (n = 480), with n/k patients per study, to clearly demonstrate the effect of the number of included studies on the power and type I error of the various approaches. Likewise, we attempted to avoid problems with zero cell counts or extremely low event rates, which may also affect type I error and power. I² was used to describe heterogeneity because thresholds for this measure have been published (low: I² = 25%; moderate: I² = 50%; high: I² = 75%) [23]. We termed I² ≤ 15% negligible; this refers to simulations assuming no heterogeneity (i.e., the fixed effects model).
Table I summarizes the results of our simulation study. The well-known anticonservative behavior of the FE and the DL in the presence of even low heterogeneity is visible for small numbers of studies in the meta-analysis. Particularly for the FE, the increase in the type I error is pronounced. With more than four studies, the HK perfectly controls the type I error even in situations with substantial heterogeneity. There is almost no impact on the power of the test in situations with no or low heterogeneity, and overall it seems as if the only price to be paid for increased heterogeneity is reduced power.
[Table I: Overview of the empirical type I error and power.]
This is in strong contrast to the situation with only two studies. Again, the HK perfectly controls the prespecified type I error. However, even in a homogeneous situation, the power of the meta-analysis test was lower than 15% in situations where the power of the FE and the DL approaches 70% and 60%, respectively. In the presence of even low heterogeneity, there is not much chance with the HK of arriving at a positive conclusion even with substantial treatment effects. Figure 1 summarizes the main finding of our simulation study with k = 2 and 6 studies impressively.
[Figure 1(a–d): Influence of heterogeneity in meta-analyses with two and six studies on empirical power. FE, fixed effects approach; DL, DerSimonian and Laird approach; HK, Hartung and Knapp approach. Left column: simulation results with two studies.]
In the homogeneous situation with two studies, the DL, and even better the FE, can be used to efficiently base conclusions on a meta-analysis. In contrast, already with mild to moderate heterogeneity, both standard tests severely violate the prespecified type I error, and there is a high risk of false-positive conclusions with the classical approaches. This has major implications for decision making in drug licensing as well. We have noted previously that a meta-analysis can be confirmatory if a drug development program was designed to include a preplanned meta-analysis of the two pivotal trials [24]. As an example, thrombosis prophylaxis was discussed in the paper by Koch and Rohmel [24], where venous thromboembolism is accepted as the primary endpoint in the pivotal trials. If both pivotal trials are successful, they can be combined to demonstrate a positive impact on, for example, mortality. This can be preplanned as a hierarchical testing procedure: first, both pivotal trials are assessed individually before confirmatory conclusions are based on the meta-analysis. As explained, neither the FE, nor the DL, nor the HK can be recommended for a priori planning in this sensitive area, unless any indication of heterogeneity is taken as a trigger not to combine the studies in a meta-analysis at all. It is our belief that not enough emphasis was given to this finding in the original paper, and the important role of heterogeneity is not acknowledged enough in the discussion of findings from meta-analyses in general.
- Published
- 2015
- Full Text
- View/download PDF
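For readers who want to reproduce the kind of comparison described in the commentary above, the sketch below computes the fixed effects (FE), DerSimonian-Laird (DL), and Hartung-Knapp (HK) tests for a generic set of study estimates and standard errors. It is a textbook implementation of the three procedures, not the authors' simulation code, and the two-study input values are invented for illustration.

```python
import numpy as np
from scipy import stats

def fe_dl_hk(theta, se):
    """Fixed effects, DerSimonian-Laird, and Hartung-Knapp tests of H0: theta = 0."""
    theta, v = np.asarray(theta, float), np.asarray(se, float) ** 2
    k, w = len(theta), 1 / np.asarray(se, float) ** 2

    # Fixed effects (FE): inverse-variance pooling, normal reference distribution.
    est_fe = np.sum(w * theta) / np.sum(w)
    p_fe = 2 * stats.norm.sf(abs(est_fe / np.sqrt(1 / np.sum(w))))

    # DerSimonian-Laird (DL): moment estimator of tau^2, normal reference distribution.
    q = np.sum(w * (theta - est_fe) ** 2)
    tau2 = max(0.0, (q - (k - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
    w_star = 1 / (v + tau2)
    est_re = np.sum(w_star * theta) / np.sum(w_star)
    p_dl = 2 * stats.norm.sf(abs(est_re / np.sqrt(1 / np.sum(w_star))))

    # Hartung-Knapp (HK): same point estimate, adjusted variance, t with k-1 df.
    q_hk = np.sum(w_star * (theta - est_re) ** 2) / ((k - 1) * np.sum(w_star))
    p_hk = 2 * stats.t.sf(abs(est_re / np.sqrt(q_hk)), df=k - 1)

    return {"FE": (est_fe, p_fe), "DL": (est_re, p_dl), "HK": (est_re, p_hk)}

# Two hypothetical studies with mildly heterogeneous log odds ratios:
print(fe_dl_hk(theta=[0.45, 0.15], se=[0.18, 0.20]))
# With k = 2 the HK p-value is typically far larger than the FE/DL p-values,
# reflecting the low power (and protected type I error) discussed above.
```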
18. A Bayesian analysis of quantal bioassay experiments incorporating historical controls via Bayes factors
- Author
-
Andrew Womack, Hongxiao Zhu, Xiaowei Wu, and Luis G. León Novelo
- Subjects
Statistics and Probability, Epidemiology, Bayesian probability, Bayes factor, Bayesian inference, Statistical power, Probit model, Statistics, Econometrics, Data analysis, Dose effect, Bioassay, Mathematics - Abstract
This paper addresses model-based Bayesian inference in the analysis of data arising from bioassay experiments. In such experiments, increasing doses of a chemical substance are given to treatment groups (usually rats or mice) for a fixed period of time (usually 2 years). The goal of such an experiment is to determine whether an increased dosage of the chemical is associated with an increased probability of an adverse effect (usually presence of adenoma or carcinoma). The data consist of dosage, survival time, and the occurrence of the adverse event for each unit in the study. To determine whether such a relationship exists, this paper proposes using Bayes factors to compare two probit models, the model that assumes increasing dose effects and the model that assumes no dose effect. These models account for the survival time of each unit through a Poly-k type correction. In order to increase statistical power, the proposed approach allows the incorporation of information from control groups from previous studies. The proposed method is able to handle data with very few occurrences of the adverse event. The proposed method is compared with a variation of the Peddada test via simulation and is shown to have higher power. We demonstrate the method by applying it to two bioassay experiment datasets previously analyzed by other authors. Copyright © 2017 John Wiley & Sons, Ltd.
- Published
- 2017
- Full Text
- View/download PDF
19. Partial F-tests with multiply imputed data in the linear regression framework via coefficient of determination
- Author
-
Ofer Harel and Ashok Chaurasia
- Subjects
Suicide Prevention, Statistics and Probability, Analysis of Variance, Biometry, Coefficient of determination, Epidemiology, Research Design, Data Interpretation, Statistical, Simulated data, Linear regression, Linear Models, Humans, Regression Analysis, Applied mathematics, Computer Simulation, Statistical hypothesis testing, Mathematics - Abstract
Tests for regression coefficients such as global, local, and partial F-tests are common in applied research. In the framework of multiple imputation, there are several papers addressing tests for regression coefficients. However, for simultaneous hypothesis testing, the existing methods are computationally intensive because they involve calculation with vectors and (inversion of) matrices. In this paper, we propose a simple method based on the scalar entity, coefficient of determination, to perform (global, local, and partial) F-tests with multiply imputed data. The proposed method is evaluated using simulated data and applied to suicide prevention data.
- Published
- 2014
- Full Text
- View/download PDF
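The complete-data identity that the approach above builds on may help orient the reader: for a full model with k_F predictors and a nested reduced model with k_R predictors, the partial F-statistic can be written purely in terms of the two coefficients of determination. (The paper's contribution is how to pool such scalar quantities across multiple imputations, which is not reproduced here.)

```latex
% Complete-data partial F-test expressed via R^2 (standard regression identity):
F \;=\; \frac{\bigl(R_F^2 - R_R^2\bigr)/(k_F - k_R)}
             {\bigl(1 - R_F^2\bigr)/\bigl(n - k_F - 1\bigr)}
\;\sim\; F_{\,k_F - k_R,\; n - k_F - 1}
\quad \text{under } H_0 .
```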
20. Violations of the independent increment assumption when using generalized estimating equation in longitudinal group sequential trials
- Author
-
Scott S. Emerson and Abigail B. Shoben
- Subjects
Statistics and Probability, Heteroscedasticity, Epidemiology, Accrual, Statistics as Topic, Estimator, Correlation, Efficient estimator, Clinical Trials, Phase III as Topic, Epidemiologic Research Design, Statistics, Econometrics, Humans, Longitudinal Studies, Generalized estimating equation, Mathematics, Type I and type II errors - Abstract
In phase 3 clinical trials, ethical and financial concerns motivate sequential analyses in which the data are analyzed prior to completion of the entire planned study. Existing group sequential software accounts for the effects of these interim analyses on the sampling density by assuming that the contribution of subsequent increments is independent of the contribution from previous data. This independent increment assumption is satisfied in many common circumstances, including when using the efficient estimator. However, certain circumstances may dictate using an inefficient estimator, and the independent increment assumption may then be violated. Consequences of assuming independent increments in a setting where the assumption does not hold have not been previously explored. One important setting in which independent increments may not hold is the setting of longitudinal clinical trials. This paper considers dependent increments that arise because of heteroscedastic and correlated data in the context of longitudinal clinical trials that use a generalized estimating equation (GEE) approach. Both heteroscedasticity over time and correlation of observations within subjects may lead to departures from the independent increment assumption when using GEE. We characterize situations leading to greater departures in this paper. Despite violations of the independent increment assumption, simulation results suggest that operating characteristics of sequential designs are largely maintained for typically observed patterns of accrual, correlation, and heteroscedasticity even when using analyses that use standard software that depends on an independent increment structure. More extreme scenarios may require greater care to avoid departures from the nominal type I error rate and power. Copyright © 2014 John Wiley & Sons, Ltd.
- Published
- 2014
- Full Text
- View/download PDF
21. Step-up testing procedure for multiple comparisons with a control for a latent variable model with ordered categorical responses
- Author
-
Koon Shing Kwong, Siu Hung Cheung, Yueqiong Lin, and Wai Yin Poon
- Subjects
Statistics and Probability ,Clinical Trials as Topic ,Models, Statistical ,Epidemiology ,Lidocaine ,Pain ,Latent class model ,Fentanyl ,Closed testing procedure ,Sample size determination ,Data Interpretation, Statistical ,Sample Size ,Statistics ,Multiple comparisons problem ,Humans ,Computer Simulation ,Holm–Bonferroni method ,Latent variable model ,Categorical variable ,Algorithms ,Mathematics ,Statistical hypothesis testing - Abstract
In clinical studies, multiple comparisons of several treatments to a control with ordered categorical responses are often encountered. A popular statistical approach to analyzing the data is to use the logistic regression model with the proportional odds assumption. As discussed in several recent research papers, if the proportional odds assumption fails to hold, the undesirable consequence of an inflated familywise type I error rate may affect the validity of the clinical findings. To remedy the problem, a more flexible approach that uses the latent normal model with single-step and stepwise testing procedures has been recently proposed. In this paper, we introduce a step-up procedure that uses the correlation structure of test statistics under the latent normal model. A simulation study demonstrates the superiority of the proposed procedure to all existing testing procedures. Based on the proposed step-up procedure, we derive an algorithm that enables the determination of the total sample size and the sample size allocation scheme with a pre-determined level of test power before the onset of a clinical trial. A clinical example is presented to illustrate our proposed method.
- Published
- 2014
- Full Text
- View/download PDF
22. Sample size calculation in cost-effectiveness cluster randomized trials: optimal and maximin approaches
- Author
-
Math J. J. M. Candel, Martijn P. F. Berger, Abu Manju, RS: CAPHRI School for Public Health and Primary Care, RS: CAPHRI - Design and analysis of studies in health sciences, RS: CAPHRI - Health Promotion and Health Communication, and FHML Methodologie & Statistiek
- Subjects
Statistics and Probability, Optimal design, Epidemiology, Cost effectiveness, Cost-Benefit Analysis, Statistical power, Depression, Postpartum, Correlation, Statistics, Econometrics, Humans, maximin design, Randomized Controlled Trials as Topic, Mathematics, Models, Statistical, Multilevel model, cost-effectiveness analysis, Minimax, cluster randomized trials, sample size calculation, Sample size determination, Sample Size, Quality of Life, Female - Abstract
In this paper, the optimal sample sizes at the cluster and person levels for each of two treatment arms are obtained for cluster randomized trials where the cost-effectiveness of treatments on a continuous scale is studied. The optimal sample sizes maximize the efficiency or power for a given budget or minimize the budget for a given efficiency or power. Optimal sample sizes require information on the intra-cluster correlations (ICCs) for effects and costs, the correlations between costs and effects at individual and cluster levels, the ratio of the variance of effects translated into costs to the variance of the costs (the variance ratio), sampling and measuring costs, and the budget. When planning a study, information on the model parameters usually is not available. To overcome this local optimality problem, the current paper also presents maximin sample sizes. The maximin sample sizes turn out to be rather robust against misspecifying the correlation between costs and effects at the cluster and individual levels but may lose much efficiency when misspecifying the variance ratio. The robustness of the maximin sample sizes against misspecifying the ICCs depends on the variance ratio. The maximin sample sizes are robust under misspecification of the ICC for costs for realistic values of the variance ratio greater than one but not robust under misspecification of the ICC for effects. Finally, we show how to calculate optimal or maximin sample sizes that yield sufficient power for a test on the cost-effectiveness of an intervention.
- Published
- 2014
- Full Text
- View/download PDF
23. Joint confidence region estimation for area under ROC curve and Youden index
- Author
-
Lili Tian and Jingjing Yin
- Subjects
Statistics and Probability, CA-19-9 Antigen, Epidemiology, Youden's J statistic, Biostatistics, Statistics, Nonparametric, Statistics, Confidence Intervals, Humans, Computer Simulation, Mathematics, Parametric statistics, Confidence region, Estimation, Models, Statistical, Power transform, Pancreatic Neoplasms, Data set, ROC Curve, Area Under Curve, CA-125 Antigen, Case-Control Studies, Algorithms, Biomarkers - Abstract
In the field of diagnostic studies, the area under the ROC curve (AUC) serves as an overall measure of a biomarker/diagnostic test's accuracy. Youden index, defined as the overall correct classification rate minus one at the optimal cut-off point, is another popular index. For continuous biomarkers of binary disease status, although researchers mainly evaluate the diagnostic accuracy using AUC, for the purpose of making diagnosis, Youden index provides an important and direct measure of the diagnostic accuracy at the optimal threshold and hence should be taken into consideration in addition to AUC. Furthermore, AUC and Youden index are generally correlated. In this paper, we initiate the idea of evaluating diagnostic accuracy based on AUC and Youden index simultaneously. As the first step toward this direction, this paper only focuses on the confidence region estimation of AUC and Youden index for a single marker. We present both parametric and non-parametric approaches for estimating joint confidence region of AUC and Youden index. We carry out extensive simulation study to evaluate the performance of the proposed methods. In the end, we apply the proposed methods to a real data set.
- Published
- 2013
- Full Text
- View/download PDF
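A small nonparametric sketch of the two quantities discussed above: an empirical (Mann-Whitney) AUC and an empirical Youden index, plus a joint bootstrap sample of the pair. This only shows how the two indices are computed and how their joint sampling variability can be explored; it is not the parametric or kernel-based confidence-region machinery of the paper, and the simulated biomarker values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
x0 = rng.normal(0.0, 1.0, 120)   # biomarker in non-diseased subjects (assumed data)
x1 = rng.normal(1.2, 1.2, 80)    # biomarker in diseased subjects

def auc_youden(x0, x1):
    # Empirical AUC via the Mann-Whitney statistic (ties counted as 1/2).
    auc = np.mean(x1[:, None] > x0[None, :]) + 0.5 * np.mean(x1[:, None] == x0[None, :])
    # Empirical Youden index J = max over observed cutoffs of sens(c) + spec(c) - 1.
    cuts = np.unique(np.concatenate([x0, x1]))
    sens = np.array([np.mean(x1 >= c) for c in cuts])
    spec = np.array([np.mean(x0 < c) for c in cuts])
    return auc, np.max(sens + spec - 1)

auc, youden = auc_youden(x0, x1)

# Joint bootstrap sample of (AUC, J): resample cases and controls separately.
boot = np.array([
    auc_youden(rng.choice(x0, x0.size, replace=True),
               rng.choice(x1, x1.size, replace=True))
    for _ in range(2000)
])
print(f"AUC = {auc:.3f}, Youden J = {youden:.3f}")
print("bootstrap correlation of (AUC, J):", np.round(np.corrcoef(boot.T)[0, 1], 2))
# The typically positive correlation is the reason a *joint* region, rather than
# two separate intervals, is of interest.
```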
24. Exact inference for adaptive group sequential designs
- Author
-
Cyrus R. Mehta, Ping Gao, and Lingyun Liu
- Subjects
Statistics and Probability, Biometrics, Epidemiology, Estimation theory, Deep Brain Stimulation, Inference, Parkinson Disease, Confidence interval, Research Design, Sample size determination, Data Interpretation, Statistical, Sample Size, Statistics, Confidence Intervals, Quality of Life, Test statistic, Humans, Computer Simulation, Point estimation, Mathematics - Abstract
Methods for controlling the type-1 error of an adaptive group sequential trial were developed in seminal papers by Cui, Hung, and Wang (Biometrics, 1999), Lehmacher and Wassmer (Biometrics, 1999), and Müller and Schäfer (Biometrics, 2001). However, corresponding solutions for the equally important and related problem of parameter estimation at the end of the adaptive trial have not been completely satisfactory. In this paper, a method is provided for computing a two-sided confidence interval having exact coverage, along with a point estimate that is median unbiased for the primary efficacy parameter in a two-arm adaptive group sequential design. The possible adaptations are not only confined to sample size alterations but also include data-dependent changes in the number and spacing of interim looks and changes in the error spending function. The procedure is based on mapping the final test statistic obtained in the modified trial into a corresponding backward image in the original trial. This is an advance on previously available methods, which either produced conservative coverage and no point estimates or provided exact coverage for one-sided intervals only.
- Published
- 2013
- Full Text
- View/download PDF
25. Estimating treatment effects in a two-arm parallel trial of a continuous outcome
- Author
-
Allan Clark, Ian Nunney, and Lee Shepstone
- Subjects
Statistics and Probability ,Epidemiology ,Cumulative distribution function ,Treatment outcome ,Kernel density estimation ,Outcome (probability) ,Confidence interval ,Normal distribution ,Treatment Outcome ,Strictly standardized mean difference ,Statistics ,Confidence Intervals ,Econometrics ,Humans ,Computer Simulation ,Randomized Controlled Trials as Topic ,Mathematics - Abstract
For a continuous outcome in a two-arm trial that satisfies normal distribution assumptions, we can transform the standardized mean difference with the use of the cumulative distribution function to be the effect size measure P(X > Y). This measure is already established within engineering as the reliability parameter in stress-strength models, where Y represents the strength of a component and X represents the stress the component undergoes. If X is greater than Y, then the component will fail. In this paper, we consider the closely related effect size measure λ = P(X > Y) − P(X < Y). This measure is also known as Somers' d, which was introduced by Somers in 1962 as an ordinal measure of association. In this paper, we explore this measure as a treatment effect size for a continuous outcome. Although the point estimates for λ are easily calculated, the interval is not so readily obtained. We compare kernel density estimation and the use of bootstrap and jackknife methods to estimate confidence intervals against two further methods for estimating P(X > Y) and their respective intervals, one of which makes no assumption about the underlying distribution and the other of which assumes a normal distribution. Simulations show that the choice of the best estimator depends on the value of λ, the variability within the data, and the underlying distribution of the data.
- Published
- 2012
- Full Text
- View/download PDF
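A minimal sketch of the point estimate and a percentile-bootstrap interval for the effect size described above (λ, i.e., Somers' d applied to the two arms). The simulated outcomes are stand-ins for trial data, and the bootstrap is only one of the interval methods the paper compares.

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(1.0, 2.0, 90)   # outcomes in arm X (assumed data)
y = rng.normal(0.3, 2.5, 85)   # outcomes in arm Y

def somers_d(x, y):
    # lambda = P(X > Y) - P(X < Y), estimated over all between-arm pairs.
    diff = x[:, None] - y[None, :]
    return np.mean(diff > 0) - np.mean(diff < 0)

est = somers_d(x, y)
boot = np.array([
    somers_d(rng.choice(x, x.size, replace=True),
             rng.choice(y, y.size, replace=True))
    for _ in range(5000)
])
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"lambda-hat = {est:.3f}, 95% bootstrap CI = ({lo:.3f}, {hi:.3f})")
# Under normality, P(X > Y) = Phi((mu_x - mu_y) / sqrt(sd_x**2 + sd_y**2)),
# which gives the model-based alternative estimate mentioned in the abstract.
```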
26. Event-weighted proportional hazards modelling for recurrent gap time data
- Author
-
Stephanie N. Dixon and Gerarda Darlington
- Subjects
Statistics and Probability ,Likelihood Functions ,Time Factors ,Epidemiology ,Proportional hazards model ,Covariance matrix ,Inverse ,Mammary Neoplasms, Animal ,Time data ,Marginal model ,Rats ,Resampling ,Statistics ,Econometrics ,Animals ,Cluster Analysis ,Computer Simulation ,Female ,Neoplasm Recurrence, Local ,Independence (probability theory) ,Proportional Hazards Models ,Mathematics ,Event (probability theory) - Abstract
The analysis of gap times in recurrent events requires an adjustment to standard marginal models. One can perform this adjustment with a modified within-cluster resampling technique; however, this method is computationally intensive. In this paper, we describe a simple adjustment to the standard Cox proportional hazards model analysis that mimics the intent of within-cluster resampling and results in similar parameter estimates. This method essentially weights the partial likelihood contributions by the inverse of the number of gap times observed within the individual while assuming a working independence correlation matrix. We provide an example involving recurrent mammary tumours in female rats to illustrate the methods considered in this paper. Copyright © 2012 John Wiley & Sons, Ltd.
- Published
- 2012
- Full Text
- View/download PDF
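The weighting idea in the entry above, where each gap time contributes with weight equal to the inverse of that subject's number of gap times under a working-independence Cox model, can be sketched as below. The data layout is an assumption, and the use of lifelines is likewise an assumption (any Cox routine that accepts case weights and clustered sandwich standard errors would serve); this is not the authors' own analysis of the rat mammary-tumour data.

```python
import pandas as pd
from lifelines import CoxPHFitter

# Gap-time data (assumed layout): one row per gap time within each subject.
df = pd.DataFrame({
    "id":    [1, 1, 1, 2, 2, 3],
    "gap":   [3.2, 1.4, 2.0, 5.1, 0.9, 4.4],   # time since the previous event
    "event": [1, 1, 0, 1, 0, 1],
    "dose":  [0, 0, 0, 1, 1, 1],
})

# Event-weighted analysis: weight each row by 1 / (number of gap times for that subject).
df["w"] = 1.0 / df.groupby("id")["id"].transform("size")

cph = CoxPHFitter()
cph.fit(df, duration_col="gap", event_col="event",
        weights_col="w", cluster_col="id")
cph.print_summary()
# The weights mimic within-cluster resampling; clustering on "id" requests
# sandwich standard errors that acknowledge the working-independence assumption.
```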
27. Adaptive extensions of a two-stage group sequential procedure for testing primary and secondary endpoints (II): sample size re-estimation
- Author
-
Ajit C. Tamhane, Cyrus R. Mehta, and Yi Wu
- Subjects
Statistics and Probability, Models, Statistical, Endpoint Determination, Epidemiology, Pocock boundary, Confidence interval, Closed testing procedure, Bias, Research Design, Sample size determination, Data Interpretation, Statistical, Sample Size, Multiple comparisons problem, Statistics, Confidence Intervals, Humans, Computer Simulation, Holm–Bonferroni method, Algorithm, Sufficient statistic, Randomized Controlled Trials as Topic, Mathematics - Abstract
In this Part II of the paper on adaptive extensions of a two-stage group sequential procedure (GSP) proposed by Tamhane, Mehta and Liu [1] (referred to as TML hereafter) for testing a primary and a secondary endpoint, we focus on second-stage sample size re-estimation based on the first-stage data. First we show that if we use the Cui, Hung and Wang [2] (referred to as CHW hereafter) statistics at the second stage then we can use the same primary and secondary boundaries as for the original procedure (without sample size re-estimation) and still control the type I familywise error rate (FWER). This extends their result for the single endpoint case. We further show that the secondary boundary can be sharpened in this case by taking the unknown correlation coefficient ρ between the primary and secondary endpoints into account through the use of the confidence limit method proposed in Part I of this paper [3]. If we use the sufficient statistics instead of the CHW statistics then we need to modify both the primary and secondary boundaries; otherwise the error rate can get inflated. We show how to modify the boundaries of the original GSP to control the FWER. Power comparisons between competing procedures are provided. The procedures are illustrated with a clinical trial example. Copyright © 2012 John Wiley & Sons, Ltd.
- Published
- 2012
- Full Text
- View/download PDF
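For orientation, the CHW device referred to above combines stage-wise z-statistics with weights fixed at the originally planned information fractions, so the type I error is preserved even if the second-stage sample size is changed after looking at stage-1 data. A schematic sketch follows; the 50/50 plan, the z-values, and the boundary value are placeholders, and the paper's actual primary/secondary boundaries are not reproduced.

```python
import numpy as np
from scipy import stats

# Planned information fractions fix the weights once and for all (assumed 50/50 plan).
t1, t2 = 0.5, 0.5
w1, w2 = np.sqrt(t1), np.sqrt(t2)

def chw_statistic(z1, z2_new):
    """Cui-Hung-Wang weighted statistic: prespecified weights are used
    regardless of the actual (possibly re-estimated) second-stage sample size."""
    return w1 * z1 + w2 * z2_new

z1 = 1.10       # observed stage-1 z-statistic (placeholder)
z2_new = 2.05   # stage-2 z-statistic computed from the re-estimated sample (placeholder)
z_chw = chw_statistic(z1, z2_new)

crit = 2.00     # placeholder second-look efficacy boundary from the group sequential design
print(f"Z_CHW = {z_chw:.3f}, reject H0: {z_chw > crit}, "
      f"nominal one-sided p = {stats.norm.sf(z_chw):.4f}")
```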
28. Adaptive extensions of a two-stage group sequential procedure for testing primary and secondary endpoints (I): unknown correlation between the endpoints
- Author
-
Yi Wu, Cyrus R. Mehta, and Ajit C. Tamhane
- Subjects
Statistics and Probability, Models, Statistical, Correlation coefficient, Endpoint Determination, Epidemiology, Pocock boundary, Multivariate normal distribution, Bias, Research Design, Sample size determination, Data Interpretation, Statistical, Sample Size, Multiple comparisons problem, Statistics, Confidence Intervals, Humans, Applied mathematics, Nuisance parameter, Computer Simulation, Randomized Controlled Trials as Topic, Mathematics - Abstract
In a previous paper we studied a two-stage group sequential procedure (GSP) for testing primary and secondary endpoints where the primary endpoint serves as a gatekeeper for the secondary endpoint. We assumed a simple setup of a bivariate normal distribution for the two endpoints with the correlation coefficient ρ between them being either an unknown nuisance parameter or a known constant. Under the former assumption, we used the least favorable value of ρ = 1 to compute the critical boundaries of a conservative GSP. Under the latter assumption, we computed the critical boundaries of an exact GSP. However, neither assumption is very practical. The ρ = 1 assumption is too conservative resulting in loss of power, whereas the known ρ assumption is never true in practice. In this part I of a two-part paper on adaptive extensions of this two-stage procedure (part II deals with sample size re-estimation), we propose an intermediate approach that uses the sample correlation coefficient r from the first-stage data to adaptively adjust the secondary boundary after accounting for the sampling error in r via an upper confidence limit on ρ by using a method due to Berger and Boos. We show via simulation that this approach achieves 5–11% absolute secondary power gain for ρ ≤0.5. The preferred boundary combination in terms of high primary as well as secondary power is that of O'Brien and Fleming for the primary and of Pocock for the secondary. The proposed approach using this boundary combination achieves 72–84% relative secondary power gain (with respect to the exact GSP that assumes known ρ). We give a clinical trial example to illustrate the proposed procedure. Copyright © 2012 John Wiley & Sons, Ltd.
- Published
- 2012
- Full Text
- View/download PDF
29. Sample size determination for quadratic inference functions in longitudinal design with dichotomous outcomes
- Author
-
Peter X.-K. Song and Youna Hu
- Subjects
Statistics and Probability ,Clinical Trials as Topic ,Models, Statistical ,Epidemiology ,Covariance matrix ,Inference ,Numerical Analysis, Computer-Assisted ,Marginal model ,Wald test ,Statistical power ,Sample size determination ,Sample Size ,Statistics ,Covariate ,Humans ,Longitudinal Studies ,Generalized estimating equation ,Mathematics - Abstract
Quadratic inference functions (QIF) methodology is an important alternative to the generalized estimating equations (GEE) method in the longitudinal marginal model, as it offers higher estimation efficiency than the GEE when correlation structure is misspecified. The focus of this paper is on sample size determination and power calculation for QIF based on the Wald test in a marginal logistic model with covariates of treatment, time, and treatment-time interaction. We have made three contributions in this paper: (i) we derived formulas of sample size and power for QIF and compared their performance with those given by the GEE; (ii) we proposed an optimal scheme of sample size determination to overcome the difficulty of unknown true correlation matrix in the sense of minimal average risk; and (iii) we studied properties of both QIF and GEE sample size formulas in relation to the number of follow-up visits and found that the QIF gave more robust sample sizes than the GEE. Using numerical examples, we illustrated that without sacrificing statistical power, the QIF design leads to sample size saving and hence lower study cost in comparison with the GEE analysis. We conclude that the QIF analysis is appealing for longitudinal studies.
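The QIF and GEE formulas derived in the paper are specific to the marginal logistic model with treatment, time and interaction terms and are not reproduced here. As a reminder of the generic Wald-test calculation that any such formula refines, a minimal sketch (delta is the target effect size and sigma2 the per-subject asymptotic variance of its estimator, both assumed known):

```python
import math
from scipy.stats import norm

def wald_sample_size(delta, sigma2, alpha=0.05, power=0.9):
    """Generic two-sided Wald-test sample size: the smallest n achieving the
    target power when sqrt(n) * (estimate - truth) / sigma is approximately N(0, 1)."""
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    return math.ceil((z_a + z_b) ** 2 * sigma2 / delta ** 2)

print(wald_sample_size(delta=0.5, sigma2=4.0))   # 169, with illustrative inputs
```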
- Published
- 2012
- Full Text
- View/download PDF
30. Preserving the allocation ratio at every allocation with biased coin randomization and minimization in studies with unequal allocation
- Author
-
Olga M. Kuznetsova and Yevgen Tymofyeyev
- Subjects
Statistics and Probability ,Selection bias ,Random allocation ,Models, Statistical ,Randomization ,Epidemiology ,media_common.quotation_subject ,Random Allocation ,Resampling ,Covariate ,Statistics ,Humans ,Computer Simulation ,Minification ,Special care ,Randomized Controlled Trials as Topic ,Mathematics ,media_common - Abstract
The demand for unequal allocation in clinical trials is growing. Most commonly, the unequal allocation is achieved through permuted block randomization. However, other allocation procedures might be required to better approximate the allocation ratio in small samples, reduce the selection bias in open-label studies, or balance on baseline covariates. When these allocation procedures are generalized to unequal allocation, special care is to be taken to preserve the allocation ratio at every allocation step. This paper offers a way to expand the biased coin randomization to unequal allocation that preserves the allocation ratio at every allocation. The suggested expansion works with biased coin randomization that balances only on treatment group totals and with covariate-adaptive procedures that use a random biased coin element at every allocation. Balancing properties of the allocation ratio preserving biased coin randomization and minimization are described through simulations. It is demonstrated that these procedures are asymptotically protected against the shift in the rerandomization distribution identified for some examples of minimization with 1:2 allocation. The asymptotic shift in the rerandomization distribution of the difference in treatment means for an arbitrary unequal allocation procedure is explicitly derived in the paper.
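To make the idea of preserving an unequal target ratio at every allocation concrete, here is a toy biased-coin step for a 1:2 ratio. It is emphatically not the procedure of Kuznetsova and Tymofyeyev; the parameter names and the tilting rule are ours and serve only to illustrate biasing the draw toward the arm furthest below its target count.

```python
import random

def next_assignment(counts, targets=(1, 2), bias=2.0):
    """Toy biased-coin step for unequal (e.g. 1:2) allocation.

    counts  : subjects already assigned to each arm
    targets : target allocation ratio
    bias    : factor > 1 inflating the weight of the most under-allocated arm
    """
    total_t = sum(targets)
    n = sum(counts)
    # counts each arm would have if the target ratio were met exactly so far
    expected = [n * t / total_t for t in targets]
    deficits = [e - c for e, c in zip(expected, counts)]
    # start from the target probabilities, inflate the arm furthest below target
    weights = [t / total_t * (bias if d == max(deficits) and d > 0 else 1.0)
               for t, d in zip(targets, deficits)]
    total_w = sum(weights)
    probs = [w / total_w for w in weights]
    return random.choices(range(len(targets)), weights=probs)[0]

counts = [0, 0]
for _ in range(30):
    counts[next_assignment(counts)] += 1
print(counts)   # roughly 10 vs 20
```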
- Published
- 2011
- Full Text
- View/download PDF
31. Incorporating scientific knowledge into phenotype development: Penalized latent class regression
- Author
-
Elizabeth Garrett-Mayer, Karen Bandeen-Roche, Peter P. Zandi, and Jeannie Marie S. Leoutsakos
- Subjects
Statistics and Probability ,Epidemiology ,Bayesian probability ,Latent variable ,Machine learning ,computer.software_genre ,Article ,Bayes' theorem ,Lasso (statistics) ,Covariate ,Econometrics ,Humans ,Computer Simulation ,Latent variable model ,Aged ,Mathematics ,Models, Statistical ,Models, Genetic ,Probabilistic latent semantic analysis ,business.industry ,Bayes Theorem ,Latent class model ,Phenotype ,Artificial intelligence ,business ,computer - Abstract
The field of psychiatric genetics is hampered by the lack of a clear taxonomy for disorders. Building on the work of Houseman and colleagues (Feature-specific penalized latent class analysis for genomic data. Harvard University Biostatistics Working Paper Series, Working Paper 22, 2005), we describe a penalized latent class regression aimed at allowing additional scientific information to influence the estimation of the measurement model, while retaining the standard assumption of non-differential measurement. In simulation studies, ridge and LASSO penalty functions improved the precision of estimates and, in some cases of differential measurement, also reduced bias. Class-specific penalization enhanced separation of latent classes with respect to covariates, but only in scenarios where there was a true separation. Penalization proved to be less computationally intensive than an analogous Bayesian analysis by a factor of 37. This methodology was then applied to data from normal elderly subjects from the Cache County Study on Memory and Aging. Addition of APO-E genotype and a number of baseline clinical covariates improved the dementia prediction utility of the latent classes; application of class-specific penalization improved precision while retaining that prediction utility. This methodology may be useful in scenarios with large numbers of collinear covariates or in certain cases where latent class model assumptions are violated. Investigation of novel penalty functions may prove fruitful in further refining psychiatric phenotypes.
- Published
- 2010
- Full Text
- View/download PDF
32. Student t-tests for potentially abnormal data
- Author
-
Jonathan J. Shuster
- Subjects
Statistics and Probability ,Clinical Trials as Topic ,Biometry ,Models, Statistical ,Wilcoxon signed-rank test ,Epidemiology ,business.industry ,Sampling Studies ,Article ,Potentially abnormal ,Software ,Sample size determination ,Robustness (computer science) ,Data Interpretation, Statistical ,Statistics ,Econometrics ,Humans ,Probability distribution ,Welch–Satterthwaite equation ,business ,Mathematics ,System software - Abstract
When the one sample or two-sample t-test is either taught in the class room, or applied in practice to small samples, there is considerable divergence of opinion as to whether or not the inferences drawn are valid. Many point to the “Robustness” of the t-test to violations of assumptions, while others use rank or other robust methods because they believe the t-test is not robust against violations of such assumptions. It is quite likely, despite the apparent divergence of these two opinions, that both arguments have considerable merit. If we agree that this question cannot possibly be resolved in general, the issue becomes one of determining, before any actual data have been collected, whether the t-test will or will not be robust in a specific application. This paper describes Statistical Analysis System (SAS) software, covering a large collection of potential input probability distributions, to investigate both the null and power properties of various one and two sample t-tests and their normal approximations, as well as the Wilcoxon two-sample and sign-rank one sample tests, allowing potential practitioners to determine, at the study design stage, whether the t-test will be robust in their specific application. Sample size projections, based on these actual distributions, are also included. This paper is not intended as a tool to assess robustness after the data have been collected.
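The software described in this paper is SAS. For readers who want the flavour of such a design-stage robustness check in another language, a minimal Monte Carlo sketch (ours, with a lognormal population chosen purely for illustration) estimates the null rejection rates of the two-sample t-test and the Wilcoxon test:

```python
import numpy as np
from scipy.stats import ttest_ind, mannwhitneyu

rng = np.random.default_rng(1)

def null_rejection_rate(n=15, reps=20_000, alpha=0.05):
    """Estimate the type I error of the two-sample t-test and the Wilcoxon test
    when both samples come from the same skewed (lognormal) population."""
    rej_t = rej_w = 0
    for _ in range(reps):
        x = rng.lognormal(size=n)
        y = rng.lognormal(size=n)
        rej_t += ttest_ind(x, y).pvalue < alpha
        rej_w += mannwhitneyu(x, y, alternative="two-sided").pvalue < alpha
    return rej_t / reps, rej_w / reps

print(null_rejection_rate())   # both rates should sit near 0.05 if the tests are robust here
```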
- Published
- 2009
- Full Text
- View/download PDF
33. Score and profile likelihood confidence intervals for contingency table parameters
- Author
-
Joseph B. Lang
- Subjects
Statistics and Probability ,Contingency table ,Score test ,Likelihood Functions ,Epidemiology ,Restricted maximum likelihood ,Research ,Robust confidence intervals ,Statistics ,Confidence Intervals ,Confidence distribution ,Binomial proportion confidence interval ,Likelihood function ,Algorithms ,CDF-based nonparametric confidence interval ,Mathematics - Abstract
A straightforward approach to computing score and profile likelihood confidence intervals for contingency table parameters is described. The computational approach herein avoids two main limitations of existing methods: (1) Compared with existing methods, this paper's approach is applicable to a much broader class of parameters. (2) Unlike existing methods, this paper's approach is not case-specific and, hence, lends itself to a general, yet very simple, computational algorithm. This paper describes the 'sliding quadratic' computational algorithm and illustrates its use on examples that have not been previously considered in the literature. A small-scale simulation study highlights the advantage of using score and profile likelihood intervals rather than Wald intervals.
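The 'sliding quadratic' algorithm itself is not reproduced here. For the simplest contingency-table parameter, a single binomial proportion, the score interval has a closed form (the Wilson interval), which gives a concrete feel for what a score interval is; this example is ours, not the paper's.

```python
import math
from scipy.stats import norm

def wilson_score_interval(x, n, conf=0.95):
    """Score (Wilson) confidence interval for a binomial proportion: the set of
    null values p0 not rejected by the score test of H0: p = p0."""
    z = norm.ppf(1 - (1 - conf) / 2)
    phat = x / n
    centre = (phat + z**2 / (2 * n)) / (1 + z**2 / n)
    half = z * math.sqrt(phat * (1 - phat) / n + z**2 / (4 * n**2)) / (1 + z**2 / n)
    return centre - half, centre + half

print(wilson_score_interval(7, 20))   # ~ (0.18, 0.57)
```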
- Published
- 2008
- Full Text
- View/download PDF
34. Nonparametric methods for measurements below detection limit
- Author
-
Cun-Hui Zhang, Chunpeng Fan, Juan Zhang, and Donghui Zhang
- Subjects
Statistics and Probability ,Statistics::Theory ,Wilcoxon signed-rank test ,Epidemiology ,Interval estimation ,Nonparametric statistics ,Sample size determination ,Censoring (clinical trials) ,Statistics ,Statistical inference ,Statistics::Methodology ,Tobit model ,Parametric statistics ,Mathematics - Abstract
Analytical data are often subject to left-censoring when the actual values to be quantified fall below the limit of detection. The primary interest of this paper is statistical inference for the two-sample problem. Most of the current publications are centered around naive approaches or the parametric Tobit model approach. These methods may not be suitable for data with high censoring rates and relatively small sample sizes. In this paper, we establish the theoretical equivalence of three nonparametric methods: the Wilcoxon rank sum, the Gehan, and the Peto-Peto tests, under fixed left-censoring and other mild conditions. We then develop a nonparametric point and interval estimation procedure for the location shift model. A large set of simulations compares 14 methods including naive, parametric, and nonparametric methods. The results clearly favor the nonparametric methods for a range of sample sizes and censoring rates. Simulations also demonstrate satisfactory point and interval estimation results. Finally, a real data example is given followed by discussion.
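A rank test copes with non-detects under a common fixed detection limit because all non-detects can be treated as one tied group at the bottom of the ranking, leaving the ranks of the detected values unchanged. A minimal sketch of that idea follows (ours; the Gehan and Peto-Peto variants and the interval estimation procedure from the paper are not shown):

```python
import numpy as np
from scipy.stats import mannwhitneyu

def wilcoxon_with_nondetects(x, y, lod):
    """Two-sample Wilcoxon rank-sum test with left-censored observations.
    Values below the common detection limit are set to the limit itself so they
    form a single tied group at the bottom of the ranking."""
    x = np.maximum(np.asarray(x, float), lod)
    y = np.maximum(np.asarray(y, float), lod)
    return mannwhitneyu(x, y, alternative="two-sided")

x = [0.2, 0.3, 0.8, 1.4, 2.2]   # values below 0.5 are non-detects (illustrative)
y = [0.1, 0.6, 0.9, 1.1, 3.0]
print(wilcoxon_with_nondetects(x, y, lod=0.5))
```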
- Published
- 2008
- Full Text
- View/download PDF
35. Compatible simultaneous lower confidence bounds for the Holm procedure and other Bonferroni-based closed tests
- Author
-
Klaus Strassburger and Frank Bretz
- Subjects
Statistics and Probability ,Biometry ,Epidemiology ,Test procedures ,Intersection (set theory) ,Fixed sequence ,Confidence interval ,symbols.namesake ,Bonferroni correction ,Data Interpretation, Statistical ,Confidence bounds ,Statistics ,Confidence Intervals ,symbols ,Holm–Bonferroni method ,Null hypothesis ,Algorithms ,Mathematics - Abstract
We consider the problem of simultaneously testing multiple one-sided null hypotheses. Single-step procedures, such as the Bonferroni test, are characterized by the fact that the rejection or non-rejection of a null hypothesis does not take the decision for any other hypothesis into account. For stepwise test procedures, such as the Holm procedure, the rejection or non-rejection of a null hypothesis may depend on the decision of other hypotheses. It is well known that stepwise test procedures are by construction more powerful than their single-step counterparts. This power advantage, however, comes only at the cost of increased difficulties in constructing compatible simultaneous confidence intervals for the parameters of interest. For example, such simultaneous confidence intervals are easily obtained for the Bonferroni method, but surprisingly hard to derive for the Holm procedure. In this paper, we discuss the inherent problems and show that ad hoc solutions used in practice typically do not control the pre-specified simultaneous confidence level. Instead, we derive simultaneous confidence intervals that are compatible with a certain class of closed test procedures using weighted Bonferroni tests for each intersection hypothesis. The class of multiple test procedures covered in this paper includes gatekeeping procedures based on Bonferroni adjustments, fixed sequence procedures, the simple weighted or unweighted Bonferroni procedure by Holm and the fallback procedure. We illustrate the results with a numerical example.
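For reference, the Holm step-down procedure discussed above can be written in a few lines; the compatible simultaneous confidence bounds derived in the paper are not shown.

```python
def holm(pvalues, alpha=0.05):
    """Holm step-down test: returns a rejection decision for each hypothesis,
    controlling the familywise error rate at level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])   # smallest p-value first
    reject = [False] * m
    for k, i in enumerate(order):
        if pvalues[i] <= alpha / (m - k):
            reject[i] = True
        else:
            break            # once one hypothesis is retained, stop testing
    return reject

print(holm([0.001, 0.04, 0.03, 0.2]))   # [True, False, False, False]
```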
- Published
- 2008
- Full Text
- View/download PDF
36. Formulating tightest bounds on causal effects in studies with unmeasured confounders
- Author
-
Zhihong Cai and Manabu Kuroki
- Subjects
Statistics and Probability ,Counterfactual conditional ,Linear programming ,Epidemiology ,Semantics (computer science) ,Simple (abstract algebra) ,Probabilistic logic ,Econometrics ,Observational study ,Monotonic function ,Outcome (probability) ,Mathematics - Abstract
This paper considers the problem of evaluating the causal effect of an exposure on an outcome in observational studies with both measured and unmeasured confounders between the exposure and the outcome. Under such a situation, MacLehose et al. (Epidemiology 2005; 16:548-555) applied linear programming optimization software to find the minimum and maximum possible values of the causal effect for specific numerical data. In this paper, we apply the symbolic Balke-Pearl linear programming method (Probabilistic counterfactuals: semantics, computation, and applications. Ph.D. Thesis, UCLA Cognitive Systems Laboratory, 1995; J. Amer. Statist. Assoc. 1997; 92:1172-1176) to derive the simple closed-form expressions for the lower and upper bounds on causal effects under various assumptions of monotonicity. These universal bounds enable epidemiologists and medical researchers to assess causal effects from observed data with minimum computational effort, and they further shed light on the accuracy of the assessment.
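For orientation, the classical no-assumption bounds on the average causal effect for binary exposure X and binary outcome Y are shown below; the paper's contribution is the sharper closed-form bounds obtained under measured confounders and various monotonicity assumptions, which are not reproduced here.

```latex
% No-assumption bounds on ACE = P(Y_{x=1}=1) - P(Y_{x=0}=1) for binary X and Y:
P(Y=1, X=1) - P(Y=1, X=0) - P(X=1)
\;\le\; \mathrm{ACE} \;\le\;
P(Y=1, X=1) - P(Y=1, X=0) + P(X=0).
% This interval always has width 1, which is why additional assumptions
% (monotonicity, measured confounders) are needed to tighten it.
```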
- Published
- 2008
- Full Text
- View/download PDF
37. A more powerful exact test of noninferiority from binary matched‐pairs data
- Author
-
Max Moldovan and Chris Lloyd
- Subjects
Statistics and Probability ,Models, Statistical ,Biometrics ,Medical treatment ,Epidemiology ,Matched-Pair Analysis ,Binary number ,Score ,Maximization ,Placebos ,Exact test ,Treatment Outcome ,Research Design ,Data Interpretation, Statistical ,Statistics ,Humans ,Nuisance parameter ,Likelihood ratio statistic ,Randomized Controlled Trials as Topic ,Mathematics - Abstract
Assessing the therapeutic noninferiority of one medical treatment compared with another is often based on the difference in response rates from a matched binary pairs design. This paper develops a new exact unconditional test for noninferiority that is more powerful than available alternatives. There are two new elements presented in this paper. First, we introduce the likelihood ratio statistic as an alternative to the previously proposed score statistic of Nam (Biometrics 1997; 53:1422-1430). Second, we eliminate the nuisance parameter by estimation followed by maximization as an alternative to the partial maximization of Berger and Boos (Am. Stat. Assoc. 1994; 89:1012-1016) or traditional full maximization. Based on an extensive numerical study, we recommend tests based on the score statistic, the nuisance parameter being controlled by estimation followed by maximization.
- Published
- 2008
- Full Text
- View/download PDF
38. Semiparametric Bayesian analysis of structural equation models with fixed covariates
- Author
-
Xinyuan Song, Bin Lu, and Sik-Yum Lee
- Subjects
Statistics and Probability ,Models, Statistical ,Epidemiology ,Bayes Theorem ,Latent variable ,Models, Biological ,Latent Dirichlet allocation ,Structural equation modeling ,Latent class model ,Dirichlet process ,symbols.namesake ,Diabetes Mellitus, Type 2 ,Creatinine ,Prior probability ,Econometrics ,symbols ,Albuminuria ,Humans ,Computer Simulation ,Diabetic Nephropathies ,Latent variable model ,Gibbs sampling ,Mathematics - Abstract
Latent variables play the most important role in structural equation modeling. In almost all existing structural equation models (SEMs), it is assumed that the distribution of the latent variables is normal. As this assumption is likely to be violated in many biomedical researches, a semiparametric Bayesian approach for relaxing it is developed in this paper. In the context of SEMs with covariates, we provide a general Bayesian framework in which a semiparametric hierarchical modeling with an approximate truncation Dirichlet process prior distribution is specified for the latent variables. The stick-breaking prior and the blocked Gibbs sampler are used for efficient simulation in the posterior analysis. The developed methodology is applied to a study of kidney disease in diabetes patients. A simulation study is conducted to reveal the empirical performance of the proposed approach. Supplementary electronic material for this paper is available in Wiley InterScience at http://www.mrw.interscience.wiley.com/suppmat/1097-0258/suppmat/.
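The truncated stick-breaking construction behind the approximate Dirichlet process prior can be written in a few lines. The sketch below (ours, with an illustrative truncation level and concentration parameter) draws one set of mixture weights:

```python
import numpy as np

rng = np.random.default_rng(0)

def stick_breaking_weights(alpha=1.0, truncation=20):
    """Draw mixture weights from a truncated stick-breaking representation of a
    Dirichlet process: v_k ~ Beta(1, alpha), w_k = v_k * prod_{j<k}(1 - v_j),
    with the last weight absorbing the remaining stick so the weights sum to 1."""
    v = rng.beta(1.0, alpha, size=truncation)
    v[-1] = 1.0                                        # truncation: use up the whole stick
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining

w = stick_breaking_weights()
print(w.round(3), w.sum())                             # weights sum to 1
```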
- Published
- 2008
- Full Text
- View/download PDF
39. Sample size and optimal design for logistic regression with binary interaction
- Author
-
Eugene Demidenko
- Subjects
Statistics and Probability ,Score test ,Likelihood Functions ,Epidemiology ,Environment ,Logistic regression ,Wald test ,Asthma ,Logistic Models ,Research Design ,Sample size determination ,Case-Control Studies ,Sample Size ,Likelihood-ratio test ,Statistics ,Linear regression ,Covariate ,Genetics ,Econometrics ,Humans ,Statistics::Methodology ,Statistic ,Mathematics - Abstract
There is no consensus on what test to use as the basis for sample size determination and power analysis. Some authors advocate the Wald test and some the likelihood-ratio test. We argue that the Wald test should be used because the Z-score is commonly applied for regression coefficient significance testing and therefore the same statistic should be used in the power function. We correct a widespread mistake on sample size determination when the variance of the maximum likelihood estimate (MLE) is estimated at the null value. In our previous paper, we developed a correct sample size formula for logistic regression with a single exposure (Statist. Med. 2007; 26(18):3385-3397). In the present paper, closed-form formulas are derived for interaction studies with binary exposure and covariate in logistic regression. The formula for the optimal control-case ratio is derived such that it maximizes the power function given other parameters. Our sample size and power calculations with interaction can be carried out online at www.dartmouth.edu/~eugened.
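The paper's closed-form formulas are not reproduced here. A brute-force simulation check of Wald-test power for a binary-by-binary interaction in logistic regression, which is what such formulas approximate, can be sketched as follows (all parameter values below are purely illustrative):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)

def interaction_power(n=800, beta=(-1.0, 0.4, 0.3, 0.6), p_x=0.5, p_g=0.3,
                      reps=500, alpha=0.05):
    """Monte Carlo power of the Wald test for the interaction coefficient in
    logit P(Y=1) = b0 + b1*X + b2*G + b3*X*G, with binary exposure X and covariate G."""
    rejections = 0
    for _ in range(reps):
        x = rng.binomial(1, p_x, n)
        g = rng.binomial(1, p_g, n)
        lin = beta[0] + beta[1] * x + beta[2] * g + beta[3] * x * g
        y = rng.binomial(1, 1.0 / (1.0 + np.exp(-lin)))
        design = sm.add_constant(np.column_stack([x, g, x * g]))
        fit = sm.Logit(y, design).fit(disp=0)
        rejections += fit.pvalues[-1] < alpha          # Wald p-value of the interaction
    return rejections / reps

print(interaction_power())
```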
- Published
- 2007
- Full Text
- View/download PDF
40. Adaptive, group sequential and decision theoretic approaches to sample size determination
- Author
-
Cyrus R. Mehta and Nitin R. Patel
- Subjects
Statistics and Probability ,Clinical Trials as Topic ,Mathematical optimization ,Bayes estimator ,Models, Statistical ,Epidemiology ,Magnitude (mathematics) ,computer.software_genre ,Sample size determination ,Sample Size ,Group sequential ,Humans ,Data mining ,computer ,Mathematics ,Type I and type II errors - Abstract
This paper presents two adaptive methods for sample size re-estimation within a unified group sequential framework. The conceptual and practical distinction between these adaptive modifications and more traditional sample size changes due to revised estimates of nuisance parameters is highlighted. The motivation for the adaptive designs is discussed. Having established that adaptive sample size modifications can be made without inflating the type 1 error, the paper concludes with a novel decision theoretic approach for determining the magnitude of the sample size modification.
- Published
- 2006
- Full Text
- View/download PDF
41. Causal conclusions are most sensitive to unobserved binary covariates
- Author
-
Abba M. Krieger and Liansheng Wang
- Subjects
Statistics and Probability ,Time-varying covariate ,Lung Neoplasms ,Models, Statistical ,Epidemiology ,Smoking ,Binary number ,Context (language use) ,Shock, Septic ,Bias ,Case-Control Studies ,Data Interpretation, Statistical ,Thromboembolism ,Covariate ,Statistics ,Econometrics ,Humans ,Tampons, Surgical ,Statistics::Methodology ,Female ,Observational study ,Sensitivity (control systems) ,Contraceptives, Oral ,Unit interval ,Mathematics - Abstract
There is a rich literature that considers whether an observed relation between treatment and response is due to an unobserved covariate. In order to quantify this unmeasured bias, an assumption is made about the distribution of this unobserved covariate; typically that it is either binary or at least confined to the unit interval. In this paper, this assumption is relaxed in the context of matched pairs with binary treatment and response. One might think that a long-tailed unobserved covariate could do more damage. Remarkably that is not the case: the most harm is done by a binary covariate, so the case commonly considered in the literature is most conservative. This has two practical consequences: (i) it is always safe to assume that an unobserved covariate is binary, if one is content to make a conservative statement; (ii) when another assumption seems more appropriate, say normal covariate, there will be less sensitivity than with a binary covariate. This assumption implies that it is possible that a relation between treatment and response that is sensitive to unmeasured bias (if the unobserved covariate is dichotomous), ceases to be sensitive if the unobserved covariate is normally distributed. These ideas are illustrated by three examples. It is important to note that the claim in this paper applies to our specific setting of matched pairs with binary treatment and response. Whether the same conclusion holds in other settings is an open question.
- Published
- 2006
- Full Text
- View/download PDF
42. Multiplicity adjustment for multiple endpoints in clinical trials with multiple doses of an active treatment
- Author
-
Tom Capizzi, Hui Quan, and Xiaohui Luo
- Subjects
Statistics and Probability ,Analgesics ,Multivariate analysis ,Dose-Response Relationship, Drug ,Epidemiology ,Data interpretation ,Familywise error rate ,Multiple dosing ,Clinical trial ,symbols.namesake ,Multiple Sclerosis, Relapsing-Remitting ,Bonferroni correction ,Clinical Trials, Phase III as Topic ,Drug Therapy ,Data Interpretation, Statistical ,Multivariate Analysis ,Statistics ,symbols ,Humans ,Computer Simulation ,Active treatment ,Mitoxantrone ,Mathematics ,Type I and type II errors - Abstract
Frequently, multiple doses of an active treatment and multiple endpoints are simultaneously considered in the designs of clinical trials. For these trials, traditional multiplicity adjustment procedures such as Bonferroni, Hochberg and Hommel procedures can be applied when treating the comparisons of different doses to the control on all endpoints at the same level. However, these approaches will not take into account the possible dose-response relationship on each endpoint, and therefore are less specific and may have lower power. To gain power, in this paper, we consider the problem as a two-dimensional multiplicity problem: one dimension concerns the multiple doses and the other dimension concerns the multiple endpoints. We propose procedures which consider the dose order to form the closure of the procedures and control the family-wise type I error rate in a strong sense. For this two-dimensional problem, numerical examples show that procedures proposed in this paper in general have higher power than the commonly used procedures (e.g. the regular Hochberg procedure) especially for comparing the higher dose to the control.
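For comparison, the 'regular Hochberg procedure' referred to above can be written in a few lines; the dose-ordered closed procedures proposed in the paper are not reproduced here.

```python
def hochberg(pvalues, alpha=0.05):
    """Hochberg step-up procedure: reject every hypothesis whose p-value is at or
    below the largest ordered p_(k) satisfying p_(k) <= alpha / (m - k + 1)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)   # largest p first
    reject = [False] * m
    for step, i in enumerate(order):                   # step = 0 examines the largest p
        if pvalues[i] <= alpha / (step + 1):
            for j in order[step:]:                     # reject this and all smaller p-values
                reject[j] = True
            break
    return reject

print(hochberg([0.02, 0.03, 0.04]))   # [True, True, True]: all rejected since max p <= alpha
```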
- Published
- 2005
- Full Text
- View/download PDF
43. An appraisal of methods for the analysis of longitudinal categorical data with MAR drop-outs
- Author
-
R. J. O'Hara Hines and W. G. S. Hines
- Subjects
Statistics and Probability ,Likelihood Functions ,Biometry ,Models, Statistical ,Epidemiology ,Mental Disorders ,Multivariate normal distribution ,Estimating equations ,Studentized residual ,Missing data ,Fluvoxamine ,Sample size determination ,Data Interpretation, Statistical ,Binary data ,Statistics ,Odds Ratio ,Econometrics ,Humans ,Longitudinal Studies ,Generalized estimating equation ,Categorical variable ,Selective Serotonin Reuptake Inhibitors ,Mathematics - Abstract
A number of methods for analysing longitudinal ordinal categorical data with missing-at-random drop-outs are considered. Two are maximum-likelihood methods (MAXLIK) which employ marginal global odds ratios to model associations. The remainder use weighted or unweighted generalized estimating equations (GEE). Two of the GEE use Cholesky-decomposed standardized residuals to model the association structure, while another three extend methods developed for longitudinal binary data in which the association structures are modelled using either Gaussian estimation, multivariate normal estimating equations or conditional residuals. Simulated data sets were used to discover differences among the methods in terms of biases, variances and convergence rates when the association structure is misspecified. The methods were also applied to a real medical data set. Two of the GEE methods, referred to as Cond and ML-norm in this paper and by their originators, were found to have relatively good convergence rates and mean squared errors for all sample sizes (80, 120, 300) considered, and one more, referred to as MGEE in this paper and by its originators, worked fairly well for all but the smallest sample size, 80.
- Published
- 2005
- Full Text
- View/download PDF
44. Retrospective analysis of case-control studies when the population is in Hardy–Weinberg equilibrium
- Author
-
Kuang Fu Cheng and W. J. Lin
- Subjects
Statistics and Probability ,Alcohol Drinking ,Epidemiology ,Population ,Inference ,Logistic regression ,Statistics ,Econometrics ,Humans ,education ,Alleles ,Retrospective Studies ,Mathematics ,Mouth neoplasm ,Likelihood Functions ,education.field_of_study ,Models, Statistical ,Models, Genetic ,Alcohol Dehydrogenase ,Estimator ,Variance (accounting) ,Odds ratio ,Genetics, Population ,Efficiency ,Case-Control Studies ,Mouth Neoplasms - Abstract
Association analysis of genetic polymorphisms has been mostly performed in a case-control setting in connection with the traditional logistic regression analysis. However, in a case-control study, subjects are recruited according to their disease status and their past exposures are determined. Thus the natural model for making inference is the retrospective model. In this paper, we discuss some retrospective models and give maximum likelihood estimators of exposure effects and estimators of asymptotic variances, when the frequency distribution of exposures in controls contains information about the parameters of interest. Two situations about the control population are considered in this paper: (a) the control population or its subpopulations are in Hardy-Weinberg equilibrium; and (b) genetic and environmental factors are independent in the control population. Using the concept of asymptotic relative efficiency, we shall show the precision advantages of such retrospective analysis over the traditional prospective analysis. Maximum likelihood estimates and variance estimates under retrospective models are simple in computation and thus can be applied in many practical applications. We present one real example to illustrate our methods.
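The Hardy-Weinberg constraint exploited by the retrospective likelihood is simply that genotype frequencies at a biallelic locus are determined by a single allele frequency p (with q = 1 - p):

```latex
P(AA) = p^{2}, \qquad P(Aa) = 2pq, \qquad P(aa) = q^{2}, \qquad p + q = 1.
```

Imposing this one-parameter structure on the control genotype distribution is what yields the efficiency gains over the standard prospective logistic analysis described in the abstract.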
- Published
- 2005
- Full Text
- View/download PDF
45. The use of the triangular test with response-adaptive treatment allocation
- Author
-
Anastasia Ivanova and D. Stephen Coad
- Subjects
Statistics and Probability ,Clinical Trials as Topic ,Likelihood Functions ,Mathematical optimization ,Models, Statistical ,Endpoint Determination ,Epidemiology ,Work (physics) ,Stopping rule ,Binary number ,HIV Infections ,Test (assessment) ,Power (physics) ,Variable (computer science) ,Sample size determination ,Adaptive design ,Statistics ,Humans ,Reverse Transcriptase Inhibitors ,Monte Carlo Method ,Zidovudine ,Mathematics - Abstract
A clinical trial is considered in which two treatments with binary responses are to be compared. A popular sequential stopping rule, the triangular test, is studied when various response-adaptive treatment allocation rules are applied, such as the recently proposed drop-the-loser rule, an urn randomization scheme. The paper extends previous work by Coad and Rosenberger, who combined the triangular test with the randomized play-the-winner rule. The purpose of the paper is to investigate to what extent the variability of an adaptive design affects the overall performance of the triangular test. The adaptive rules under consideration are described and some of their asymptotic properties are summarized. Simulation is then used to assess the performance of the triangular test when combined with the various adaptive rules. The main finding is that the drop-the-loser rule is the most promising of the adaptive rules considered in terms of a less variable allocation proportion and a smaller number of treatment failures. The use of this rule with the triangular test is beneficial compared with the triangular test with equal allocation, since it yields fewer treatment failures on average while providing comparable power with similar expected sample size. The results of an AIDS trial are used to illustrate the performance of the triangular test when combined with the drop-the-loser rule.
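A simplified simulation of the drop-the-loser urn, as we understand it (an immigration ball adds one ball of each treatment; a treatment ball stays in the urn after a success and is dropped after a failure). Details of the published rule may differ, so treat this purely as an illustration of why the allocation proportion is comparatively stable.

```python
import random

def drop_the_loser(n_subjects, p_success=(0.6, 0.4), seed=1):
    """Toy drop-the-loser urn for two treatments (0 and 1).
    The urn starts with one immigration ball 'I' and one ball per treatment."""
    rng = random.Random(seed)
    urn = ["I", 0, 1]
    counts = [0, 0]
    assigned = 0
    while assigned < n_subjects:
        ball = rng.choice(urn)                 # the drawn ball is not removed here
        if ball == "I":
            urn.extend([0, 1])                 # immigration: add one ball of each arm
            continue
        counts[ball] += 1
        assigned += 1
        if rng.random() >= p_success[ball]:    # failure: drop one ball of this arm
            urn.remove(ball)                   # (on success the ball simply remains)
    return counts

print(drop_the_loser(200))   # more subjects tend to end up on the better arm
```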
- Published
- 2005
- Full Text
- View/download PDF
46. Interval estimation of the proportion ratio under multiple matching
- Author
-
Kung-Jong Lui
- Subjects
Risk ,Statistics and Probability ,Likelihood Functions ,Biometry ,Models, Statistical ,Epidemiology ,Interval estimation ,Coverage probability ,Ratio estimator ,Estimator ,Ascorbic Acid ,Confidence interval ,Efficient estimator ,Neoplasms ,Consistent estimator ,Statistics ,Confidence Intervals ,Odds Ratio ,Humans ,Controlled Clinical Trials as Topic ,Tolerance interval ,Monte Carlo Method ,Probability ,Proportional Hazards Models ,Mathematics - Abstract
The discussions on interval estimation of the proportion ratio (PR) of responses or the relative risk (RR) of a disease for multiple matching have been generally focused on the odds ratio (OR) based on the assumption that the latter can approximate the former well. When the underlying proportion of outcomes is not rare, however, the results for the OR would be inadequate for use if the PR or RR was the parameter of our interest. In this paper, we develop five asymptotic interval estimators of the common PR (or RR) for multiple matching. To evaluate and compare the finite sample performance of these estimators, we apply Monte Carlo simulation to calculate the coverage probability and the average length of the resulting confidence intervals in a variety of situations. We note that when we have a constant number of matching, the interval estimator using the logarithmic transformation of the Mantel-Haenszel estimator, the interval estimator derived from the quadratic inequality given in this paper, and the interval estimator using the logarithmic transformation of the ratio estimator can consistently perform well. When the number of matching varies between matched sets, we find that the interval estimator using the logarithmic transformation of the ratio estimator is probably the best among the five interval estimators considered here in the case of a small number (=20) of matched sets. To illustrate the use of these interval estimators, we employ the data studying the supplemental ascorbate in the supportive treatment of terminal cancer patients.
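The log-transformation-based intervals discussed above share the generic construction below, differing only in the point estimator (Mantel-Haenszel or ratio estimator) and its variance estimate, which are not reproduced here:

```latex
% Generic (1 - \alpha) confidence interval for a ratio parameter R on the log scale:
\exp\!\Big( \log\hat{R} \;\pm\; z_{1-\alpha/2}\,\widehat{\mathrm{SE}}\big(\log\hat{R}\big) \Big).
```

Working on the log scale keeps the limits positive and generally gives a better normal approximation for ratio-type estimators.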
- Published
- 2005
- Full Text
- View/download PDF
47. Additive and multiplicative covariate regression models for relative survival incorporating fractional polynomials for time-dependent effects
- Author
-
Johannes L. Botha, David R. Jones, Paul C. Lambert, and Lucy K Smith
- Subjects
Statistics and Probability ,Generalized linear model ,education.field_of_study ,Models, Statistical ,Time Factors ,Wales ,Relative survival ,Epidemiology ,Mortality rate ,Population ,Breast Neoplasms ,Regression analysis ,Absolute difference ,Survival Analysis ,England ,Covariate ,Statistics ,Econometrics ,Humans ,Regression Analysis ,Female ,education ,Survival analysis ,Mathematics - Abstract
Relative survival is used to estimate patient survival excluding causes of death not related to the disease of interest. Rather than using cause of death information from death certificates, which is often poorly recorded, relative survival compares the observed survival to that expected in a matched group from the general population. Models for relative survival can be expressed on the hazard (mortality) rate scale as the sum of two components where the total mortality rate is the sum of the underlying baseline mortality rate and the excess mortality rate due to the disease of interest. Previous models for relative survival have assumed that covariate effects act multiplicatively and have thus provided relative effects of differences between groups using excess mortality rate ratios. In this paper we consider (i) the use of an additive covariate model, which provides estimates of the absolute difference in the excess mortality rate; and (ii) the use of fractional polynomials in relative survival models for the baseline excess mortality rate and time-dependent effects. The approaches are illustrated using data on 115 331 female breast cancer patients diagnosed between 1 January 1986 and 31 December 1990. The use of additive covariate relative survival models can be useful in situations when the excess mortality rate is zero or slightly less than zero and can provide useful information from a public health perspective. The use of fractional polynomials has advantages over the usual piecewise estimation by providing smooth estimates of the baseline excess mortality rate and time-dependent effects for both the multiplicative and additive covariate models. All models presented in this paper can be estimated within a generalized linear models framework and thus can be implemented using standard software.
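In generic notation (ours, not the paper's), the two families of relative survival models contrasted in this abstract can be written with λ*(t) the expected population mortality rate and the remaining terms the excess rate due to the disease:

```latex
% Multiplicative covariate effects on the excess mortality rate:
\lambda(t; x) = \lambda^{*}(t) + \lambda_{0}(t)\exp\!\big(x^{\top}\beta\big)
% Additive covariate effects on the excess mortality rate:
\lambda(t; x) = \lambda^{*}(t) + \lambda_{0}(t) + x^{\top}\gamma
```

Thus exp(β) is an excess mortality rate ratio, whereas γ is an absolute difference in excess rates; the paper further replaces piecewise-constant baseline and time-dependent terms with fractional polynomials.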
- Published
- 2005
- Full Text
- View/download PDF
48. Estimation of k for the poly-k test with application to animal carcinogenicity studies
- Author
-
J. Jack Lee, Hojin Moon, Ralph L. Kodell, and Hongshik Ahn
- Subjects
Statistics and Probability ,Biometry ,Carcinogenicity Tests ,Epidemiology ,Tumour incidence ,Mice ,Consistency (statistics) ,Robustness (computer science) ,Equating ,Statistics ,Animals ,Mathematics ,Mice, Inbred BALB C ,Dose-Response Relationship, Drug ,Neoplasms, Experimental ,2-Acetylaminofluorene ,Survival Analysis ,Rats, Inbred F344 ,Rats ,Test (assessment) ,Weighting ,Distribution (mathematics) ,Tumour development ,Data Interpretation, Statistical ,Carcinogens ,Female ,Food Deprivation - Abstract
This paper extends the survival-adjusted Cochran-Armitage test in order to achieve improved robustness to a variety of tumour onset distributions. The Cochran-Armitage test is routinely applied for detecting a linear trend in the incidence of a tumour of interest across dose groups. To improve the robustness to the effects of differential mortality across groups, Bailer and Portier introduced the poly-3 test by a survival adjustment using a fractional weighting scheme for subjects not at full risk of tumour development. The performance of the poly-3 test depends on how closely it represents the correct specification of the time-at-risk weight in the data. Bailer and Portier further suggested that this test can be improved by using a general k reflecting the shape of the tumour onset distribution. In this paper, we propose a method to estimate k by equating the empirical lifetime tumour incidence rate obtained from the data based on the fractional weighting scheme to a separately estimated cumulative lifetime tumour incidence rate. This poly-k test with the statistically estimated k appears to perform better than the poly-3 test which is conducted without prior knowledge of the tumour onset distribution. Our simulation shows that the proposed method improves the robustness to various tumour onset distributions in addition to the robustness to the effects of mortality achieved by the poly-3 test. Large sample properties are shown via simulations to illustrate the consistency of the proposed method. The proposed methods are applied to analyse two real data sets. One is to find a dose-related linear trend on animal carcinogenicity, and the other is to test an effect of calorie restriction on experimental animals.
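For reference, the fractional weighting scheme underlying the poly-k adjustment assigns each animal a weight reflecting its time at risk. In the usual formulation (our notation), an animal that dies tumour-free at time t_i before the terminal sacrifice time contributes a fractional weight:

```latex
w_i = \left( \frac{t_i}{t_{\max}} \right)^{k},
\qquad
w_i = 1 \ \text{if the animal develops the tumour or survives to } t_{\max},
\qquad
n^{*} = \sum_i w_i .
```

The trend test is then applied with the adjusted group sizes n*; the contribution of this paper is estimating k from the data rather than fixing k = 3.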
- Published
- 2003
- Full Text
- View/download PDF
49. Goodness-of-fit processes for logistic regression: simulation results
- Author
-
Nils Lid Hjort and David W. Hosmer
- Subjects
Statistics and Probability ,Time Factors ,Substance-Related Disorders ,Epidemiology ,Statistics as Topic ,HIV Infections ,Context (language use) ,Statistical model ,Sample (statistics) ,Logistic regression ,Logistic Models ,Goodness of fit ,Sample size determination ,Covariate ,Statistics ,Chi-square test ,Econometrics ,Humans ,Computer Simulation ,Residential Treatment ,Randomized Controlled Trials as Topic ,Mathematics - Abstract
In this paper we use simulations to compare the performance of new goodness-of-fit tests based on weighted statistical processes to three currently available tests: the Hosmer-Lemeshow decile-of-risk test, the Pearson chi-square test, and the unweighted sum-of-squares test. The simulations demonstrate that all tests have the correct size. The power for all tests to detect lack-of-fit due to an omitted quadratic term with a sample of size 100 is close to or exceeds 50 per cent for moderate departures from linearity and is over 90 per cent for these same alternatives for sample size 500. All tests have low power with sample size 100 to detect lack-of-fit due to an omitted interaction between a dichotomous and continuous covariate, while the power exceeds 80 per cent to detect extreme interaction with a sample size of 500. The power is low to detect any alternative link function with sample size 100 and for most alternative links for sample size 500. Only in the case of sample size 500 and an extremely asymmetric link function is the power over 80 per cent. The results from these simulations show that no single test, new or current, performs best in detecting lack-of-fit due to an omitted covariate or incorrect link function. However, one of the new weighted tests has power comparable to other tests in all settings simulated and has the highest power in the difficult case of an omitted interaction term. We illustrate the tests within the context of a model for factors associated with abstinence from drug use in a randomized trial of residential treatment programmes. We conclude the paper with a summary and specific recommendations for practice.
- Published
- 2002
- Full Text
- View/download PDF
50. An approximate unconditional test of non-inferiority between two proportions
- Author
-
James J. Chen and Seung Ho Kang
- Subjects
Statistics and Probability ,Nominal size ,Binomial distribution ,Exact test ,Epidemiology ,Sample size determination ,Statistics ,Nuisance parameter ,Mathematics ,Nominal level ,Type I and type II errors ,Statistical hypothesis testing - Abstract
This paper investigates an approximate unconditional test for non-inferiority between two independent binomial proportions. The P-value of the approximate unconditional test is evaluated using the maximum likelihood estimate of the nuisance parameter. In this paper, we clarify some differences in defining the rejection regions between the approximate unconditional and conventional conditional or unconditional exact test. We compare the approximate unconditional test with the asymptotic test and unconditional exact test by Chan (Statistics in Medicine, 17, 1403-1413, 1998) with respect to the type I error and power. In general, the type I errors and powers are in the decreasing order of the asymptotic, approximate unconditional and unconditional exact tests. In many cases, the type I errors are above the nominal level from the asymptotic test, and are below the nominal level from the unconditional exact test. In summary, when the non-inferiority test is formulated in terms of the difference between two proportions, the approximate unconditional test is the most desirable, because it is easier to implement and generally more powerful than the unconditional exact test and its size rarely exceeds the nominal size. However, when a test between two proportions is formulated in terms of the ratio of two proportions, such as a test of efficacy, more caution should be made in selecting a test procedure. The performance of the tests depends on the sample size and the range of plausible values of the nuisance parameter. Published in 2000 by John Wiley & Sons, Ltd.
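Only the asymptotic comparator discussed in the abstract is sketched here, using a Wald-type standard error; the approximate unconditional test instead evaluates the exact tail probability at the maximum likelihood estimate of the nuisance parameter, which is not shown.

```python
import math
from scipy.stats import norm

def noninferiority_z(x1, n1, x2, n2, margin):
    """Asymptotic test of H0: p1 - p2 <= -margin against H1: p1 - p2 > -margin
    (treatment 1 non-inferior to treatment 2), using a Wald-type standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = (p1 - p2 + margin) / se
    return z, 1 - norm.cdf(z)          # one-sided p-value

print(noninferiority_z(82, 100, 85, 100, margin=0.10))   # z ~ 1.33, p ~ 0.09
```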
- Published
- 2000
- Full Text
- View/download PDF