11 results for "Sebastien, Haneuse"
Search Results
2. Optimal allocation in stratified cluster‐based outcome‐dependent sampling designs
- Author
Bethany Hedt-Gauthier, Sebastien Haneuse, and Sara M. Sauer
- Subjects
Statistics and Probability; Mathematical optimization; Epidemiology; Computer science; Population; Sample (statistics); 01 natural sciences; Article; 010104 statistics & probability; 03 medical and health sciences; 0302 clinical medicine; Sampling design; Covariate; Cluster Analysis; Humans; Computer Simulation; 030212 general & internal medicine; 0101 mathematics; Generalized estimating equation; Data Collection; Sampling (statistics); Simple random sample; Outcome (probability); Research Design
- Abstract
In public health research, finite resources often require that decisions be made at the study design stage regarding which individuals to sample for detailed data collection. At the same time, when study units are naturally clustered, as patients are in clinics, it may be preferable to sample clusters rather than the study units, especially when the costs associated with travel between clusters are high. In this setting, aggregated data on the outcome and select covariates are sometimes routinely available through, for example, a country's Health Management Information System. If used wisely, this information can be used to guide decisions regarding which clusters to sample, and potentially obtain gains in efficiency over simple random sampling. In this article, we derive a series of formulas for optimal allocation of resources when a single-stage stratified cluster-based outcome-dependent sampling design is to be used and a marginal mean model is specified to answer the question of interest. Specifically, we consider two settings: (i) when a particular parameter in the mean model is of primary interest; and, (ii) when multiple parameters are of interest. We investigate the finite population performance of the optimal allocation framework through a comprehensive simulation study. Our results show that there are trade-offs that must be considered at the design stage: optimizing for one parameter yields efficiency gains over balanced and simple random sampling, while resulting in losses for the other parameters in the model. Optimizing for all parameters simultaneously yields smaller gains in efficiency, but mitigates the losses for the other parameters in the model.
- Published
- 2021
3. Model-assisted analyses of longitudinal, ordinal outcomes with absorbing states
- Author
Jonathan S. Schildcrout, Frank E. Harrell, Patrick J. Heagerty, Sebastien Haneuse, Chiara Di Gravio, Shawn P. Garbett, Paul J. Rathouz, and Bryan E. Shepherd
- Subjects
Statistics and Probability; Likelihood Functions; Biometry; Epidemiology; Critical Illness; Odds Ratio; Humans; Longitudinal Studies; Article
- Abstract
Studies of critically ill, hospitalized patients often follow participants and characterize daily health status using an ordinal outcome variable. Statistically, longitudinal proportional odds models are a natural choice in these settings since such models can parsimoniously summarize differences across patient groups and over time. However, when one or more of the outcome states is absorbing, the proportional odds assumption for the follow-up time parameter will likely be violated, and more flexible longitudinal models are needed. Motivated by the VIOLET Study, a parallel-arm, randomized clinical trial of vitamin D3 in critically ill patients, we discuss and contrast several treatment effect estimands based on time-dependent odds ratio parameters, and we detail contemporary modeling approaches. In VIOLET, the outcome is a four-level ordinal variable where the lowest ‘not alive’ state is absorbing and the highest ‘at-home’ state is nearly absorbing. We discuss flexible extensions of the proportional odds model for longitudinal data that can be used for either model-based inference, where the odds ratio estimator is taken directly from the model fit, or for model-assisted inferences, where heterogeneity across cumulative log odds dichotomizations is modeled and results are summarized to obtain an overall odds ratio estimator. We focus on direct estimation of cumulative probability model parameters using likelihood-based analysis procedures that naturally handle absorbing states. We illustrate the modeling procedures, the relative precision of model-based and model-assisted estimators, and the possible differences in the values for which the estimators are consistent through simulations and analysis of the VIOLET Study data.
- Published
- 2022
4. Two-wave two-phase outcome-dependent sampling designs, with applications to longitudinal binary data
- Author
Jacob M. Maronge, Paul J. Rathouz, Ran Tao, Jonathan S. Schildcrout, Nathaniel D. Mercaldo, Patrick J. Heagerty, and Sebastien Haneuse
- Subjects
Statistics and Probability; Optimal design; Models, Statistical; Epidemiology; Computer science; Sampling (statistics); Marginal model; 01 natural sciences; Outcome (probability); Article; Cohort Studies; 010104 statistics & probability; 03 medical and health sciences; 0302 clinical medicine; Sample size determination; Research Design; Sample Size; Statistics; Binary data; Covariate; Humans; 030212 general & internal medicine; Longitudinal Studies; 0101 mathematics; Event (probability theory)
- Abstract
Two-phase outcome-dependent sampling (ODS) designs are useful when resource constraints prohibit expensive exposure ascertainment on all study subjects. One class of ODS designs for longitudinal binary data stratifies subjects into three strata according to those who experience the event at none, some, or all follow-up times. For time-varying covariate effects, exclusively selecting subjects with response variation can yield highly efficient estimates. However, if interest lies in the association of a time-invariant covariate, or the joint associations of time-varying and time-invariant covariates with the outcome, then the optimal design is unknown. Therefore, we propose a class of two-wave two-phase ODS designs for longitudinal binary data. We split the second-phase sample selection into two waves, between which an interim design evaluation analysis is conducted. The interim design evaluation analysis uses first-wave data to conduct a simulation-based search for the optimal second-wave design that will improve the likelihood of study success. Although we focus on longitudinal binary response data, the proposed design is general and can be applied to other response distributions. We believe that the proposed designs can be useful in settings where (1) the expected second-phase sample size is fixed and one must tailor stratum-specific sampling probabilities to maximize estimation efficiency, or (2) relative sampling probabilities are fixed across sampling strata and one must tailor sample size to achieve a desired precision. We describe the class of designs, examine finite sampling operating characteristics, and apply the designs to an exemplar longitudinal cohort study, the Lung Health Study.
- Published
- 2020
5. On the analysis of two-phase designs in cluster-correlated data settings
- Author
Claudia Rivera-Rodriguez, Sebastien Haneuse, and Donna Spiegelman
- Subjects
Statistics and Probability; Malawi; National Health Programs; Epidemiology; Computer science; Population; Inference; HIV Infections; 01 natural sciences; Article; 010104 statistics & probability; 03 medical and health sciences; 0302 clinical medicine; Risk Factors; Covariate; Sampling design; Cluster Analysis; Humans; Computer Simulation; 030212 general & internal medicine; 0101 mathematics; Cluster analysis; Categorical variable; Clinical Trials as Topic; Models, Statistical; Inverse probability weighting; Weighting; Anti-Retroviral Agents; Research Design; Data mining
- Abstract
In public health research, information that is readily available may be insufficient to address the primary question(s) of interest. One cost-efficient way forward, especially in resource-limited settings, is to conduct a two-phase study in which the population is initially stratified, at phase I, by the outcome and/or some categorical risk factor(s). At phase II, detailed covariate data are ascertained on a subsample within each phase I stratum. While analysis methods for two-phase designs are well established, they have focused exclusively on settings in which participants are assumed to be independent. As such, when participants are naturally clustered (e.g., patients within clinics) these methods may yield invalid inference. To address this, we develop a novel analysis approach based on inverse-probability weighting that permits researchers to specify some working covariance structure and appropriately accounts for the sampling design and ensures valid inference via a robust sandwich estimator for which a closed-form expression is provided. To enhance statistical efficiency, we propose a calibrated inverse-probability weighting estimator that makes use of information available at phase I but not used in the design. In addition to describing the technique, practical guidance is provided for the cluster-correlated data settings that we consider. A comprehensive simulation study is conducted to evaluate small-sample operating characteristics, including the impact of using naive methods that ignore correlation due to clustering, as well as to investigate design considerations. Finally, the methods are illustrated using data from a one-time survey of the national antiretroviral treatment program in Malawi.
- Published
- 2018
6. Analyses of longitudinal, hospital clinical laboratory data with application to blood glucose concentrations
- Author
Jonathan S. Schildcrout, Josh F. Peterson, Joshua C. Denny, Sebastien Haneuse, Randolph A. Miller, Michael E. Matheny, and Lemuel R. Waitman
- Subjects
Statistics and Probability; Epidemiology; Clinical study design; Confounding; Intensive care unit; Health informatics; Insulin infusion; Causal inference; Intensive care; Statistics; Medicine; Model choice; Intensive care medicine
- Abstract
Electronic medical record (EMR) systems afford researchers with opportunities to investigate a broad range of scientific questions. In contrast to purposeful study designs, however, EMR data acquisition procedures typically do not align with any specific hypothesis. Subsequent investigations therefore require detailed characterization of clinical procedures and protocols that underlie EMR data, as well as careful consideration of model choice. For example, many intensive care units currently implement insulin infusion protocols to better control patients’ blood glucose levels. The protocols use prior glucose levels to determine, in part, how to adjust the infusion rate. Such feedback loops introduce time-dependent confounding into longitudinal analyses even though they may not always be evident to the analyst. In this paper, we review commonly used longitudinal model specifications and interpretations and show how these are particularly important in the presence of hospital-based clinical protocols. We show that parameter relationships among various models can be used to identify and characterize the impact of time-dependent confounding and therefore help explain seemingly incongruous conclusions. We also review important estimation challenges in the presence of time-dependent confounding and show how certain model specifications may be more or less susceptible to bias. To illustrate these points, we present a detailed analysis of the relationship between blood glucose levels and insulin doses on the basis of data from an intensive care unit.
- Published
- 2011
7. Sensitivity analyses for unmeasured confounding assuming a marginal structural model for repeated measures
- Author
Babette Brumback, James M. Robins, Sebastien Haneuse, and Miguel A. Hernán
- Subjects
Male; Statistics and Probability; Time Factors; Epidemiology; Marginal structural model; HIV Infections; Sensitivity and Specificity; Consistency (statistics); Statistics; Confidence Intervals; Econometrics; Humans; Homosexuality, Male; Monitoring, Physiologic; Mathematics; Clinical Trials as Topic; Models, Statistical; Confounding; Estimator; Repeated measures design; Confounding Factors, Epidemiologic; Causality; United States; Confidence interval; CD4 Lymphocyte Count; Inverse probability; Zidovudine
- Abstract
Robins introduced marginal structural models (MSMs) and inverse probability of treatment weighted (IPTW) estimators for the causal effect of a time-varying treatment on the mean of repeated measures. We investigate the sensitivity of IPTW estimators to unmeasured confounding. We examine a new framework for sensitivity analyses based on a nonidentifiable model that quantifies unmeasured confounding in terms of a sensitivity parameter and a user-specified function. We present augmented IPTW estimators of MSM parameters and prove their consistency for the causal effect of an MSM, assuming a correct confounding bias function for unmeasured confounding. We apply the methods to assess sensitivity of the analysis of Hernán et al., who used an MSM to estimate the causal effect of zidovudine therapy on repeated CD4 counts among HIV-infected men in the Multicenter AIDS Cohort Study. Under the assumption of no unmeasured confounders, a 95 per cent confidence interval for the treatment effect includes zero. We show that under the assumption of a moderate amount of unmeasured confounding, a 95 per cent confidence interval for the treatment effect no longer includes zero. Thus, the analysis of Hernán et al. is somewhat sensitive to unmeasured confounding. We hope that our research will encourage and facilitate analyses of sensitivity to unmeasured confounding in other applications.
- Published
- 2004
8. Bayes computation for ecological inference
- Author
Sebastien Haneuse, Adrian Dobra, Jon Wakefield, and Elizabeth Teeple
- Subjects
Statistics and Probability; Epidemiology; Computer science; Bayesian probability; Inference; Article; Bayes' theorem; Diabetes Mellitus; Humans; Computer Simulation; Models, Statistical; Markov chain; Sampling (statistics); Markov chain Monte Carlo; Bayes Theorem; Numerical Analysis, Computer-Assisted; Missing data; Simple random sample; Markov Chains; Data Interpretation, Statistical; Female; Data mining; Epidemiologic Methods; Monte Carlo Method
- Abstract
Ecological data are available at the level of the group, rather than at the level of the individual. The use of ecological data in spatial epidemiological investigations is particularly common. Although the computational methods described are more generally applicable, this paper concentrates on the situation in which the margins of 2 × 2 tables are observed in each of n geographical areas, with a Bayesian approach to inference. We consider auxiliary schemes that impute the missing data, and compare with a previously suggested normal approximation. The analysis of ecological data is subject to ecological bias, with the only reliable means of removing such bias being the addition of auxiliary individual-level information. Various schemes have been suggested for this supplementation, and we illustrate how the computational methods may be applied to the analysis of such enhanced data. The methods are illustrated using simulated data and two examples. In the first example, the ecological data are supplemented with a simple random sample of individual-level data, and in this example the normal approximation fails. In the second example case–control sampling provides the additional information. Copyright © 2011 John Wiley & Sons, Ltd.
- Published
- 2010
9. Analyses of longitudinal, hospital clinical laboratory data with application to blood glucose concentrations
- Author
Jonathan S. Schildcrout, Sebastien Haneuse, Josh F. Peterson, Joshua C. Denny, Michael E. Matheny, Lemuel R. Waitman, and Randolph A. Miller
- Subjects
Adult; Aged, 80 and over; Blood Glucose; Male; Models, Statistical; Middle Aged; Article; Intensive Care Units; Young Adult; Clinical Protocols; Humans; Insulin; Female; Longitudinal Studies; Aged
- Abstract
Electronic medical record (EMR) systems afford researchers with opportunities to investigate a broad range of scientific questions. In contrast to purposeful study designs, however, EMR data acquisition procedures typically do not align with any specific hypothesis. Subsequent investigations therefore require detailed characterization of clinical procedures and protocols that underlie EMR data, as well as careful consideration of model choice. For example, many intensive care units currently implement insulin infusion protocols to better control patients’ blood glucose levels. The protocols use prior glucose levels to determine, in part, how to adjust the infusion rate. Such feedback loops introduce time-dependent confounding into longitudinal analyses even though they may not always be evident to the analyst. In this paper, we review commonly used longitudinal model specifications and interpretations and show how these are particularly important in the presence of hospital-based clinical protocols. We show that parameter relationships among various models can be used to identify and characterize the impact of time-dependent confounding and therefore help explain seemingly incongruous conclusions. We also review important estimation challenges in the presence of time-dependent confounding and show how certain model specifications may be more or less susceptible to bias. To illustrate these points, we present a detailed analysis of the relationship between blood glucose levels and insulin doses on the basis of data from an intensive care unit.
- Published
- 2009
10. Geographic-based ecological correlation studies using supplemental case-control data
- Author
Jon Wakefield and Sebastien Haneuse
- Subjects
Statistics and Probability; Male; Lung Neoplasms; Operations research; Epidemiology; Inference; Poison control; Sample (statistics); Context (language use); Medicine; Humans; Ecological fallacy; Ohio; Likelihood Functions; Ecology; Ecological study; Bayes Theorem; Data science; Research Design; Case-Control Studies; Small-Area Analysis; Aggregate data; Female; Environmental Pollution; Ecological correlation
- Abstract
It is well known that the ecological study design suffers from a variety of biases that render the interpretation of its results difficult. Despite its limitations, however, the ecological study design is still widely used in a range of disciplines. The only solution to the ecological inference problem is to supplement the aggregate data with individual-level data and, to this end, Haneuse and Wakefield (Biometrics 2007; 63:128–136) recently proposed a hybrid study design in which an ecological study is supplemented with a sample of case–control data. The latter provides the basis for the control of bias, while the former may provide efficiency gains. Building on that work, we illustrate the use of the hybrid design in the context of a geographical correlation study of lung cancer mortality from the state of Ohio. Focusing on epidemiological applications, we initially provide an overview of the use of ecological studies in scientific research, highlighting the breadth of current application as well as advantages and drawbacks of the design. We consider the interplay between the two sources of information in the design: ecological and case–control, and then provide details on a Bayesian spatial random effects model in the setting of the hybrid design. Issues of specification are addressed, as well as sensitivity to modeling assumptions. Further, an interesting feature of these data is that they provide an example of how the proposed design may be used to resolve the ecological fallacy. Copyright © 2007 John Wiley & Sons, Ltd.
- Published
- 2007
11. The interpretation of exposure effect estimates in chronic air pollution studies
- Author
Jon Wakefield, Lianne Sheppard, and Sebastien Haneuse
- Subjects
Statistics and Probability; Chronic exposure; Future studies; Epidemiology; Computer science; Proportional hazards model; Air pollution exposure; Interpretation (philosophy); Research; Confounding; Air pollution; Environmental Exposure; Regression; United States; Air Pollution; Econometrics; Proportional Hazards Models
- Abstract
In this article we consider the interpretation of regression parameters used to represent 'chronic' or 'long-term' air pollution exposure effects. Although scientific interest typically lies in understanding such effects at the level of the individual, studies have generally employed a semi-ecological design; outcomes and confounder information are collected on individuals while exposure is only available at the aggregate- or group-level. A precise interpretation of results from a semi-ecological design must take into account the aggregated nature, both spatial and temporal, of the exposure measure. The most common analysis approach for assessing chronic exposure effects has been within the Cox proportional hazards model framework; specific analyses are tailored to accommodate the shortcomings of the available exposure information. We revisit the underlying assumptions of the Cox model and discuss the implications of two common aspects of chronic effects studies: time-dependent exposures and time-varying effects. Focusing on the consequences of temporal aggregation of exposure, we show that an estimate obtained from a time-aggregated semi-ecological design can correspond to very different underlying time-varying exposure and risk scenarios. Further, distinguishing which of these is correct is not possible from the semi-ecological data alone. Our goal is to highlight some statistical issues faced by existing studies of chronic air pollution effects, and aid in the development and planning of future studies.
- Published
- 2007
Discovery Service for Jio Institute Digital Library