235 results for '"Bayesian bootstrap"'
Search Results
2. Bayesian doubly robust estimation of causal effects for clustered observational data.
- Author
- Zhou, Qi, He, Haonan, Zhao, Jie, and Song, Joon Jin
- Subjects
- *MULTILEVEL models, *PROBABILITY theory, CARDIOVASCULAR disease related mortality
- Abstract
Observational data often exhibit a clustered structure, which leads to inaccurate estimates of the exposure effect if that structure is ignored. To overcome the challenges of modelling the complex confounder effects in clustered data, we propose a Bayesian doubly robust estimator of causal effects with random-intercept BART to enhance the robustness against model misspecification. The proposed approach incorporates the uncertainty in the estimation of the propensity score, the potential outcomes and the distribution of individual-level and cluster-level confounders into the exposure effect estimation, thereby improving the coverage probability of interval estimation. We evaluate the proposed method in a simulation study against frequentist doubly robust estimators with parametric and nonparametric multilevel modelling strategies. The proposed method is applied to estimate the effect of limited food access on cardiovascular disease mortality in the senior population. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
3. Empirical Inferences Under Bayesian Framework to Identify Cellwise Outliers.
- Author
- Sartore, Luca, Chen, Lu, and Bejleri, Valbona
- Subjects
- DETECTION algorithms, OUTLIER detection, FUZZY logic, CROP yields, BAYESIAN field theory
- Abstract
Outliers are typically identified using frequentist methods. The data are classified as "outliers" or "not outliers" based on a test statistic that measures the magnitude of the difference between a value and the majority of the data. The threshold for a data value to be an outlier is typically defined by the user. However, a subjective choice of the threshold increases the uncertainty associated with the outlier status of each data value. A cellwise outlier detection algorithm named FuzzyHRT is used to automate the editing process in repeated surveys. This algorithm uses the Bienaymé–Chebyshev inequality and fuzzy logic to detect four different types of outliers, arising from format inconsistencies and from historical, tail, and relational anomalies. However, fuzzy logic is not suited to the probabilistic reasoning behind the identification of anomalous cells. Bayesian methods are well suited to quantifying the uncertainty associated with the identification of outliers. Although, as the literature suggests, well-developed Bayesian methods exist for record-level outlier detection, Bayesian methods for identifying outliers within individual records (i.e., at the cell level) remain unexplored. This paper presents two approaches from the Bayesian perspective to study the uncertainty associated with identifying outliers. A Bayesian bootstrap approach is explored to study the uncertainty associated with the output scores of the FuzzyHRT algorithm. Empirical likelihoods in a Bayesian setting are also considered for probabilistic reasoning behind the identification of anomalous cells. NASS survey data for livestock and major crop yields (such as corn) are used to compare the performance of the two proposed approaches with recent cellwise outlier methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
4. Empirical Inferences Under Bayesian Framework to Identify Cellwise Outliers
- Author
- Luca Sartore, Lu Chen, and Valbona Bejleri
- Subjects
- anomaly identification, Bayesian bootstrap, empirical likelihood, fuzzy logic, predictive distribution, uncertainty, Statistics, HA1-4737
- Abstract
Outliers are typically identified using frequentist methods. The data are classified as "outliers" or "not outliers" based on a test statistic that measures the magnitude of the difference between a value and the majority of the data. The threshold for a data value to be an outlier is typically defined by the user. However, a subjective choice of the threshold increases the uncertainty associated with the outlier status of each data value. A cellwise outlier detection algorithm named FuzzyHRT is used to automate the editing process in repeated surveys. This algorithm uses the Bienaymé–Chebyshev inequality and fuzzy logic to detect four different types of outliers, arising from format inconsistencies and from historical, tail, and relational anomalies. However, fuzzy logic is not suited to the probabilistic reasoning behind the identification of anomalous cells. Bayesian methods are well suited to quantifying the uncertainty associated with the identification of outliers. Although, as the literature suggests, well-developed Bayesian methods exist for record-level outlier detection, Bayesian methods for identifying outliers within individual records (i.e., at the cell level) remain unexplored. This paper presents two approaches from the Bayesian perspective to study the uncertainty associated with identifying outliers. A Bayesian bootstrap approach is explored to study the uncertainty associated with the output scores of the FuzzyHRT algorithm. Empirical likelihoods in a Bayesian setting are also considered for probabilistic reasoning behind the identification of anomalous cells. NASS survey data for livestock and major crop yields (such as corn) are used to compare the performance of the two proposed approaches with recent cellwise outlier methods.
- Published
- 2024
- Full Text
- View/download PDF
5. A Bayesian bootstrap-Copula coupled method for slope reliability analysis considering bivariate distribution of shear strength parameters.
- Author
- Yao, Wenmin, Fan, Yibo, Li, Changdong, Zhan, Hongbin, Zhang, Xin, Lv, Yiming, and Du, Zibo
- Subjects
- *MONTE Carlo method, *SLOPES (Soil mechanics), *SLOPE stability, *SHEAR strength, *SAFETY factor in engineering
- Abstract
Estimation of the uncertainties of geotechnical parameters is a fundamental task in slope reliability analysis, and it becomes more challenging when only limited data on geotechnical parameters are available. In this study, a Bayesian bootstrap-Copula coupled method is proposed for slope reliability analysis based on limited geotechnical data. Specifically, the bivariate distribution of the shear strength parameters (cohesion (c) and friction angle (φ)) can be evaluated using the Bayesian bootstrap method and Copula theory, and the slope reliability can then be calculated using Monte Carlo simulation (MCS). Application to a homogeneous, undrained cohesive slope demonstrates the accuracy and effectiveness of the proposed approach. The results indicate that even when data are limited, the bivariate distribution of c and φ can be determined, and the estimated statistics match the predefined ones well. Satisfactory estimates of the mean (μ_FS) and standard deviation (σ_FS) of the factor of safety (FS) and of the failure probability (Pf) can be obtained. The representative sliding surface with the greatest likelihood is found to approximate the critical one based on the mean values of c and φ. Moreover, when the available sample size of geotechnical data exceeds a threshold, stable estimates of the bivariate distribution of c and φ can be obtained; the estimated μ_FS is stable, while σ_FS and Pf vary synchronously within a small range. Applications of the proposed Bayesian bootstrap-Copula coupled method indicate that it can be used to estimate the bivariate distribution of correlated parameters and the reliability of various geotechnical problems based on limited data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
6. Familial inference: tests for hypotheses on a family of centres.
- Author
- Thompson, Ryan, Forbes, Catherine S, MacEachern, Steven N, and Peruggia, Mario
- Subjects
- *HYPOTHESIS, *FAMILIES, *PSYCHOLOGY, *POSSIBILITY, *CRISES
- Abstract
Statistical hypotheses are translations of scientific hypotheses into statements about one or more distributions, often concerning their centre. Tests that assess statistical hypotheses of centre implicitly assume a specific centre, e.g. the mean or median. Yet, scientific hypotheses do not always specify a particular centre. This ambiguity leaves the possibility for a gap between scientific theory and statistical practice that can lead to rejection of a true null. In the face of replicability crises in many scientific disciplines, significant results of this kind are concerning. Rather than testing a single centre, this paper proposes testing a family of plausible centres, such as that induced by the Huber loss function. Each centre in the family generates a testing problem, and the resulting family of hypotheses constitutes a familial hypothesis. A Bayesian nonparametric procedure is devised to test familial hypotheses, enabled by a novel pathwise optimization routine to fit the Huber family. The favourable properties of the new test are demonstrated theoretically and experimentally. Two examples from psychology serve as real-world case studies. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. A Propensity-Score Integrated Approach to Bayesian Dynamic Power Prior Borrowing.
- Author
- Jixian Wang, Hongtao Zhang, and Tiwari, Ram
- Subjects
- *ACUTE myeloid leukemia, *INTERNAL auditing
- Abstract
Use of historical control data to augment a small internal control arm in a randomized controlled trial (RCT) can lead to significant improvement of the efficiency of the trial. However, it introduces the risk of potential bias, since the historical control population is often rather different from the RCT. Power prior approaches have been introduced to discount the historical data to mitigate the impact of the population difference. However, even with a Bayesian dynamic borrowing which can discount the historical data based on the outcome similarity of the two populations, a considerable population difference may still lead to a moderate bias. Hence, a robust adjustment for the population difference, using approaches such as inverse probability weighting or matching, can make the borrowing more efficient and robust. In this article, we propose a novel approach integrating the propensity score for the covariate adjustment and Bayesian dynamic borrowing using the power prior. The proposed approach uses the Bayesian bootstrap in combination with the empirical Bayes (EB) method using quasi-likelihood for determining the power prior. The performance of our approach is examined in a simulation study. We apply the approach to two Acute Myeloid Leukemia (AML) studies for illustration. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. Bayesian Lesion Estimation with a Structured Spike-and-Slab Prior.
- Author
- Menacher, Anna, Nichols, Thomas E., Holmes, Chris, and Ganjgahi, Habib
- Subjects
- *BAYESIAN analysis, *HIERARCHICAL Bayes model, *BRAIN damage, *PREDICATE calculus, *WHITE matter (Nerve tissue), *SAMPLE size (Statistics), *AGE factors in disease
- Abstract
Neural demyelination and brain damage accumulated in white matter appear as hyperintense areas on T2-weighted MRI scans in the form of lesions. Modeling binary images at the population level, where each voxel represents the existence of a lesion, plays an important role in understanding aging and inflammatory diseases. We propose a scalable hierarchical Bayesian spatial model, called BLESS, capable of handling binary responses by placing continuous spike-and-slab mixture priors on spatially varying parameters and enforcing spatial dependency on the parameter dictating the amount of sparsity within the probability of inclusion. The use of mean-field variational inference with dynamic posterior exploration, an annealing-like strategy that improves optimization, allows our method to scale to large sample sizes. Our method also accounts for underestimation of posterior variance due to variational inference by providing an approximate posterior sampling approach based on Bayesian bootstrap ideas and spike-and-slab priors with random shrinkage targets. Besides accurate uncertainty quantification, this approach is capable of producing novel cluster-size-based imaging statistics, such as credible intervals of cluster size, and measures of reliability of cluster occurrence. Lastly, we validate our results via simulation studies and an application to the UK Biobank, a large-scale lesion mapping study with a sample size of 40,000 subjects. Supplementary materials for this article are available online. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. Embedded multilevel regression and poststratification: Model‐based inference with incomplete auxiliary information.
- Author
- Li, Katherine and Si, Yajuan
- Subjects
- *MARGINAL distributions, *MULTILEVEL models, *DATA distribution, *HEALTH equity, *CELL populations
- Abstract
Health disparity research often evaluates health outcomes across demographic subgroups. Multilevel regression and poststratification (MRP) is a popular approach for small subgroup estimation as it can stabilize estimates by fitting multilevel models and adjust for selection bias by poststratifying on auxiliary variables, which are population characteristics predictive of the analytic outcome. However, the granularity and quality of the estimates produced by MRP are limited by the availability of the auxiliary variables' joint distribution; data analysts often only have access to the marginal distributions. To overcome this limitation, we embed the estimation of population cell counts needed for poststratification into the MRP workflow: embedded MRP (EMRP). Under EMRP, we generate synthetic populations of the auxiliary variables before implementing MRP. All sources of estimation uncertainty are propagated with a fully Bayesian framework. Through simulation studies, we compare different methods of generating the synthetic populations and demonstrate EMRP's improvements over alternatives on the bias‐variance tradeoff to yield valid subpopulation inferences of interest. We apply EMRP to the Longitudinal Survey of Wellbeing and estimate food insecurity prevalence among vulnerable groups in New York City. We find that all EMRP estimators can correct for the bias in classical MRP while maintaining lower standard errors and narrower confidence intervals than directly imputing with the weighted finite population Bayesian bootstrap (WFPBB) and design‐based estimates. Performances from the EMRP estimators do not differ substantially from each other, though we would generally recommend using the WFPBB‐MRP for its consistently high coverage rates. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. Statistical Challenges for Causal Inference Using Time-to-Event Real-World Data
- Author
- Wang, Jixian, Zhang, Hongtao, Tiwari, Ram, He, Weili, editor, Fang, Yixin, editor, and Wang, Hongwei, editor
- Published
- 2023
- Full Text
- View/download PDF
11. Bayesian Bootstrap Spike-and-Slab LASSO.
- Author
- Nie, Lizhen and Ročková, Veronika
- Subjects
- *RANDOM variables, *SCALABILITY
- Abstract
The impracticality of posterior sampling has prevented the widespread adoption of spike-and-slab priors in high-dimensional applications. To alleviate the computational burden, optimization strategies have been proposed that quickly find local posterior modes. Trading off uncertainty quantification for computational speed, these strategies have enabled spike-and-slab deployments at scales that would previously have been infeasible. We build on one recent development in this strand of work: the Spike-and-Slab LASSO procedure. Instead of optimization, however, we explore multiple avenues for posterior sampling, some traditional and some new. Intrigued by the speed of Spike-and-Slab LASSO mode detection, we explore the possibility of sampling from an approximate posterior by performing MAP optimization on many independently perturbed datasets. To this end, we explore Bayesian bootstrap ideas and introduce a new class of jittered Spike-and-Slab LASSO priors with random shrinkage targets. These priors are a key constituent of the Bayesian Bootstrap Spike-and-Slab LASSO (BB-SSL) method proposed here. BB-SSL turns fast optimization into approximate posterior sampling. Beyond its scalability, we show that BB-SSL has strong theoretical support. Indeed, we find that the induced pseudo-posteriors contract around the truth at a near-optimal rate in sparse normal-means and in high-dimensional regression. We compare our algorithm to the traditional Stochastic Search Variable Selection (under Laplace priors) as well as many state-of-the-art methods for shrinkage priors. We show, both in simulations and on real data, that our method fares very well in these comparisons, often providing substantial computational gains. Supplementary materials for this article are available online. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
12. The Weighted Bootstrap
- Author
- Dickhaus, Thorsten
- Published
- 2022
- Full Text
- View/download PDF
13. Accurate Confidence and Bayesian Interval Estimation for Non-centrality Parameters and Effect Size Indices.
- Author
- Kang, Kaidi, Jones, Megan T., Armstrong, Kristan, Avery, Suzanne, McHugo, Maureen, Heckers, Stephan, and Vandekar, Simon
- Subjects
- CONFIDENCE intervals, PARAMETER estimation, ANALYSIS of variance, CHI-square distribution
- Abstract
Reporting effect size index estimates with their confidence intervals (CIs) can be an excellent way to simultaneously communicate the strength and precision of the observed evidence. We recently proposed a robust effect size index (RESI) that is advantageous over common indices because it is widely applicable to different types of data. Here, we use statistical theory and simulations to develop and evaluate RESI estimators and confidence/credible intervals that rely on different covariance estimators. Our results show that (1) counter to intuition, the randomness of covariates reduces coverage for Chi-squared and F CIs; and (2) when the variance of the estimators is estimated, the non-central Chi-squared and F CIs using the parametric and robust RESI estimators fail to cover the true effect size at the nominal level. Using the robust estimator along with the proposed nonparametric bootstrap or Bayesian (credible) intervals provides valid inference for the RESI, even when model assumptions may be violated. This work forms a unified effect size reporting procedure, such that effect sizes with confidence/credible intervals can be easily reported in an analysis of variance (ANOVA) table format. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
14. Bayesian Bootstrapped Correlation Coefficients
- Author
- Rodriguez, Josue E. and Williams, Donald R.
- Subjects
- bayesian bootstrap, correlation, ordinal, credible interval, Psychology, BF1-990
- Abstract
We propose the Bayesian bootstrap (BB) as a generic, simple, and accessible method for sampling from the posterior distribution of various correlation coefficients that are commonly used in the social-behavioral sciences. In a series of examples, we demonstrate how the BB can be used to estimate Pearson's, Spearman's, Gaussian rank, Kendall's τ, and polychoric correlations. We also describe an approach based on a region of practical equivalence to evaluate differences and null associations among the estimated correlations. In addition, we have implemented the methodology in the R package BBcor (https://cran.r-project.org/web/packages/BBcor/index.html). Example code and key advantages of the proposed methods are illustrated in an applied example.
- Published
- 2022
- Full Text
- View/download PDF
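A minimal Python sketch of the generic recipe the record above rests on: draw Dirichlet(1, ..., 1) weights over the n observations (Rubin's Bayesian bootstrap) and recompute a weighted Pearson correlation under each weight vector, so the draws form posterior samples. This is an illustration with invented data and settings, not the authors' BBcor package (which is linked in the abstract):

    import numpy as np

    def weighted_pearson(x, y, w):
        # Pearson correlation with observation weights w (w sums to 1)
        mx, my = np.sum(w * x), np.sum(w * y)
        cov = np.sum(w * (x - mx) * (y - my))
        sx = np.sqrt(np.sum(w * (x - mx) ** 2))
        sy = np.sqrt(np.sum(w * (y - my) ** 2))
        return cov / (sx * sy)

    def bb_correlation(x, y, draws=4000, seed=0):
        # Bayesian bootstrap: one Dirichlet(1, ..., 1) weight vector per draw
        rng = np.random.default_rng(seed)
        W = rng.dirichlet(np.ones(len(x)), size=draws)
        return np.array([weighted_pearson(x, y, w) for w in W])

    rng = np.random.default_rng(1)
    x = rng.normal(size=200)
    y = 0.5 * x + rng.normal(size=200)
    post = bb_correlation(x, y)
    print(post.mean(), np.quantile(post, [0.025, 0.975]))  # estimate and 95% credible interval

The same weighting extends to Spearman's or Kendall's coefficients by swapping the weighted statistic being recomputed.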
15. A general Bayesian bootstrap for censored data based on the beta-Stacy process.
- Author
- Arfè, Andrea and Muliere, Pietro
- Subjects
- *SURVIVAL rate, *MARKOV chain Monte Carlo, *CENSORING (Statistics), *BAYESIAN field theory
- Abstract
We introduce a novel procedure to perform Bayesian non-parametric inference with right-censored data, the beta-Stacy bootstrap. This approximates the posterior law of summaries of the survival distribution (e.g. the mean survival time). More precisely, our procedure approximates the joint posterior law of functionals of the beta-Stacy process, a non-parametric process prior that generalizes the Dirichlet process and that is widely used in survival analysis. The beta-Stacy bootstrap generalizes and unifies other common Bayesian bootstraps for complete or censored data based on non-parametric priors. It is defined by an exact sampling algorithm that does not require tuning of Markov Chain Monte Carlo steps. We illustrate the beta-Stacy bootstrap by analyzing survival data from a real clinical trial.
• The beta-Stacy bootstrap performs Bayesian non-parametric inference with censored data.
• It generalizes and unifies other common Bayesian bootstrap algorithms.
• Code to implement the beta-Stacy bootstrap is available at https://github.com/andreaarfe/. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
16. A stochastic Bayesian bootstrapping model for COVID-19 data.
- Author
- Calatayud, Julia, Jornet, Marc, and Mateu, Jorge
- Subjects
- *COVID-19 pandemic, *RANDOM variables, *DISTRIBUTION (Probability theory), *STOCHASTIC models, *DATA modeling, *STATISTICAL bootstrapping
- Abstract
We provide a stochastic modeling framework for the incidence of COVID-19 in Castilla-Leon (Spain) for the period March 1, 2020 to February 12, 2021, which encompasses four waves. Each wave is appropriately described by a generalized logistic growth curve. Accordingly, the four waves are modeled through a sum of four generalized logistic growth curves. Pointwise values of the twenty input parameters are fitted by a least-squares optimization procedure. Taking into account the significant variability in the daily reported cases, the input parameters and the errors are regarded as random variables on an abstract probability space. Their probability distributions are inferred from a Bayesian bootstrap procedure. This framework is shown to offer a more accurate estimation of the COVID-19 reported cases than the deterministic formulation. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
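The deterministic layer described in the abstract above can be sketched in Python: a cumulative-incidence series modeled as a sum of four generalized logistic growth curves and fitted by least squares. The four-parameter Richards-type form, the bounds, and the synthetic data are assumptions (the paper fits twenty parameters in total, so its per-wave form differs); the Bayesian bootstrap layer would refit this under Dirichlet-perturbed weights to obtain distributions for the parameters and errors:

    import numpy as np
    from scipy.optimize import curve_fit

    def gen_logistic(t, K, r, t0, nu):
        # one generalized logistic (Richards-type) growth curve
        z = np.clip(-r * (t - t0), -500, 500)  # guard against overflow in exp
        return K / (1.0 + np.exp(z)) ** (1.0 / nu)

    def four_waves(t, *p):
        # cumulative reported cases as a sum of four growth curves
        return sum(gen_logistic(t, *p[4 * i:4 * i + 4]) for i in range(4))

    t = np.arange(350.0)
    true = [3e4, 0.12, 40, 1.0, 5e4, 0.10, 130, 1.0,
            8e4, 0.09, 220, 1.0, 6e4, 0.11, 300, 1.0]
    y = four_waves(t, *true) + np.random.default_rng(0).normal(0, 500, t.size)

    p0 = [2e4, 0.1, 50, 1.0, 4e4, 0.1, 120, 1.0,
          7e4, 0.1, 210, 1.0, 5e4, 0.1, 290, 1.0]  # rough initial guesses
    lo, hi = [0, 0, 0, 0.1] * 4, [1e6, 1.0, 350.0, 10.0] * 4
    popt, _ = curve_fit(four_waves, t, y, p0=p0, bounds=(lo, hi), maxfev=50000)
    # a Dirichlet-weighted refit would pass sigma=1/np.sqrt(n * w) per bootstrap draw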
17. An Ensemble EM Algorithm for Bayesian Variable Selection.
- Author
- Jin Wang, Yunbo Ouyang, Yuan Ji, and Feng Liang
- Subjects
- BAYESIAN analysis, REGRESSION analysis, STATISTICAL bootstrapping, ASYMPTOTIC distribution, ITERATIVE methods (Mathematics)
- Abstract
We study the Bayesian approach to variable selection for linear regression models. Motivated by recent work of Ročková and George (2014), we propose an EM algorithm that returns the MAP estimator of the set of relevant variables. Due to its particular updating scheme, our algorithm can be implemented efficiently without inverting a large matrix in each iteration and can therefore scale up with big data. We also show that the MAP estimator returned by our EM algorithm achieves variable selection consistency even when p diverges with n. In practice, our algorithm can get stuck in local modes, a common problem with EM algorithms. To address this issue, we propose an ensemble EM algorithm, in which we repeatedly apply the EM algorithm to a subset of the samples with a subset of the covariates, and then aggregate the variable selection results across those bootstrap replicates. Empirical studies demonstrate the superior performance of the ensemble EM algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
18. Bayesian bootstrap quantile regression for probabilistic photovoltaic power forecasting
- Author
- Mokhtar Bozorg, Antonio Bracale, Pierluigi Caramia, Guido Carpinelli, Mauro Carpita, and Pasquale De Falco
- Subjects
- Bayesian bootstrap, Photovoltaic systems, Probabilistic forecasting, Renewable generation, smart grids, Distribution or transmission of electric power, TK3001-3521, Production of electric energy or power. Powerplants. Central stations, TK1001-1841
- Abstract
Photovoltaic (PV) systems are widely spread across MV and LV distribution systems and the penetration of PV generation is solidly growing. Because of the uncertain nature of the solar energy resource, PV power forecasting models are crucial in any energy management system for smart distribution networks. Although point forecasts can suit many scopes, probabilistic forecasts add further flexibility to an energy management system and are recommended to enable a wider range of decision making and optimization strategies. This paper proposes a methodology for probabilistic PV power forecasting based on a Bayesian bootstrap quantile regression model, in which a Bayesian bootstrap is applied to estimate the parameters of a quantile regression model. A novel procedure is presented to optimize the extraction of the predictive quantiles from the bootstrapped estimation of the related coefficients, raising the predictive ability of the final forecasts. Numerical experiments based on actual data quantify a performance enhancement of up to 2.2% when compared to relevant benchmarks.
- Published
- 2020
- Full Text
- View/download PDF
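The mechanism named in the title above can be sketched as follows: each Dirichlet weight draw defines a weighted pinball (quantile) loss, and its minimizer is one Bayesian bootstrap replicate of the quantile-regression coefficients. The design matrix, quantile level, and optimizer below are illustrative assumptions; the paper's quantile-extraction optimization is not reproduced:

    import numpy as np
    from scipy.optimize import minimize

    def pinball(beta, X, y, q, w):
        # weighted pinball loss at quantile level q
        r = y - X @ beta
        return np.sum(w * np.maximum(q * r, (q - 1) * r))

    def bb_quantreg(X, y, q=0.9, draws=200, seed=0):
        # Bayesian bootstrap samples of quantile-regression coefficients
        rng = np.random.default_rng(seed)
        beta0 = np.linalg.lstsq(X, y, rcond=None)[0]  # least-squares start
        W = rng.dirichlet(np.ones(len(y)), size=draws)
        return np.array([minimize(pinball, beta0, args=(X, y, q, w),
                                  method="Nelder-Mead").x for w in W])

    rng = np.random.default_rng(1)
    X = np.column_stack([np.ones(300), rng.uniform(0, 1, 300)])  # e.g. an irradiance input
    y = 2.0 + 3.0 * X[:, 1] + rng.normal(0, 0.5, 300)
    betas = bb_quantreg(X, y)
    print(betas.mean(axis=0))  # bootstrap distribution of the 0.9-quantile coefficients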
19. Reflections on statistical modelling: A conversation with Murray Aitkin.
- Author
- Aitkin, Murray, Hinde, John, and Francis, Brian
- Subjects
- *STATISTICAL models, *CONVERSATION
- Abstract
A virtual interview with Murray Aitkin by Brian Francis and John Hinde, two of the original members of the Centre for Applied Statistics that Murray created at Lancaster University. The talk ranges over Murray's reflections on a career in statistical modelling and the many different collaborations across the world that have been such a significant part of it. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
20. BAYESIAN INFERENCE ON MULTIVARIATE MEDIANS AND QUANTILES.
- Author
- Bhattacharya, Indrabati and Ghosal, Subhashis
- Abstract
We consider Bayesian inferences on a type of multivariate median and the multivariate quantile functionals of a joint distribution using a Dirichlet process prior. Unlike univariate quantiles, the exact posterior distributions of the multivariate median and multivariate quantiles are not obtainable explicitly; thus we study these distributions asymptotically. We derive a Bernstein-von Mises theorem for the multivariate ℓ1-median with respect to a general ℓp-norm, showing that its posterior concentrates around its true value at the n^(-1/2) rate, and that its credible sets have asymptotically correct frequentist coverage. In particular, the asymptotic normality result for the empirical multivariate median with a general ℓp-norm is also derived in the course of the proof, which extends results from the case p = 2 in the literature to general p. The technique involves approximating the posterior Dirichlet process by a Bayesian bootstrap process and deriving a conditional Donsker theorem. We also obtain analogous results for an affine equivariant version of the multivariate ℓ1-median based on an adaptive transformation and re-transformation technique. The results are extended to a joint distribution of multivariate quantiles. The accuracy of the asymptotic results is confirmed by a simulation study. We also use the results to obtain Bayesian credible regions for the multivariate medians for Fisher's iris data, which consist of four features measured for each of three plant species. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
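The Bayesian bootstrap approximation used in the paper above can be sketched for p = 2, where the ℓ1-median is the spatial (geometric) median and each Dirichlet draw calls for a weighted geometric median, computable by Weiszfeld iterations. A hedged Python illustration on synthetic four-feature data (the iris analysis itself is not reproduced):

    import numpy as np

    def weighted_geometric_median(X, w, iters=100, tol=1e-8):
        # Weiszfeld iterations for the weighted spatial median (p = 2)
        m = np.average(X, axis=0, weights=w)
        for _ in range(iters):
            d = np.maximum(np.linalg.norm(X - m, axis=1), 1e-12)  # avoid /0 at data points
            g = w / d
            m_new = (g[:, None] * X).sum(axis=0) / g.sum()
            if np.linalg.norm(m_new - m) < tol:
                break
            m = m_new
        return m

    def bb_multivariate_median(X, draws=1000, seed=0):
        # Bayesian bootstrap posterior draws of the multivariate l1-median
        rng = np.random.default_rng(seed)
        W = rng.dirichlet(np.ones(len(X)), size=draws)
        return np.array([weighted_geometric_median(X, w) for w in W])

    X = np.random.default_rng(2).normal(size=(150, 4))  # stand-in for 4-feature data
    post = bb_multivariate_median(X)
    print(post.mean(axis=0))  # credible regions come from the cloud of draws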
21. Estimating the optimal timing of surgery by imputing potential outcomes.
- Author
- Chen, Xiaofei, Heitjan, Daniel F., Greil, Gerald, and Jeon‐Slaughter, Haekyung
- Subjects
- *HYPOPLASTIC left heart syndrome, *SURVIVAL rate, *CARDIAC surgery
- Abstract
Hypoplastic left heart syndrome is a congenital anomaly that is uniformly fatal in infancy without immediate treatment. The standard treatment consists of an initial Norwood procedure (stage 1) followed some months later by stage 2 palliation (S2P). The ideal timing of the S2P is uncertain. The Single Ventricle Reconstruction Trial (SVRT) randomized the procedure used in the initial Norwood operation, leaving the timing of the S2P to the discretion of the surgical team. To estimate the causal effect of the timing of S2P, we propose to impute the potential post‐S2P survival outcomes using statistical models under the Rubin Causal Model framework. With this approach, it is straightforward to estimate the causal effect of S2P timing on post‐S2P survival by directly comparing the imputed potential outcomes. Specifically, we consider a lognormal model and a restricted cubic spline model, evaluating their performance in Monte Carlo studies. When applied to the SVRT data, the models give somewhat different imputed values, but both support the conclusion that the optimal time for the S2P is at 6 months after the Norwood procedure. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
22. Bayesian bootstrapping in real-time probabilistic photovoltaic power forecasting.
- Author
- Bozorg, Mokhtar, Bracale, Antonio, Carpita, Mauro, De Falco, Pasquale, Mottola, Fabio, and Proto, Daniela
- Subjects
- *QUANTILE regression, *ENERGY management, *PHOTOVOLTAIC power systems, *FORECASTING, *REGRESSION trees, *DECISION making
- Abstract
• The paper explores the Bayesian bootstrap in probabilistic photovoltaic power forecasting.
• The Bayesian bootstrap is applied to three probabilistic forecasting models.
• The optimal quantile is extracted from the sample bootstrap distribution.
• The Bayesian bootstrap is compared with the traditional bootstrap and with other benchmarks.
• Numerical experiments validate the Bayesian bootstrap performance on real data.
Modern distribution systems are characterized by increasing penetration of photovoltaic generation systems. Due to the uncertain nature of the solar primary source, photovoltaic power forecasting models must be developed in any energy management system for smart distribution networks. Although point forecasts can suit many scopes, probabilistic forecasts add further flexibility to any energy management system, and they are recommended to enable a wider range of decision making and optimization strategies. Real-time probabilistic photovoltaic power forecasting is performed in this paper using an approach based on the Bayesian bootstrap. In particular, the Bayesian bootstrap is applied to three probabilistic forecasting models (i.e., linear quantile regression, gradient boosting regression tree and quantile regression neural network) to provide sample bootstrap distributions of the predictive quantiles of photovoltaic power. The heterogeneous nature of the selected models allows evaluating the performance of the Bayesian bootstrap within different forecasting frameworks. Several benchmarks, error indices and scores are used to assess the performance of the Bayesian bootstrap in probabilistic photovoltaic power forecasting. Tests carried out on two actual photovoltaic power datasets demonstrate the effectiveness of the proposed approach. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
23. A resampling approach to estimation of the linking variance in the Fay–Herriot model
- Author
- Snigdhansu Chatterjee
- Subjects
- linking variance, Prasad–Rao estimator, paired bootstrap, m-out-of-n bootstrap, Bayesian bootstrap, Probabilities. Mathematical statistics, QA273-280
- Abstract
In the Fay–Herriot model, we consider estimators of the linking variance obtained using different types of resampling schemes. The usefulness of this approach is that even when the estimator from the original data falls below zero or any other specified threshold, several of the resamples can potentially yield values above the threshold. We establish asymptotic consistency of the resampling-based estimator of the linking variance for a wide variety of resampling schemes and show the efficacy of using the proposed approach in numeric examples.
- Published
- 2019
- Full Text
- View/download PDF
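A sketch of the resampling idea in the abstract above, assuming a Prasad–Rao-type moment estimator for the linking variance and the paired bootstrap over areas (the m-out-of-n and Bayesian bootstrap variants studied in the paper are not shown; the model and all settings are invented). Even when the original estimate falls at or below zero, many resample estimates can land above the threshold:

    import numpy as np

    def pr_linking_variance(y, X, D):
        # moment estimator of A in the Fay-Herriot model
        # y_i = x_i'beta + u_i + e_i, Var(u_i) = A, Var(e_i) = D_i known
        m, p = X.shape
        H = X @ np.linalg.solve(X.T @ X, X.T)  # hat matrix
        resid = y - H @ y
        return (np.sum(resid ** 2) - np.sum(D * (1 - np.diag(H)))) / (m - p)

    def paired_bootstrap_A(y, X, D, B=2000, seed=0):
        # resample (x_i, y_i, D_i) triples with replacement
        rng = np.random.default_rng(seed)
        idx = rng.integers(0, len(y), size=(B, len(y)))
        return np.array([pr_linking_variance(y[i], X[i], D[i]) for i in idx])

    rng = np.random.default_rng(3)
    m = 40
    X = np.column_stack([np.ones(m), rng.normal(size=m)])
    D = rng.uniform(0.5, 1.5, m)
    y = X @ np.array([1.0, 2.0]) + rng.normal(0, np.sqrt(0.3), m) + rng.normal(0, np.sqrt(D))
    A_boot = paired_bootstrap_A(y, X, D)
    print(pr_linking_variance(y, X, D), A_boot.mean(), (A_boot > 0).mean())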
24. A Bayesian proposal for the estimation of proportions via the jackknife in probability sampling.
- Author
- Nivia, Tania, Tellez, Cristian, and Pacheco, Mario
- Published
- 2021
- Full Text
- View/download PDF
25. A consistent Bayesian bootstrap for chi-squared goodness-of-fit test using a Dirichlet prior.
- Author
- Hosseini, Reyhaneh and Zarepour, Mahmoud
- Subjects
- *CONTINGENCY tables, *DATA distribution, *GOODNESS-of-fit tests, *CHI-squared test
- Abstract
In this paper, we employ the Dirichlet process in a hypothesis testing framework to propose a Bayesian nonparametric chi-squared goodness-of-fit test. Our suggested method corresponds to Lo's Bayesian bootstrap procedure for the chi-squared goodness-of-fit test and rectifies some shortcomings of the regular bootstrap, which only counts the number of observations falling in each bin of a contingency table. We consider the Dirichlet process as the prior for the distribution of the data and carry out the test based on the Kullback-Leibler distance between the updated Dirichlet process and the hypothesized distribution. Moreover, the results are generalized to the chi-squared test of independence for a contingency table. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
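A sketch of the posterior computation the abstract above points to, under assumptions: with binned data, the Dirichlet process posterior over bin probabilities reduces to a Dirichlet distribution (recovering a Bayesian bootstrap as the prior mass tends to zero), and the statistic is the Kullback-Leibler distance from the hypothesized bin probabilities. Turning the posterior distances into a calibrated test follows the paper and is not shown:

    import numpy as np

    def posterior_kl_samples(counts, p0, alpha=1.0, draws=5000, seed=0):
        # posterior of p is Dirichlet(alpha * p0 + counts); sample KL(p || p0)
        rng = np.random.default_rng(seed)
        post = rng.dirichlet(alpha * p0 + counts, size=draws)
        return np.sum(post * np.log(post / p0), axis=1)

    p0 = np.array([0.25, 0.25, 0.25, 0.25])  # hypothesized bin probabilities
    counts = np.array([30, 22, 28, 20])      # observed bin counts
    kl = posterior_kl_samples(counts, p0)
    print(np.quantile(kl, [0.5, 0.95]))      # posterior summary of the distance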
26. Bayesian bootstrap adaptive lasso estimators of regression models.
- Author
- Li, Bohan and Wu, Juan
- Subjects
- *REGRESSION analysis, *LOGISTIC regression analysis, *STATISTICAL bootstrapping
- Abstract
This paper proposes a modified adaptive lasso method based on the Bayesian bootstrap (BBAL) and approximates the posterior distributions of the parameters for a linear and a logistic regression model, respectively. The BBAL estimators are proved to have asymptotic and oracle properties, and they are obtained by the coordinate descent algorithm, which computes solutions over a grid of values of the penalty parameter λ. Three numerical experiments are conducted to demonstrate the BBAL method. Test results show consistent variable selection and more robust estimators. We use the median coefficients of the BBAL estimators for prediction on a medical dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
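A hedged sketch of the BBAL recipe as the abstract above describes it: for each Dirichlet weight draw, fit an adaptive lasso (implemented here by the standard rescaling trick around a pilot least-squares fit) with the draw as sample weights, then summarize the coefficients across draws. scikit-learn's coordinate-descent Lasso stands in for the paper's algorithm; λ, γ, and the data are invented:

    import numpy as np
    from sklearn.linear_model import Lasso, LinearRegression

    def bb_adaptive_lasso(X, y, lam=0.1, gamma=1.0, draws=500, seed=0):
        # Bayesian bootstrap draws of adaptive-lasso coefficients
        rng = np.random.default_rng(seed)
        n, p = X.shape
        beta_init = LinearRegression().fit(X, y).coef_  # pilot estimate
        scale = np.abs(beta_init) ** gamma + 1e-8       # adaptive penalty via rescaling
        out = np.empty((draws, p))
        for b, w in enumerate(rng.dirichlet(np.ones(n), size=draws)):
            fit = Lasso(alpha=lam).fit(X * scale, y, sample_weight=n * w)
            out[b] = fit.coef_ * scale                  # undo the rescaling
        return out

    rng = np.random.default_rng(4)
    X = rng.normal(size=(200, 8))
    y = X @ np.array([2.0, 0, 0, -1.5, 0, 0, 0, 0]) + rng.normal(size=200)
    betas = bb_adaptive_lasso(X, y)
    print(np.median(betas, axis=0))  # the abstract uses median coefficients for prediction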
27. Partial factorial trials: comparing methods for statistical analysis and economic evaluation
- Author
- Helen A. Dakin, Alastair M. Gray, Graeme S. MacLennan, Richard W. Morris, and David W. Murray
- Subjects
- Randomised controlled trial, Factorial design, Cost-utility analysis, Bayesian bootstrap, Partial factorial trial, Medicine (General), R5-920
- Abstract
Background: Partial factorial trials compare two or more pairs of treatments on overlapping patient groups, randomising some (but not all) patients to more than one comparison. The aims of this research were to compare different methods for conducting and analysing economic evaluations on partial factorial trials and assess the implications of considering factors simultaneously rather than drawing independent conclusions about each comparison.
Methods: We estimated total costs and quality-adjusted life years (QALYs) within 10 years of surgery for 2252 patients in the Knee Arthroplasty Trial who were randomised to one or more comparisons of different surgical types. We compared three analytical methods: an "at-the-margins" analysis including all patients randomised to each comparison (assuming no interaction); an "inside-the-table" analysis that included interactions but focused on those patients randomised to two comparisons; and a Bayesian vetted bootstrap, which used results from patients randomised to one comparison as priors when estimating outcomes for patients randomised to two comparisons. Outcomes comprised incremental costs, QALYs and net benefits.
Results: Qualitative interactions were observed for costs, QALYs and net benefits. Bayesian bootstrapping generally produced smaller standard errors than inside-the-table analysis and gave conclusions that were consistent with at-the-margins analysis, while allowing for these interactions. By contrast, inside-the-table analysis gave different conclusions about which intervention had the highest net benefits compared with the other analyses.
Conclusions: All analyses of partial factorial trials should explore interactions and assess whether results are sensitive to assumptions about interactions, either as a primary analysis or as a sensitivity analysis. For partial factorial trials closely mirroring routine clinical practice, at-the-margins analysis may provide a reasonable estimate of average costs and benefits for the whole trial population, even in the presence of interactions. However, such conclusions will be misleading if there are large interactions or if the proportion of patients allocated to different treatments differs markedly from what occurs in clinical practice. The Bayesian bootstrap provides an alternative to at-the-margins analysis for analysing clinical or economic endpoints from partial factorial trials, which allows for interactions while making use of the whole sample. The same techniques could be applied to analyses of clinical endpoints.
Trial registration: ISRCTN, ISRCTN45837371. Registered on 25 April 2003.
- Published
- 2018
- Full Text
- View/download PDF
28. Hierarchical Bayesian models for continuous and positively skewed data from small areas.
- Author
- Manandhar, Binod and Nandram, Balgobin
- Subjects
- *TAYLOR'S series, *SMALL area statistics, *DISTRIBUTION (Probability theory), *GAMMA distributions, *GAUSSIAN distribution, *GIBBS sampling
- Abstract
There are numerous types of continuous and positively skewed data. We develop a hierarchical Bayesian generalized gamma regression model for continuous and positively skewed data. For skewed data, the log-transformation is one of the most widely used transformations to meet the normality assumption; however, it can be problematic, so we use a generalized gamma distribution instead. Because the posterior distribution corresponding to this model is complex, we use a second-order Taylor series to obtain approximate multivariate normal distributions, which provide proposal densities for Metropolis samplers. We apply our models to consumption data from a national household survey and choose the best-fitting model among the generalized gamma distribution and two special cases, the gamma and exponential distributions. It is then linked to census data to provide small area estimates for poverty indicators. A simulation study shows that our methodology is reasonable. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
29. Applications of the Fractional-Random-Weight Bootstrap.
- Author
- Xu, Li, Gotwalt, Chris, Hong, Yili, King, Caleb B., and Meeker, William Q.
- Subjects
- *CONFIDENCE intervals, *FORECASTING
- Abstract
For several decades, the resampling based bootstrap has been widely used for computing confidence intervals (CIs) for applications where no exact method is available. However, there are many applications where the resampling bootstrap method cannot be used. These include situations where the data are heavily censored due to the success response being a rare event, situations where there is insufficient mixing of successes and failures across the explanatory variable(s), and designed experiments where the number of parameters is close to the number of observations. These three situations all have in common that there may be a substantial proportion of the resamples where it is not possible to estimate all of the parameters in the model. This article reviews the fractional-random-weight bootstrap method and demonstrates how it can be used to avoid these problems and construct CIs in a way that is accessible to statistical practitioners. The fractional-random-weight bootstrap method is easy to use and has advantages over the resampling method in many challenging applications. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
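A sketch of why fractional random weights help in the rare-event situation the abstract above mentions: resampling can produce replicates with no successes at all, while continuous Dirichlet weights keep every observation in every replicate. The logistic-regression example and all settings below are assumptions, not the article's case studies:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def frw_bootstrap_coefs(X, y, B=500, seed=0):
        # fractional-random-weight bootstrap: Dirichlet weights scaled to sum to n
        rng = np.random.default_rng(seed)
        n = len(y)
        coefs = []
        for w in rng.dirichlet(np.ones(n), size=B):
            m = LogisticRegression(C=1e6).fit(X, y, sample_weight=n * w)  # near-unpenalized
            coefs.append(np.r_[m.intercept_, m.coef_.ravel()])
        return np.array(coefs)

    rng = np.random.default_rng(5)
    X = rng.normal(size=(400, 2))
    p = 1 / (1 + np.exp(-(-4.0 + 1.2 * X[:, 0])))  # roughly 2% success rate
    y = rng.binomial(1, p)
    cis = np.quantile(frw_bootstrap_coefs(X, y), [0.025, 0.975], axis=0)
    print(cis)  # percentile intervals for intercept and slopes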
30. Functional modeling of pedaling kinematics for the Stroke patients.
- Author
- Chakraborty, Sounak, Dey, Tanujit, Mukherjee, Anish, Alberts, Jay L., and Linder, Susan M.
- Subjects
- *STROKE patients, *KINEMATICS, *HUMAN kinematics, *COOLDOWN, *TORQUE
- Abstract
Understanding deficits in motor control through the analysis of pedaling biomechanics plays a key role in the treatment of stroke patients. A thorough study of the impact of different exercise patterns and workloads on the change between pre- and post-treatment movement patterns in the patients is therefore of utmost importance to the clinicians. The objective of this study was to analyze the difference between pre- and post-treatment pedaling torques when the patients are subject to different exercise groups with varying workloads. The effects of affected vs unaffected side along with the covariates age and BMI have also been accounted for in this work. Two different three-way ANOVA-based approaches have been implemented here. In the first approach, a random projection-based ANOVA technique has been performed treating the pedaling torques as functional response, whereas the second approach utilizes distance measures to summarize the difference between pre- and post-treatment torques and perform nonparametric tests on it. Bayesian bootstrap has been used here to perform tests on the median distance. A group of stroke patients have been studied in the Cleveland Clinic categorizing them into different exercise groups and workload patterns. The data obtained have been analyzed with the aforementioned techniques, and the results have been reported here. These techniques turn out to be promising and will help clinicians recommend personalized treatment to stroke patients for optimal results. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
31. Lossless online Bayesian bagging
- Author
- Lee, HKH and Clyde, MA
- Subjects
- classification tree, Bayesian bootstrap, Dirichlet distribution, Artificial Intelligence & Image Processing, Information and Computing Sciences, Psychology and Cognitive Sciences
- Published
- 2004
32. Bayesian Nonparametric Approaches for ROC Curve Inference
- Author
- de Carvalho, Vanda Inácio, Jara, Alejandro, de Carvalho, Miguel, Datta, Somnath, Editor-in-chief, Viens, Frederi G., Series editor, Politis, Dimitris N., Series editor, Oja, Hannu, Series editor, Daniels, Michael, Series editor, Mitra, Riten, editor, and Müller, Peter, editor
- Published
- 2015
- Full Text
- View/download PDF
33. Bayesian nonparametric estimation of ROC surface under verification bias.
- Author
- Zhu, Rui and Ghosal, Subhashis
- Subjects
- *NONPARAMETRIC estimation, *OVARIAN epithelial cancer, *RECEIVER operating characteristic curves, *SERUM albumin
- Abstract
The receiver operating characteristic (ROC) surface, a generalization of the ROC curve, has been widely used to assess the accuracy of a diagnostic test for three categories. A common problem is verification bias, referring to the situation where not all subjects have their true classes verified. In this paper, we consider the problem of estimating the ROC surface under verification bias. We adopt a Bayesian nonparametric approach by directly modeling the underlying distributions of the three categories with Dirichlet process mixture priors. We propose a robust computing algorithm by imposing only a missing-at-random assumption for the verification process and no assumption on the distributions. The method can also accommodate covariate information in estimating the ROC surface, which can lead to a more comprehensive understanding of the diagnostic accuracy. It can be adapted and hugely simplified in the case where there is no verification bias, and very fast computation is possible through the Bayesian bootstrap process. The proposed method is compared with other commonly used methods by extensive simulations, and we find that it generally outperforms the alternatives. Applying the method to two real datasets, the key findings are as follows: (1) human epididymis protein 4 has a slightly better diagnostic ability than CA125 in discriminating healthy, early-stage, and late-stage patients of epithelial ovarian cancer. (2) Serum albumin has prognostic ability in distinguishing different stages of hepatocellular carcinoma. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
34. General Bayesian updating and the loss-likelihood bootstrap.
- Author
- Lyddon, S P, Holmes, C C, and Walker, S G
- Subjects
- *FISHER information, *COST functions, *STATISTICAL bootstrapping, *PARAMETRIC modeling, *BAYESIAN analysis
- Abstract
In this paper we revisit the weighted likelihood bootstrap, a method that generates samples from an approximate Bayesian posterior of a parametric model. We show that the same method can be derived, without approximation, under a Bayesian nonparametric model with the parameter of interest defined through minimizing an expected negative log-likelihood under an unknown sampling distribution. This interpretation enables us to extend the weighted likelihood bootstrap to posterior sampling for parameters minimizing an expected loss. We call this method the loss-likelihood bootstrap, and we make a connection between it and general Bayesian updating, which is a way of updating prior belief distributions that does not need the construction of a global probability model, yet requires the calibration of two forms of loss function. The loss-likelihood bootstrap is used to calibrate the general Bayesian posterior by matching asymptotic Fisher information. We demonstrate the proposed method on a number of examples. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
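The loss-likelihood bootstrap can be sketched for its simplest case: the parameter is defined as the minimizer of an expected loss (here a Huber location, an assumption made for illustration), and each Dirichlet weight draw yields one posterior sample by minimizing the weighted empirical loss. The paper's Fisher-information calibration of general Bayesian posteriors is not shown:

    import numpy as np
    from scipy.optimize import minimize_scalar

    def huber(r, c=1.345):
        # Huber loss: quadratic near zero, linear in the tails
        a = np.abs(r)
        return np.where(a <= c, 0.5 * r ** 2, c * a - 0.5 * c ** 2)

    def loss_likelihood_bootstrap(x, B=4000, seed=0):
        # one weighted-loss minimization per Dirichlet draw
        rng = np.random.default_rng(seed)
        W = rng.dirichlet(np.ones(len(x)), size=B)
        return np.array([minimize_scalar(lambda t, w=w: np.sum(w * huber(x - t))).x
                         for w in W])

    x = np.random.default_rng(6).standard_t(df=2, size=100)  # heavy-tailed sample
    post = loss_likelihood_bootstrap(x)
    print(post.mean(), np.quantile(post, [0.025, 0.975]))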
35. Robust Bayesian Regression with Synthetic Posterior Distributions
- Author
- Shintaro Hashimoto and Shonosuke Sugasawa
- Subjects
- Bayesian bootstrap, Bayesian lasso, divergence, Gibbs sampling, linear regression, Science, Astrophysics, QB460-466, Physics, QC1-999
- Abstract
Although linear regression models are fundamental tools in statistical science, the estimation results can be sensitive to outliers. While several robust methods have been proposed in frequentist frameworks, statistical inference is not necessarily straightforward. We here propose a Bayesian approach to robust inference on linear regression models using synthetic posterior distributions based on γ-divergence, which enables us to naturally assess the uncertainty of the estimation through the posterior distribution. We also consider the use of shrinkage priors for the regression coefficients to carry out robust Bayesian variable selection and estimation simultaneously. We develop an efficient posterior computation algorithm by adopting the Bayesian bootstrap within Gibbs sampling. The performance of the proposed method is illustrated through simulation studies and applications to famous datasets.
- Published
- 2020
- Full Text
- View/download PDF
36. Exchangeably weighted bootstrap schemes
- Author
- van Kerm, Philippe
- Subjects
- Quantitative methods in economics & management [B09] [Business & economic sciences], bootstrap, stata, bayesian bootstrap
- Abstract
The exchangeably weighted bootstrap is one of the many variants of bootstrap resampling schemes. Rather than directly drawing observations with replacement from the data, weighted bootstrap schemes generate vectors of replication weights to form bootstrap replications. Various ways to generate the replication weights can be adopted, and some choices bring practical computational advantages. This presentation demonstrates how easily such schemes can be implemented and where they are particularly useful, and introduces the exbsample command, which facilitates their implementation.
- Published
- 2022
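The exbsample command referenced above is Stata; a Python sketch of the underlying idea contrasts two exchangeable weight-generation schemes: integer multinomial counts (the classical resampling bootstrap) and continuous Dirichlet weights (the Bayesian bootstrap). Settings are illustrative:

    import numpy as np

    def replication_weights(n, B, scheme="bayesian", seed=0):
        # exchangeable replication-weight vectors, one row per replicate;
        # both schemes produce rows summing to n
        rng = np.random.default_rng(seed)
        if scheme == "classical":
            return rng.multinomial(n, np.full(n, 1 / n), size=B).astype(float)
        return n * rng.dirichlet(np.ones(n), size=B)

    x = np.random.default_rng(7).exponential(size=50)
    for scheme in ("classical", "bayesian"):
        W = replication_weights(len(x), 2000, scheme)
        means = W @ x / len(x)             # weighted mean per replicate
        print(scheme, means.std())         # replication standard error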
37. Bayesian bootstrap aggregation for tourism demand forecasting
- Author
- Xinyang Liu, Gang Li, Haiyan Song, and Anyu Liu
- Subjects
- Bayesian bootstrap, Tourism demand forecasting, Tourism, Leisure and Hospitality Management, Geography, Planning and Development, Bayesian probability, General to specific, Econometrics, Economics, Transportation, Nature and Landscape Conservation
- Published
- 2021
- Full Text
- View/download PDF
38. Cure modeling in real-time prediction: How much does it help?
- Author
- Ying, Gui-shuang, Zhang, Qiang, Lan, Yu, Li, Yimei, and Heitjan, Daniel F.
- Subjects
- *PARAMETRIC modeling, *WEIBULL distribution, *COMPUTER-aided design, *RANDOMIZED controlled trials, *STATISTICAL bootstrapping
- Abstract
Various parametric and nonparametric modeling approaches exist for real-time prediction in time-to-event clinical trials. Recently, Chen (2016, BMC Medical Research Methodology 16) proposed a prediction method based on parametric cure-mixture modeling, intended to cover situations where a non-negligible fraction of subjects appears to be cured. In this article we apply a Weibull cure-mixture model to create predictions, demonstrating the approach in RTOG 0129, a randomized trial in head-and-neck cancer. We compare the ultimate realized data in RTOG 0129 to interim predictions from a Weibull cure-mixture model, a standard Weibull model without a cure component, and a nonparametric model based on the Bayesian bootstrap. The standard Weibull model predicted earlier event times than the Weibull cure-mixture model, but the difference was unremarkable until late in the trial, when evidence for a cure became clear. Nonparametric predictions often gave undefined predictions or infinite prediction intervals, particularly at early stages of the trial. Simulations suggest that cure modeling can yield better-calibrated prediction intervals when there is a cured component, or the appearance of a cured component, but at a substantial cost in the average width of the intervals. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
39. Characterising uncertainty in generalised dissimilarity models.
- Author
- Woolley, Skipton N.C., Foster, Scott D., O'Hara, Timothy D., Wintle, Brendan A., Dunstan, Piers K., and Hodgson, David
- Subjects
- STATISTICS, SPECIES, NATURE & nurture, RISK assessment, LINEAR statistical models
- Abstract
Generalised dissimilarity modelling (GDM) is a statistical method for analysing and predicting patterns of turnover in species composition, usually in response to environmental gradients that vary in space and time. GDM is becoming widely applied in ecology and conservation science to interpret macro-ecological and biogeographical patterns, to support conservation assessment, predict changes in species distributions under climate change and prioritise biological surveys. Inferential and predictive uncertainty is difficult to characterise using current implementations of GDM, reducing the utility of GDM in ecological risk assessment and conservation decision-making. Current practice is to undertake permutation tests to assess the importance of variables in GDM. Permutation testing overcomes the issue of data-dependence (because dissimilarities are calculated on a smaller number of observations) but it does not give a quantification of uncertainty in predictions. Here, we address this issue by utilising the Bayesian bootstrap, so that the uncertainty in the observations is carried through the entire analysis (including into the predictions). We tested our Bayesian bootstrap GDM (BBGDM) approach on simulated data sets and two benthic species data sets. We fitted BBGDMs and GDMs to compare the differences in inference and prediction of compositional turnover that resulted from a coherent treatment of model uncertainty. We showed that our BBGDM approach correctly identified the signal within the data, resulting in an improved characterisation of uncertainty and enhanced model-based inference. We show that our approach gives appropriate parameter estimates while better representing the underlying uncertainty that arises when conducting inference and making predictions with GDMs. Our approach to fitting GDMs will provide more realistic insights into parameter and prediction uncertainty. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
40. A stochastic Bayesian bootstrapping model for COVID-19 data
- Author
- Julia Calatayud, Marc Jornet, and Jorge Mateu
- Subjects
- Original Paper, Random parameters and errors, Environmental Engineering, COVID-19 reported infections and waves, Multiple generalized logistic growth curves, Deterministic and stochastic modeling, Least-squares fitting, Environmental Chemistry, Bayesian bootstrap, Safety, Risk, Reliability and Quality, General Environmental Science, Water Science and Technology
- Abstract
We provide a stochastic modeling framework for the incidence of COVID-19 in Castilla-Leon (Spain) for the period March 1, 2020 to February 12, 2021, which encompasses four waves. Each wave is appropriately described by a generalized logistic growth curve. Accordingly, the four waves are modeled through a sum of four generalized logistic growth curves. Pointwise values of the twenty input parameters are fitted by a least-squares optimization procedure. Taking into account the significant variability in the daily reported cases, the input parameters and the errors are regarded as random variables on an abstract probability space. Their probability distributions are inferred from a Bayesian bootstrap procedure. This framework is shown to offer a more accurate estimation of the COVID-19 reported cases than the deterministic formulation.
- Published
- 2022
41. A Nonparametric Bayesian Analysis of Heterogenous Treatment Effects in Digital Experimentation.
- Author
- Taddy, Matt, Gardner, Matt, Chen, Liyun, and Draper, David
- Subjects
- BAYESIAN analysis, RANDOMIZED controlled trials, INTERNET service providers, REGRESSION analysis, STATISTICAL bootstrapping, BIG data
- Abstract
Randomized controlled trials play an important role in how Internet companies predict the impact of policy decisions and product changes. In these “digital experiments,” different units (people, devices, products) respond differently to the treatment. This article presents a fast and scalable Bayesian nonparametric analysis of such heterogeneous treatment effects and their measurement in relation to observable covariates. New results and algorithms are provided for quantifying the uncertainty associated with treatment effect measurement via both linear projections and nonlinear regression trees (CART and random forests). For linear projections, our inference strategy leads to results that are mostly in agreement with those from the frequentist literature. We find that linear regression adjustment of treatment effect averages (i.e., post-stratification) can provide some variance reduction, but that this reduction will be vanishingly small in the low-signal and large-sample setting of digital experiments. For regression trees, we provide uncertainty quantification for the machine learning algorithms that are commonly applied in tree-fitting. We argue that practitioners should look to ensembles of trees (forests) rather than individual trees in their analysis. The ideas are applied on and illustrated through an example experiment involving 21 million unique users of eBay.com. [ABSTRACT FROM PUBLISHER]
- Published
- 2016
- Full Text
- View/download PDF
42. A Two-Step Semiparametric Method to Accommodate Sampling Weights in Multiple Imputation.
- Author
- Zhou, Hanzhi, Elliott, Michael R., and Raghunathan, Trivellore E.
- Subjects
- *MULTIPLE imputation (Statistics), *BAYESIAN analysis, *REGRESSION analysis, *COEFFICIENTS (Statistics), *SIMULATION methods & models
- Abstract
Multiple imputation (MI) is a well-established method to handle item nonresponse in sample surveys. Survey data obtained from complex sampling designs often involve features such as unequal probabilities of selection. MI requires the imputation to be congenial: the imputations must come from a Bayesian predictive distribution, the observed- and complete-data estimators must equal the posterior mean given the observed or complete data, and the observed- and complete-data variance estimators must equal the corresponding posterior variance; more colloquially, the analyst and the imputer make similar modeling assumptions. Yet multiply imputed data sets from complex sample designs with unequal sampling weights are typically imputed under simple random sampling assumptions and then analyzed using methods that account for the sampling weights. This is a setting in which the analyst assumes more than the imputer, which can lead to biased estimates and anti-conservative inference. Less commonly used alternatives, such as including case weights as predictors in the imputation model, typically require interaction terms for more complex estimators such as regression coefficients, can be vulnerable to model misspecification, and are difficult to implement. We develop a simple two-step MI framework that accounts for sampling weights using a weighted finite population Bayesian bootstrap method to validly impute the whole population (including item nonresponse) from the observed data. In the second step, having generated posterior predictive distributions of the entire population, we use standard IID imputation to handle the item nonresponse. Simulation results show that the proposed method has good frequentist properties and is robust to model misspecification compared with alternative approaches. We apply the proposed method to accommodate missing data in the Behavioral Risk Factor Surveillance System when estimating means and parameters of regression models. [ABSTRACT FROM AUTHOR] (A rough sketch of the first, population-generation step follows this record.)
- Published
- 2016
- Full Text
- View/download PDF
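The first step of the framework, generating synthetic populations from a weighted sample, can be sketched as follows. This is a rough stand-in using a common approximation (Dirichlet-perturbed sampling weights) rather than the paper's exact weighted Polya-urn algorithm; the second step would then run standard IID imputation on each synthetic population.

import numpy as np

rng = np.random.default_rng(2)

def wfpbb_population(y, w, N, rng):
    # One Bayesian-bootstrap draw of a synthetic population of size N from a
    # weighted sample: Dirichlet-perturb the normalized weights, then resample.
    n = y.size
    g = rng.dirichlet(np.ones(n)) * w
    return rng.choice(y, size=N, replace=True, p=g / g.sum())

# Toy weighted sample in which units with larger y were undersampled.
y = rng.normal(loc=10.0, scale=2.0, size=500)
w = np.where(y > 10.0, 4.0, 1.0)        # inverse inclusion probabilities
N = int(w.sum())                        # implied population size

means = [wfpbb_population(y, w, N, rng).mean() for _ in range(500)]
print("population-mean posterior: %.2f +/- %.2f" % (np.mean(means), np.std(means)))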
43. Accidental explosions on the railway: simulation-based prediction of damage to nearby buildings
- Author
-
E. R. Vaidogas
- Subjects
railway ,accident ,explosion ,Monte Carlo simulation ,bootstrap ,Bayesian bootstrap ,damage ,Transportation engineering ,TA1001-1280 - Abstract
A procedure for estimating potential damage to buildings induced by accidental explosions on the railway is developed. Damage here means failures of nearby structures caused by the actions generated by such explosions, and it is measured in terms of the probabilities of the potential failures. The estimation of the damage probabilities is based on stochastic simulation of railway accidents involving an explosion. The proposed simulation-based procedure quantifies epistemic (state-of-knowledge) uncertainties in the damage probabilities, expressing them in terms of Bayesian prior and posterior distributions. The foundation of the procedure is a computer-intensive method known as the Bayesian bootstrap, which is used to approximate the posterior distributions of the damage probabilities. The Bayesian bootstrap makes the proposed procedure highly automatic and convenient for assessing structures subjected to the hazard of accidental actions. In addition, it can be used to specify safe distances between the railway and nearby buildings, so that the structures of these buildings can be designed for tolerable probabilities of explosion-induced failure. (A minimal sketch of the final estimation step follows this record.)
- Published
- 2005
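The final estimation step, approximating the posterior distribution of a damage probability from simulated accident outcomes, fits in a few lines of Python. This is a minimal illustration under assumed inputs, not the paper's full procedure: the failure indicators below are synthetic stand-ins for the outputs of the stochastic accident simulation.

import numpy as np

rng = np.random.default_rng(3)

# Stand-in for Monte Carlo results: 1 = nearby structure fails, 0 = survives.
failures = rng.binomial(1, 0.07, size=60)

# Each Dirichlet draw over the simulated outcomes yields one posterior draw
# of the damage probability p = sum_i w_i * failures_i.
w = rng.dirichlet(np.ones(failures.size), size=5000)
p_draws = w @ failures

lo, hi = np.percentile(p_draws, [2.5, 97.5])
print("posterior mean %.3f, 95%% credible interval (%.3f, %.3f)"
      % (p_draws.mean(), lo, hi))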
44. Bootstrap Inference for Garch Models by the Least Absolute Deviation Estimation
- Author
-
Guodong Li, Qianqian Zhu, and Ruochen Zeng
- Subjects
Statistics and Probability ,Estimation ,Statistics::Theory ,Heteroscedasticity ,Applied Mathematics ,Autoregressive conditional heteroskedasticity ,Inference ,Bayesian bootstrap ,Autoregressive model ,Portmanteau test ,Statistics::Methodology ,Applied mathematics ,Least absolute deviations ,Statistics, Probability and Uncertainty ,Mathematics - Abstract
This article considers the generalized bootstrap method to approximate the least absolute deviation (LAD) estimation and portmanteau test for generalized autoregressive conditional heteroskedastic (GARCH) models. The generalized bootstrap approach is easy to implement and includes many bootstrap methods as special cases, such as Efron's bootstrap, the Bayesian bootstrap, and the random-weighting bootstrap. The proposed bootstrap procedure is shown to be asymptotically valid for both the estimation and the test. The finite-sample performance is assessed by simulation studies, and its usefulness is illustrated by a real application to the Hang Seng Index. (A random-weighting sketch follows this record.)
- Published
- 2019
- Full Text
- View/download PDF
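One member of the generalized-bootstrap family can be sketched as follows: a random-weighting bootstrap for the LAD estimator of a GARCH(1,1) model, with i.i.d. mean-one exponential weights. This is a hypothetical toy version, not the article's procedure; in particular it ignores the usual rescaling by the median of the squared innovations, so omega and alpha are recovered only up to a common scale factor.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)

def simulate_garch(n, omega=0.1, alpha=0.1, beta=0.8):
    y = np.empty(n)
    s2 = omega / (1.0 - alpha - beta)
    for t in range(n):
        y[t] = np.sqrt(s2) * rng.standard_normal()
        s2 = omega + alpha * y[t] ** 2 + beta * s2
    return y

def cond_var(theta, y):
    omega, alpha, beta = theta
    s2 = np.empty_like(y)
    s2[0] = y.var()
    for t in range(1, y.size):
        s2[t] = omega + alpha * y[t - 1] ** 2 + beta * s2[t - 1]
    return s2

def lad_loss(theta, y, w):
    omega, alpha, beta = theta
    if min(omega, alpha, beta) <= 0 or alpha + beta >= 0.999:
        return 1e12                          # crude positivity/stationarity guard
    # LAD criterion on the log-squared scale.
    return np.sum(w * np.abs(np.log(y ** 2 + 1e-12) - np.log(cond_var(theta, y))))

y = simulate_garch(500)
fit = minimize(lad_loss, np.array([0.05, 0.05, 0.9]),
               args=(y, np.ones_like(y)), method="Nelder-Mead")

draws = []
for _ in range(50):
    w = rng.exponential(1.0, size=y.size)    # random generalized-bootstrap weights
    draws.append(minimize(lad_loss, fit.x, args=(y, w), method="Nelder-Mead").x)
print("bootstrap sd of (omega, alpha, beta):", np.array(draws).std(axis=0))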
45. BAYESIAN INFERENCE FOR THE FINITE POPULATION TOTAL FROM A HETEROSCEDASTIC PROBABILITY PROPORTIONAL TO SIZE SAMPLE.
- Author
-
ZANGENEH, SAHAR Z. and LITTLE, RODERICK J. A.
- Subjects
- *
STATISTICAL sampling , *PROBABILITY theory , *REGRESSION analysis , *ESTIMATION theory , *ANALYSIS of variance - Abstract
Inference for the population total from probability proportional to size (PPS) sampling provides a comparison of design-based and model-based approaches to survey inference for an important practical design. The usual design-based approach weights sampled units by the inverse of their inclusion probabilities, using the Horvitz-Thompson or Hajek estimates. The model-based approach predicts the outcome for nonsampled units based on a regression of the outcome on the size variable. Zheng and Little (2003) showed that this regression approach, based on a flexible penalized spline regression model, can provide superior inferences to Horvitz-Thompson or generalized regression, in terms of both precision and confidence coverage. However, this approach exploits the sizes of nonsampled units, and this information is rarely included in public-use data files. Little and Zheng (2007) showed that when the sizes of nonsampled units are not available, the spline model, combined with a Bayesian bootstrap (BB) model for predicting the nonsampled sizes, can still provide superior inferences, though the gains were reduced and less consistent. We further develop these methods by (a) including an unknown parameter to model heteroscedastic error variance in the spline model, an important modeling feature in the PPS setting; and (b) providing an improved Bayesian method for including summary information about the aggregate size of nonsampled units. Simulation studies suggest that the resulting Bayesian method, which includes information on the number and total size of the nonsampled units, recovers most of the information in the individual sizes of the nonsampled units and provides significant gains over the traditional Horvitz-Thompson estimator. The method is applied to two public-use data sets from the U.S. Census Bureau as well as a data set from the U.S. Energy Information Administration. [ABSTRACT FROM AUTHOR] (A stylized sketch of the size-prediction step follows this record.)
- Published
- 2015
- Full Text
- View/download PDF
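One ingredient, predicting the nonsampled sizes by a Bayesian bootstrap, can be given a stylized sketch. The weighting scheme below (remaining mass 1/pi - 1 per sampled unit, perturbed by Dirichlet draws) is an illustrative assumption, not the papers' exact algorithm, which also fits a penalized spline to the outcomes and conditions on the known aggregate size of the nonsampled units.

import numpy as np

rng = np.random.default_rng(5)

# Toy finite population of size variables; PPS sample with pi_i ~ size.
N, n = 2000, 200
size = rng.lognormal(mean=0.0, sigma=0.7, size=N)
pi = np.minimum(n * size / size.sum(), 0.95)
sampled = rng.random(N) < pi
x_s, pi_s = size[sampled], pi[sampled]

def draw_nonsampled_sizes(x_s, pi_s, n_draw, rng):
    # One Bayesian-bootstrap draw of the nonsampled sizes: Dirichlet-perturbed
    # "remaining mass" (1/pi - 1) per sampled unit, then resampling.
    w = rng.dirichlet(np.ones(x_s.size)) * (1.0 / pi_s - 1.0)
    return rng.choice(x_s, size=n_draw, replace=True, p=w / w.sum())

x_ns = draw_nonsampled_sizes(x_s, pi_s, N - x_s.size, rng)
print("true nonsampled total: %.0f; one BB draw: %.0f"
      % (size[~sampled].sum(), x_ns.sum()))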
46. Machines Imitating Human Thinking Using Bayesian Learning and Bootstrap
- Author
-
Sunghae Jun
- Subjects
Physics and Astronomy (miscellaneous) ,Computer science ,Process (engineering) ,General Mathematics ,Bayesian probability ,02 engineering and technology ,Bayesian inference ,0504 sociology ,0202 electrical engineering, electronic engineering, information engineering ,Computer Science (miscellaneous) ,Bayes estimator ,Bayesian learning ,business.industry ,Deep learning ,lcsh:Mathematics ,020208 electrical & electronic engineering ,05 social sciences ,prior and posterior ,050401 social sciences methods ,artificial intelligence ,lcsh:QA1-939 ,Bayesian statistics ,ComputingMethodologies_PATTERNRECOGNITION ,Chemistry (miscellaneous) ,Artificial intelligence ,Applications of artificial intelligence ,Bayesian bootstrap ,business ,Optimal decision ,thinking machines - Abstract
In the field of cognitive science, much research has been conducted on the diverse applications of artificial intelligence (AI). One important area of study is machines that imitate human thinking. Although there are various approaches to the development of thinking machines, this paper assumes that human thinking is not always optimal: humans are sometimes driven by emotions to make decisions that are not optimal. Recently, deep learning has been dominating most machine learning tasks in AI, and in the area of optimal decision-making many traditional machine learning methods are rapidly being replaced by it; we can therefore expect faster growth of AI technology, such as AlphaGo, in optimal decision-making. However, humans sometimes think and act not optimally but emotionally. In this paper, we propose a method for building thinking machines that imitate humans using Bayesian decision theory and Bayesian learning. Bayesian statistics involves a learning process based on prior and posterior distributions. The prior represents an initial belief in a specific domain; it is updated to a posterior through the likelihood of observed data, and the posterior is the updated belief based on those observations. When new data are observed, the current posterior is used as the prior for the next update. Because Bayesian learning of this kind also yields an optimal decision, it is not well suited on its own to modeling human-like thinking. We therefore study a new Bayesian approach to developing thinking machines based on Bayesian decision theory: instead of using the single optimal value expected under the posterior, we generate random values from the last updated posterior and use them for thinking machines that imitate human thinking. (A minimal sketch of this prior-to-posterior loop follows this record.)
- Published
- 2021
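The core idea, acting on a random draw from the current posterior rather than on the single optimal Bayes estimate, fits in a few lines. The snippet below is a minimal Beta-Binomial sketch of that prior-to-posterior loop, an illustrative reading of the paper rather than its implementation.

import numpy as np

rng = np.random.default_rng(6)

a, b = 1.0, 1.0                      # Beta(1,1) prior belief about a success rate
for day, (succ, trials) in enumerate([(3, 10), (7, 10), (5, 10)], start=1):
    a, b = a + succ, b + trials - succ     # posterior becomes the next prior
    optimal = a / (a + b)                  # Bayes estimate: the optimal action
    humanlike = rng.beta(a, b)             # posterior draw: the human-like action
    print(f"day {day}: optimal={optimal:.3f}  human-like={humanlike:.3f}")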
47. Averaging Predictions of Rate-Time Models Using Bayesian Leave-Future-Out Cross-Validation and the Bayesian Bootstrap in Probabilistic Unconventional Production Forecasting
- Author
-
Larry W. Lake, Leopoldo M. Ruiz Maraggi, and Mark P. Walsh
- Subjects
Bayesian bootstrap ,Production forecasting ,Computer science ,Bayesian probability ,Probabilistic logic ,Econometrics ,Cross-validation - Published
- 2021
- Full Text
- View/download PDF
48. Bayesian bootstrapping in real-time probabilistic photovoltaic power forecasting
- Author
-
Mauro Carpita, Mokhtar Bozorg, Fabio Mottola, Daniela Proto, Antonio Bracale, and Pasquale De Falco
- Subjects
Bayesian bootstrap ,Photovoltaic power forecasting ,Probabilistic forecasting ,Renewable energy ,Statistics::Theory ,Renewable Energy, Sustainability and the Environment ,business.industry ,Computer science ,Bayesian probability ,Probabilistic logic ,Decision tree ,Machine learning ,computer.software_genre ,Quantile regression ,Bootstrapping (electronics) ,Statistics::Methodology ,General Materials Science ,Artificial intelligence ,Gradient boosting ,business ,computer ,Quantile - Abstract
Modern distribution systems are characterized by an increasing penetration of photovoltaic generation. Because of the uncertain nature of the solar primary source, photovoltaic power forecasting models must be part of any energy management system for smart distribution networks. Although point forecasts can suit many purposes, probabilistic forecasts add further flexibility to an energy management system and are recommended to enable a wider range of decision-making and optimization strategies. This paper performs real-time probabilistic photovoltaic power forecasting with an approach based on the Bayesian bootstrap. In particular, the Bayesian bootstrap is applied to three probabilistic forecasting models (linear quantile regression, gradient boosting regression trees, and quantile regression neural networks) to provide sample bootstrap distributions of the predictive quantiles of photovoltaic power. The heterogeneous nature of the selected models allows the performance of the Bayesian bootstrap to be evaluated within different forecasting frameworks. Several benchmarks, error indices, and scores are used to assess its performance in probabilistic photovoltaic power forecasting. Tests carried out on two actual photovoltaic power datasets demonstrate the effectiveness of the proposed approach. (A minimal sketch of the quantile-level bootstrap follows this record.)
- Published
- 2021
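For one of the three wrapped models, a linear quantile regression, the quantile-level Bayesian bootstrap can be sketched as follows. This is a toy illustration on synthetic irradiance/power data using scikit-learn's QuantileRegressor, not the paper's forecasting chain: each Dirichlet-weighted refit yields one draw from the sample distribution of a predictive quantile.

import numpy as np
from sklearn.linear_model import QuantileRegressor

rng = np.random.default_rng(7)

# Toy data: PV power (kW) vs. irradiance (W/m^2) with heteroscedastic noise.
n = 400
irr = rng.uniform(0.0, 1000.0, size=n)
power = 0.004 * irr + rng.normal(0.0, 0.2 + 0.001 * irr)
X, x_new = irr.reshape(-1, 1), np.array([[600.0]])

q, draws = 0.9, []
for _ in range(200):
    w = rng.dirichlet(np.ones(n)) * n          # Bayesian bootstrap weights
    model = QuantileRegressor(quantile=q, alpha=0.0)
    model.fit(X, power, sample_weight=w)
    draws.append(model.predict(x_new)[0])

print("0.9-quantile forecast at 600 W/m^2: %.2f kW (BB sd %.2f)"
      % (np.mean(draws), np.std(draws)))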
49. Approximate Bayesian Bootstrap procedures to estimate multilevel treatment effects in observational studies with application to type 2 diabetes treatment regimens
- Author
-
Roee Gutman, Robert J. Smith, Andrew R. Zullo, and Anthony D. Scotina
- Subjects
Statistics and Probability ,FOS: Computer and information sciences ,medicine.medical_specialty ,Epidemiology ,Psychological intervention ,Type 2 diabetes ,01 natural sciences ,law.invention ,Methodology (stat.ME) ,010104 statistics & probability ,03 medical and health sciences ,0302 clinical medicine ,Health Information Management ,Randomized controlled trial ,law ,medicine ,Humans ,030212 general & internal medicine ,0101 mathematics ,Intensive care medicine ,Adverse effect ,Propensity Score ,Statistics - Methodology ,business.industry ,Bayes Theorem ,Gold standard (test) ,medicine.disease ,Causality ,Bayesian bootstrap ,Diabetes Mellitus, Type 2 ,Causal inference ,Observational study ,business ,Algorithms - Abstract
Randomized clinical trials are considered the gold standard for estimating causal effects. Nevertheless, in studies aimed at examining adverse effects of interventions, randomized trials are often impractical because of ethical and financial considerations. In observational studies, matching on generalized propensity scores has been proposed as a possible solution for estimating the treatment effects of multiple interventions. However, the derivation of point and interval estimates for these matching procedures can become complex with non-continuous or censored outcomes. We propose a novel Approximate Bayesian Bootstrap algorithm that results in statistically valid point and interval estimates of the treatment effects with categorical outcomes. The procedure relies on the estimated generalized propensity scores and multiply imputes the unobserved potential outcomes for each unit. In addition, we describe a corresponding interpretable sensitivity analysis to examine the unconfoundedness assumption. We apply this approach to examine the cardiovascular safety of common, real-world anti-diabetic treatment regimens for type 2 diabetes mellitus in a large observational database. (A minimal sketch of the imputation step follows this record.)
- Published
- 2020
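The Approximate Bayesian Bootstrap imputation step can be sketched in a few lines. The snippet below is a minimal stand-alone illustration, assuming donors have already been grouped by their generalized propensity scores: within a group, the observed outcomes are first resampled with replacement, and the missing potential outcomes are then drawn from that resample.

import numpy as np

rng = np.random.default_rng(8)

def abb_impute(observed, n_missing, rng):
    # One ABB draw: resample the donor pool, then draw imputations from it.
    pool = rng.choice(observed, size=observed.size, replace=True)
    return rng.choice(pool, size=n_missing, replace=True)

# Toy donor group: binary cardiovascular-event outcomes under one regimen.
donors = rng.binomial(1, 0.12, size=80)

# Multiply impute and summarize the event rate across M completed datasets.
M, n_mis = 50, 40
rates = [np.concatenate([donors, abb_impute(donors, n_mis, rng)]).mean()
         for _ in range(M)]
print("event rate across imputations: %.3f +/- %.3f"
      % (np.mean(rates), np.std(rates)))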
50. Robust Bayesian Regression with Synthetic Posterior Distributions
- Author
-
Shonosuke Sugasawa and Shintaro Hashimoto
- Subjects
Computer science ,Posterior probability ,Bayesian probability ,General Physics and Astronomy ,Inference ,lcsh:Astrophysics ,01 natural sciences ,Article ,010305 fluids & plasmas ,010104 statistics & probability ,symbols.namesake ,Gibbs sampling ,Frequentist inference ,lcsh:QB460-466 ,0103 physical sciences ,Prior probability ,Statistical inference ,Statistics::Methodology ,0101 mathematics ,lcsh:Science ,lcsh:QC1-999 ,Statistics::Computation ,Bayesian lasso ,ComputingMethodologies_PATTERNRECOGNITION ,symbols ,linear regression ,lcsh:Q ,Bayesian bootstrap ,Bayesian linear regression ,Algorithm ,divergence ,lcsh:Physics - Abstract
Although linear regression models are fundamental tools in statistical science, estimation results can be sensitive to outliers. While several robust methods have been proposed in frequentist frameworks, statistical inference within them is not necessarily straightforward. We here propose a Bayesian approach to robust inference on linear regression models using synthetic posterior distributions based on the γ-divergence, which enables us to naturally assess the uncertainty of the estimation through the posterior distribution. We also consider the use of shrinkage priors for the regression coefficients to carry out robust Bayesian variable selection and estimation simultaneously. We develop an efficient posterior computation algorithm by adopting the Bayesian bootstrap within Gibbs sampling. The performance of the proposed method is illustrated through simulation studies and applications to well-known datasets. (A stylized sketch of the weighted robust-loss idea follows this record.)
- Published
- 2020
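The weighted robust-loss idea can be given a stylized sketch: repeatedly minimizing a Dirichlet-weighted γ-divergence loss for a Gaussian linear regression yields posterior-style draws that resist outliers. This weighted-likelihood-bootstrap analogue is an illustrative assumption, not the authors' Gibbs sampler with shrinkage priors.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(9)
gamma = 0.3

# Toy data with 10% gross outliers.
n = 200
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)
y[rng.random(n) < 0.10] += 8.0

def gamma_loss(params, w):
    # Dirichlet-weighted gamma cross-entropy of N(b0 + b1*x, s2), up to constants.
    b0, b1, log_s = params
    s2 = np.exp(2.0 * log_s)
    r2 = (y - b0 - b1 * x) ** 2
    return (np.log(2.0 * np.pi * s2) / (2.0 * (1.0 + gamma))
            - np.log(np.sum(w * np.exp(-gamma * r2 / (2.0 * s2)))) / gamma)

b1_0, b0_0 = np.polyfit(x, y, 1)                 # non-robust starting values
start = minimize(gamma_loss,
                 np.array([b0_0, b1_0, np.log(np.std(y - b0_0 - b1_0 * x))]),
                 args=(np.full(n, 1.0 / n),), method="Nelder-Mead").x

draws = np.array([minimize(gamma_loss, start,
                           args=(rng.dirichlet(np.ones(n)),),
                           method="Nelder-Mead").x
                  for _ in range(200)])
print("robust draws, mean (b0, b1):", draws[:, :2].mean(axis=0))
print("robust draws, sd   (b0, b1):", draws[:, :2].std(axis=0))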