728 results for "Laplace approximation"
Search Results
2. Laplace-based strategies for Bayesian optimal experimental design with nuisance uncertainty.
- Author
-
Bartuska, Arved, Espath, Luis, and Tempone, Raúl
- Abstract
Finding the optimal design of experiments in the Bayesian setting typically requires estimation and optimization of the expected information gain functional. This functional consists of one outer and one inner integral, separated by the logarithm function applied to the inner integral. When the mathematical model of the experiment contains uncertainty about the parameters of interest and nuisance uncertainty (i.e., uncertainty about parameters that affect the model but are not themselves of interest to the experimenter), two inner integrals must be estimated. Thus, the already considerable computational effort required to determine good approximations of the expected information gain is increased further. The Laplace approximation has been applied successfully in the context of experimental design in various ways, and we propose two novel estimators featuring the Laplace approximation to alleviate the computational burden of both inner integrals considerably. The first estimator applies Laplace’s method followed by a Laplace approximation, introducing a bias. The second estimator uses two Laplace approximations as importance sampling measures for Monte Carlo approximations of the inner integrals. Both estimators use Monte Carlo approximation for the remaining outer integral estimation. We provide four numerical examples demonstrating the applicability and effectiveness of our proposed estimators. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
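The core device in this entry, Laplace's method for an inner integral, can be sketched in one dimension: expand the log-integrand to second order around its mode and integrate the resulting Gaussian analytically. The following toy (the function `laplace_log_integral`, the bracketing interval, and the standard-normal check are illustrative choices, not the paper's estimator):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def laplace_log_integral(log_f, bracket=(-10.0, 10.0)):
    """Laplace approximation of log ∫ exp(log_f(x)) dx for a unimodal log_f."""
    # Locate the mode of log_f by minimising its negative on the bracket.
    res = minimize_scalar(lambda x: -log_f(x), bounds=bracket, method="bounded")
    x_hat = res.x
    # Second derivative at the mode via a central difference.
    h = 1e-4
    d2 = (log_f(x_hat + h) - 2.0 * log_f(x_hat) + log_f(x_hat - h)) / h**2
    # log ∫ exp(log_f) dx ≈ log_f(x̂) + 0.5 * log(2π / |log_f''(x̂)|)
    return log_f(x_hat) + 0.5 * np.log(2.0 * np.pi / abs(d2))

# Sanity check against a standard normal log-density: the integral is 1, so the
# log should come out near 0 (and exactly 0 here, since the integrand is Gaussian).
log_phi = lambda x: -0.5 * x**2 - 0.5 * np.log(2.0 * np.pi)
print(laplace_log_integral(log_phi))
```

For a Gaussian integrand the approximation is exact; the bias the authors analyze arises when the integrand is only approximately Gaussian.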
3. Inference for the normal coefficient of variation: an approximate marginal likelihood approach.
- Author
-
Wong, A. and Shi, X.
- Abstract
In applied statistics, the coefficient of variation (CV) is commonly reported as a measure of relative variability of the data. However, confidence intervals for the CV are rarely reported because the existing methodologies are either simple to understand but inaccurate, or vice versa. In this paper, we assume the data are from a normal population; an approximate marginal likelihood function for the normal CV is derived from the Studentization method, and a Bartlett-type correction method is proposed to obtain accurate inference for the normal CV. Moreover, if the populations are measured using different scales, comparing CVs among the populations is a more appropriate way to determine if the variability among the populations is heterogeneous. The proposed Bartlett-type correction method is extended to test if the CVs are homogeneous for two or more normal populations. Numerical examples are given to illustrate the accuracy of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
4. Use of cross-validation Bayes factors to test equality of two densities.
- Author
-
Merchant, Naveed, Hart, Jeffrey D., Kim, Minhyeok, and Choi, Taeryon
- Subjects
-
HIGGS bosons, INFORMATION sharing, BANDWIDTHS, DENSITY, TREES
- Abstract
We propose a nonparametric, two-sample Bayesian test for checking whether or not two data sets share a common distribution. The test makes use of data splitting ideas and requires only simple priors for the bandwidths of two kernel density estimates. Importantly, it does not require priors for high- or infinite-dimensional parameter vectors, as do other nonparametric Bayesian procedures. We provide evidence that the new procedure leads to more stable Bayes factors than do methods based on Pólya trees. Somewhat surprisingly, the behaviour of the proposed Bayes factors when the two distributions are the same is usually superior to that of Pólya tree Bayes factors. We showcase the effectiveness of the test by proving its consistency, conducting a simulation study and applying the test to Higgs Boson data. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
5. Precise Laplace approximation for mixed rough differential equation.
- Author
-
Yang, Xiaoyu, Xu, Yong, and Pei, Bin
- Subjects
-
HILBERT space, DIFFERENTIAL equations, LARGE deviations (Mathematics), HESSIAN matrices, PROBABILITY theory
- Abstract
This work focuses on the Laplace approximation for the rough differential equation (RDE) driven by the mixed rough path (B^H, W) with H ∈ (1/3, 1/2) as ε → 0. Firstly, based on the geometric rough path lifted from mixed fractional Brownian motion (fBm), the Schilder-type large deviation principle (LDP) for the law of the first level path of the solution to the RDE is given. Due to the particularity of the mixed rough path, the main difficulty in carrying out the Laplace approximation is to prove the Hilbert-Schmidt property for the Hessian matrix of the Itô map restricted to the Cameron-Martin space of the mixed fBm. To this end, we embed the Cameron-Martin space into a larger Hilbert space, so that the Hessian becomes computable. Subsequently, a probability representation for the Hessian is shown. Finally, the Laplace approximation is constructed, which asserts the more precise asymptotics in the exponential scale. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
6. Bayesian Partial Reduced-Rank Regression.
- Author
-
Pintado, Maria F., Iacopini, Matteo, Rossini, Luca, and Shestopaloff, Alexander Y.
- Subjects
-
REGRESSION analysis, MATRICES (Mathematics)
- Abstract
Reduced-rank (RR) regression may be interpreted as a dimensionality reduction technique able to reveal complex relationships among the data parsimoniously. However, RR regression models typically overlook any potential group structure among the responses by assuming a low-rank structure on the coefficient matrix. To address this limitation, a Bayesian Partial RR (BPRR) regression is exploited, where the response vector and the coefficient matrix are partitioned into low- and full-rank sub-groups. As opposed to the literature, which assumes known group structure and rank, a novel strategy is introduced that treats them as unknown parameters to be estimated. The main contribution is two-fold: an approach to infer the low- and full-rank group memberships from the data is proposed, and then, conditionally on this allocation, the corresponding (reduced) rank is estimated. Both steps are carried out in a Bayesian framework, allowing for full uncertainty quantification, and are based on a partially collapsed Gibbs sampler. It relies on a Laplace approximation of the marginal likelihood and the Metropolized Shotgun Stochastic Search to estimate the group allocation efficiently. Applications to synthetic and real-world data reveal the potential of the proposed method to uncover hidden structures in the data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. Bayesian Inference on High-Dimensional Multivariate Binary Responses.
- Author
-
Chakraborty, Antik, Ou, Rihui, and Dunson, David B.
- Subjects
-
MARGINAL distributions, LAPLACE distribution, GAUSSIAN distribution, SPECIES distribution, PARAMETER estimation
- Abstract
It has become increasingly common to collect high-dimensional binary response data; for example, with the emergence of new sampling techniques in ecology. In smaller dimensions, multivariate probit (MVP) models are routinely used for inferences. However, algorithms for fitting such models face issues in scaling up to high dimensions due to the intractability of the likelihood, involving an integral over a multivariate normal distribution having no analytic form. Although a variety of algorithms have been proposed to approximate this intractable integral, these approaches are difficult to implement and/or inaccurate in high dimensions. Our main focus is in accommodating high-dimensional binary response data with a small-to-moderate number of covariates. We propose a two-stage approach for inference on model parameters while taking care of uncertainty propagation between the stages. We use the special structure of latent Gaussian models to reduce the highly expensive computation involved in joint parameter estimation to focus inference on marginal distributions of model parameters. This essentially makes the method embarrassingly parallel for both stages. We illustrate performance in simulations and applications to joint species distribution modeling in ecology. Supplementary materials for this article are available online. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. Efficacy assessment in crop protection: a tutorial on the use of Abbott's formula.
- Author
-
Piepho, Hans-Peter, Malik, Waqas Ahmed, Bischoff, Robert, El-Hasan, Abbas, Scheer, Christian, Sedlmeier, Jan Erik, Gerhards, Roland, Petschenka, Georg, and Voegele, Ralf T.
- Subjects
-
WEED science, LEAST squares, ANALYSIS of variance, BLOCK designs, PLANT diseases
- Abstract
In 1925, the American entomologist Walter Sidney Abbott proposed an equation for assessing efficacy, and it is still widely used today for analysing controlled experiments in crop protection and phytomedicine. Typically, this equation is applied to each experimental unit and the efficacy estimates thus obtained are then used in analysis of variance and least squares regression procedures. However, particularly regarding the common assumptions of homogeneity of variance and normality, this approach is often inaccurate. In this tutorial paper, we therefore revisit Abbott's equation and outline an alternative route to analysis via generalized linear mixed models that can satisfactorily deal with these distributional issues. Nine examples from entomology, weed science and phytopathology, each with a different focus and methodological peculiarity, are used to illustrate the framework. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
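Abbott's formula itself is one line: efficacy is the percentage reduction of the treated response relative to the untreated control, 100 · (C − T)/C. A minimal sketch (the function name and example counts are illustrative):

```python
def abbott_efficacy(control, treated):
    """Abbott's (1925) corrected efficacy in percent:
    100 * (C - T) / C, where C and T are the responses (e.g., counts of
    surviving pests) on the untreated control and the treated unit."""
    if control <= 0:
        raise ValueError("control response must be positive")
    return 100.0 * (control - treated) / control

# 40 survivors on the control plot, 6 on the treated plot:
print(abbott_efficacy(40, 6))  # 85.0
```

Note the value can be negative when the treated response exceeds the control; the tutorial's point is that feeding such per-unit estimates into ANOVA ignores their heteroscedastic, non-normal sampling distribution.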
9. Fast estimation of generalized linear latent variable models for performance and process data with ordinal, continuous, and count observed variables.
- Author
-
Zhang, Maoxin, Andersson, Björn, and Jin, Shaobo
- Subjects
-
PSYCHOMETRICS, EDUCATIONAL tests & measurements, DATA modeling, STIMULUS & response (Psychology)
- Abstract
Different data types often occur in psychological and educational measurement, such as computer‐based assessments that record performance and process data (e.g., response times and the number of actions). Modelling such data requires specific models for each data type and accommodating complex dependencies between multiple variables. Generalized linear latent variable models are suitable for modelling mixed data simultaneously, but estimation can be computationally demanding. A fast solution is to use Laplace approximations, but existing implementations of joint modelling of mixed data types are limited to ordinal and continuous data. To address this limitation, we derive an efficient estimation method that uses first‐ or second‐order Laplace approximations to simultaneously model ordinal data, continuous data, and count data. We illustrate the approach with an example and conduct simulations to evaluate the performance of the method in terms of estimation efficiency, convergence, and parameter recovery. The results suggest that the second‐order Laplace approximation achieves a higher convergence rate and produces accurate yet fast parameter estimates compared to the first‐order Laplace approximation, while the time cost increases with higher model complexity. Additionally, models that consider the dependence of variables from the same stimulus fit the empirical data substantially better than models that disregard the dependence. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. A Review of Generalized Linear Latent Variable Models and Related Computational Approaches.
- Author
-
Korhonen, Pekka, Nordhausen, Klaus, and Taskinen, Sara
- Subjects
-
BIOTIC communities, FACTOR analysis, DATA analysis, MARKOV chain Monte Carlo, ORDINATION
- Abstract
Generalized linear latent variable models (GLLVMs) have become mainstream models in the analysis of correlated, m‐dimensional data. GLLVMs can be seen as a reduced‐rank version of generalized linear mixed models (GLMMs), as the latent variables, which are of dimension p ≪ m, induce a reduced‐rank covariance structure for the model. The models are flexible and can be used for various purposes, including exploratory analysis, that is, ordination analysis, estimating patterns of residual correlation, multivariate inference about measured predictors, and prediction. Recent advances in computational tools allow the development of efficient, scalable algorithms for fitting GLLVMs for any response distribution. In this article, we discuss the basics of GLLVMs and review some options for model fitting. We focus on methods that are based on likelihood inference. The implementations available in R are compared via simulation studies, and an example illustrates how GLLVMs can be applied as an exploratory tool in the analysis of data from community ecology. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
11. Marginal inference for hierarchical generalized linear mixed models with patterned covariance matrices using the Laplace approximation.
- Author
-
Ver Hoef, Jay M., Blagg, Eryn, Dumelle, Michael, Dixon, Philip M., Zimmerman, Dale L., and Conn, Paul B.
- Subjects
-
COVARIANCE matrices, AUTOMATIC differentiation, TIME series analysis, STATISTICS, FORECASTING
- Abstract
We develop hierarchical models and methods in a fully parametric approach to generalized linear mixed models for any patterned covariance matrix. The Laplace approximation is used to marginally estimate covariance parameters by integrating over all fixed and latent random effects. The Laplace approximation relies on Newton–Raphson updates, which also leads to predictions for the latent random effects. We develop methodology for complete marginal inference, from estimating covariance parameters and fixed effects to making predictions for unobserved data. The marginal likelihood is developed for six distributions that are often used for binary, count, and positive continuous data, and our framework is easily extended to other distributions. We compare our methods to fully Bayesian methods, automatic differentiation, and integrated nested Laplace approximations (INLA) for bias, mean‐squared (prediction) error, and interval coverage, and all methods yield very similar results. However, our methods are much faster than Bayesian methods, and more general than INLA. Examples with binary and proportional data, count data, and positive‐continuous data are used to illustrate all six distributions with a variety of patterned covariance structures that include spatial models (both geostatistical and areal models), time series models, and mixtures with typical random intercepts based on grouping. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
12. An optimal Bayesian strategy for comparing Wiener–Hunt deconvolution models in the absence of ground truth.
- Author
-
Harroué, B, Giovannelli, J-F, and Pereyra, M
- Subjects
-
GIBBS sampling, MEASUREMENT errors, IMAGE reconstruction, BIG data, COVARIANCE matrices
- Abstract
This paper considers the quantitative comparison of several alternative models for performing deconvolution in situations where no ground truth data are available. With applications to very large data sets in mind, we focus on linear deconvolution models based on a Wiener filter. Although comparatively simple, such models are widely prevalent in large-scale settings such as high-resolution image restoration because they provide an excellent trade-off between accuracy and computational effort. However, to deliver accurate solutions, the models need to be properly calibrated to capture the covariance structure of the unknown quantity of interest and of the measurement error. This calibration often requires onerous controlled experiments and extensive expert supervision, as well as regular recalibration procedures. This paper adopts an unsupervised Bayesian statistical approach to model assessment that allows alternative models to be compared by using only the observed data, without the need for ground truth data or controlled experiments. Accordingly, the models are quantitatively compared based on their posterior probabilities given the data, which are derived from the marginal likelihoods or evidences of the models. The computation of these evidences is highly non-trivial, and this paper considers three strategies to address this difficulty—a Chib approach, Laplace approximations, and a truncated harmonic expectation—all efficiently implemented by using a Gibbs sampling algorithm specialised for this class of models. In addition to enabling unsupervised model selection, the output of the Gibbs sampler can also be used to automatically estimate unknown model parameters such as the variance of the measurement error and the power of the unknown quantity of interest.
The proposed strategies are demonstrated on a range of image deconvolution problems, where they are used to compare different modelling choices for the instrument's point spread function and covariance matrices for the unknown image and for the measurement error. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
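The Wiener-filter deconvolution model being compared can be sketched in a few lines of 1-D FFT code. Here the scalar `snr` stands in for the covariance calibration the paper is about, and the kernel, signal, and noise level are all illustrative:

```python
import numpy as np

def wiener_deconvolve(y, h, snr):
    """Wiener-filter deconvolution in the Fourier domain:
    X_hat = conj(H) * Y / (|H|^2 + 1/snr), where the scalar `snr` is a
    stand-in for the signal-to-noise power ratio."""
    H = np.fft.fft(h, n=len(y))
    G = np.conj(H) / (np.abs(H) ** 2 + 1.0 / snr)
    return np.real(np.fft.ifft(G * np.fft.fft(y)))

rng = np.random.default_rng(0)
x = np.zeros(128)
x[30], x[70] = 1.0, 0.5                       # sparse "true" signal
h = np.zeros(128)
h[:3] = [0.25, 0.5, 0.25]                     # small circular blur kernel
y = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)))  # circular convolution
y += 0.01 * rng.normal(size=128)              # measurement noise
x_hat = wiener_deconvolve(y, h, snr=100.0)
print(int(np.argmax(x_hat)))                  # the main spike is recovered near index 30
```

Choosing `snr`, or more generally the two covariance models it summarizes, is exactly the calibration problem the paper's evidence-based comparison addresses.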
13. Sparse Bayesian learning using TMB (Template Model Builder)
- Author
-
Helgøy, Ingvild M., Skaug, Hans J., and Li, Yushu
- Abstract
Sparse Bayesian Learning, and more specifically the Relevance Vector Machine (RVM), can be used in supervised learning for both classification and regression problems. Such methods are particularly useful when applied to big data in order to find a sparse (in weight space) representation of the model. This paper demonstrates that the Template Model Builder (TMB) is an accurate and flexible computational framework for the implementation of sparse Bayesian learning methods. The user of TMB is only required to specify the joint likelihood of the weights and the data, while the Laplace approximation of the marginal likelihood is automatically evaluated to numerical precision. This approximation is in turn used to estimate hyperparameters by maximum marginal likelihood. In order to reduce the computational cost of the Laplace approximation we introduce the notion of an “active set” of weights, and we devise an algorithm for dynamically updating this set until convergence, similar to what is done in other RVM-type methods. We implement two different methods using TMB: the RVM and the Probabilistic Feature Selection and Classification Vector Machine method, where the latter also performs feature selection. Experiments based on benchmark data show that our TMB implementation performs comparably to the original implementations, but at a lower implementation cost. TMB can also calculate model and prediction uncertainty, by including estimation uncertainty from both the latent variables and the hyperparameters. In conclusion, we find that TMB is a flexible tool that facilitates implementation and prototyping of sparse Bayesian methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
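The RVM-style evidence maximisation that TMB automates can be sketched directly for linear regression. This is a toy with MacKay-style fixed-point updates and a hard cap on the precisions in place of a formal active-set scheme; it is not the authors' TMB implementation, and all names and settings are illustrative:

```python
import numpy as np

def rvm_regression(Phi, t, n_iter=100, alpha_cap=1e10):
    """Toy RVM: type-II maximum likelihood with MacKay fixed-point updates
    for per-weight precisions `alpha` and noise precision `beta`.
    `alpha` is capped for numerical stability; weights whose precision
    reaches the cap are effectively pruned from the model."""
    n, m = Phi.shape
    alpha, beta = np.ones(m), 1.0
    for _ in range(n_iter):
        Sigma = np.linalg.inv(beta * Phi.T @ Phi + np.diag(alpha))
        mu = beta * Sigma @ Phi.T @ t
        gamma = 1.0 - alpha * np.diag(Sigma)   # well-determinedness of each weight
        alpha = np.minimum(gamma / (mu**2 + 1e-12), alpha_cap)
        beta = (n - gamma.sum()) / np.sum((t - Phi @ mu) ** 2)
    return mu, alpha

rng = np.random.default_rng(0)
Phi = rng.normal(size=(100, 10))
w_true = np.zeros(10)
w_true[1], w_true[4] = 2.0, -3.0               # only two relevant weights
t = Phi @ w_true + 0.1 * rng.normal(size=100)
mu, alpha = rvm_regression(Phi, t)
print(np.round(mu, 2))  # the two relevant weights remain clearly non-zero
```

The "active set" idea in the paper corresponds to dropping capped-precision weights from the computation entirely rather than merely clipping them.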
14. Semi-parametric benchmark dose analysis with monotone additive models.
- Author
-
Stringer, Alex, Akkaya Hocagil, Tugba, Cook, Richard J, Ryan, Louise M, Jacobson, Sandra W, and Jacobson, Joseph L
- Subjects
-
PRENATAL alcohol exposure, CONFIDENCE intervals, NEWTON-Raphson method, NONLINEAR equations, LONGITUDINAL method
- Abstract
Benchmark dose analysis aims to estimate the level of exposure to a toxin associated with a clinically significant adverse outcome and quantifies uncertainty using the lower limit of a confidence interval for this level. We develop a novel framework for benchmark dose analysis based on monotone additive dose-response models. We first introduce a flexible approach for fitting monotone additive models via penalized B-splines and Laplace-approximate marginal likelihood. A reflective Newton method is then developed that employs de Boor's algorithm for computing splines and their derivatives for efficient estimation of the benchmark dose. Finally, we develop a novel approach for calculating benchmark dose lower limits based on an approximate pivot for the nonlinear equation solved by the estimated benchmark dose. The favorable properties of this approach compared to the Delta method and a parametric bootstrap are discussed. We apply the new methods to make inferences about the level of prenatal alcohol exposure associated with clinically significant cognitive defects in children using data from six NIH-funded longitudinal cohort studies. Software to reproduce the results in this paper is available online and makes use of the novel semibmd R package, which implements the methods in this paper. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
15. Fast and scalable inference for spatial extreme value models.
- Author
-
Chen, Meixi, Ramezan, Reza, and Lysy, Martin
- Subjects
-
DISTRIBUTION (Probability theory), MARKOV chain Monte Carlo, EXTREME weather, GAUSSIAN processes, EXTREME value theory
- Abstract
The generalized extreme value (GEV) distribution is a popular model for analyzing and forecasting extreme weather data. To increase prediction accuracy, spatial information is often pooled via a latent Gaussian process (GP) on the GEV parameters. Inference for GEV‐GP models is typically carried out using Markov Chain Monte Carlo (MCMC) methods, or using approximate inference methods such as the integrated nested Laplace approximation (INLA). However, MCMC becomes prohibitively slow as the number of spatial locations increases, whereas INLA is applicable in practice only to a limited subset of GEV‐GP models. In this article, we revisit the original Laplace approximation for fitting spatial GEV models. In combination with a popular sparsity‐inducing spatial covariance approximation technique, we show through simulations that our approach accurately estimates the Bayesian predictive distribution of extreme weather events, is scalable to several thousand spatial locations, and is several orders of magnitude faster than MCMC. A case study in forecasting extreme snowfall across Canada is presented. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
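For a single location with no latent Gaussian process, the GEV fitting underlying this model class reduces to standard maximum likelihood, which is enough to illustrate the return-level quantity typically reported for extreme weather. Note that scipy parameterises the shape as c = −ξ relative to the common GEV convention; all numbers below are illustrative:

```python
import numpy as np
from scipy.stats import genextreme

# Annual-maximum style data for one location, refit by maximum likelihood.
true_c, true_loc, true_scale = -0.1, 20.0, 5.0
data = genextreme.rvs(true_c, loc=true_loc, scale=true_scale,
                      size=2000, random_state=np.random.default_rng(42))
c_hat, loc_hat, scale_hat = genextreme.fit(data)
# 20-year return level = the 1 - 1/20 quantile of the fitted distribution.
return_level = genextreme.ppf(1.0 - 1.0 / 20.0, c_hat, loc=loc_hat, scale=scale_hat)
print(c_hat, loc_hat, scale_hat, return_level)
```

The paper's contribution is pooling many such fits across space through a latent GP on the GEV parameters, with the Laplace approximation replacing MCMC for the latent field.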
16. Bayesian joint models for multi-regional clinical trials.
- Author
-
Bean, Nathan W, Ibrahim, Joseph G, and Psioda, Matthew A
- Subjects
-
CLINICAL trials, DRUG development, SURVIVAL analysis (Biometry), TREATMENT effectiveness, LAPLACE distribution, SAMPLE size (Statistics), PHARMACEUTICAL industry
- Abstract
In recent years, multi-regional clinical trials (MRCTs) have increased in popularity in the pharmaceutical industry due to their ability to accelerate the global drug development process. To address potential challenges with MRCTs, the International Council for Harmonisation released the E17 guidance document which suggests the use of statistical methods that utilize information borrowing across regions if regional sample sizes are small. We develop an approach that allows for information borrowing via Bayesian model averaging in the context of a joint analysis of survival and longitudinal data from MRCTs. In this novel application of joint models to MRCTs, we use Laplace's method to integrate over subject-specific random effects and to approximate posterior distributions for region-specific treatment effects on the time-to-event outcome. Through simulation studies, we demonstrate that the joint modeling approach can result in an increased rejection rate when testing the global treatment effect compared with methods that analyze survival data alone. We then apply the proposed approach to data from a cardiovascular outcomes MRCT. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
17. An evaluation of computational methods for aggregate data meta-analyses of diagnostic test accuracy studies
- Author
-
Yixin Zhao, Bilal Khan, and Zelalem F. Negeri
- Subjects
-
Meta-analysis, Diagnostic test accuracy, Generalized linear mixed models, Computational methods, Adaptive Gauss-Hermite, Laplace approximation, Medicine (General), R5-920
- Abstract
Background: A Generalized Linear Mixed Model (GLMM) is recommended to meta-analyze diagnostic test accuracy studies (DTAs) based on aggregate or individual participant data. Since a GLMM does not have a closed-form likelihood function or parameter solutions, computational methods are conventionally used to approximate the likelihoods and obtain parameter estimates. The most commonly used computational methods are the Iteratively Reweighted Least Squares (IRLS), the Laplace approximation (LA), and the Adaptive Gauss-Hermite quadrature (AGHQ). Despite being widely used, it has not been clear how these computational methods compare and perform in the context of an aggregate data meta-analysis (ADMA) of DTAs. Methods: We compared and evaluated the performance of three commonly used computational methods for GLMMs - the IRLS, the LA, and the AGHQ - via a comprehensive simulation study and real-life data examples, in the context of an ADMA of DTAs. By varying several parameters in our simulations, we assessed the performance of the three methods in terms of bias, root mean squared error, confidence interval (CI) width, coverage of the 95% CI, convergence rate, and computational speed. Results: For most of the scenarios, especially when the meta-analytic data were not sparse (i.e., there were no or negligible studies with perfect diagnosis), the three computational methods were comparable for the estimation of sensitivity and specificity. However, the LA had the largest bias and root mean squared error for pooled sensitivity and specificity when the meta-analytic data were sparse. Moreover, the AGHQ took a longer computational time to converge relative to the other two methods, although it had the best convergence rate. Conclusions: We recommend practitioners and researchers carefully choose an appropriate computational algorithm when fitting a GLMM to an ADMA of DTAs. We do not recommend the LA for sparse meta-analytic data sets; however, either the AGHQ or the IRLS can be used regardless of the characteristics of the meta-analytic data.
- Published
- 2024
- Full Text
- View/download PDF
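The relationship between the LA and the AGHQ compared in this study is concrete: adaptive Gauss-Hermite quadrature with a single node is exactly the Laplace approximation. A one-dimensional sketch for a binomial random-intercept integrand (the values of y, n, and σ are illustrative, not the meta-analytic model itself):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import expit
from numpy.polynomial.hermite import hermgauss

# y successes out of n trials, with logit(p) = u and u ~ N(0, sigma^2).
y, n, sigma = 7, 10, 1.5

def g(u):  # log of the integrand (log-likelihood + log prior density)
    return (y * np.log(expit(u)) + (n - y) * np.log(expit(-u))
            - 0.5 * u**2 / sigma**2 - 0.5 * np.log(2 * np.pi * sigma**2))

u = 0.0
for _ in range(50):  # Newton-Raphson for the mode of g
    p = expit(u)
    u -= (y - n * p - u / sigma**2) / (-n * p * (1 - p) - 1.0 / sigma**2)
H = n * expit(u) * (1 - expit(u)) + 1.0 / sigma**2  # curvature (negative Hessian)

def aghq(k):
    """Adaptive Gauss-Hermite quadrature with k nodes, centred at the mode
    and scaled by the curvature; k = 1 is exactly the Laplace approximation."""
    x, w = hermgauss(k)
    s = np.sqrt(2.0 / H)
    return s * np.sum(w * np.exp(g(u + s * x) + x**2))

exact = quad(lambda t: np.exp(g(t)), -20.0, 20.0)[0]
print(aghq(1), aghq(9), exact)  # more nodes tighten on the numerically exact value
```

The study's sparse-data finding matches this picture: when the integrand is skewed (few events), the one-node LA is biased while extra AGHQ nodes absorb the skewness at additional cost.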
19. CONSENSUS-BASED RARE EVENT ESTIMATION.
- Author
-
ALTHAUS, KONSTANTIN, PAPAIOANNOU, IASON, and ULLMANN, ELISABETH
- Subjects
-
INVERSE problems, EULER method, KALMAN filtering
- Abstract
In this paper, we introduce a new algorithm for rare event estimation based on adaptive importance sampling. We consider a smoothed version of the optimal importance sampling density, which is approximated by an ensemble of interacting particles. The particle dynamics is governed by a McKean--Vlasov stochastic differential equation, which was introduced and analyzed in [Carrillo et al., Stud. Appl. Math., 148 (2022), pp. 1069-1140] for consensus-based sampling and optimization of posterior distributions arising in the context of Bayesian inverse problems. We develop automatic updates for the internal parameters of our algorithm. This includes a novel time step size controller for the exponential Euler method, which discretizes the particle dynamics. The behavior of all parameter updates depends on easy-to-interpret accuracy criteria specified by the user. We show in numerical experiments that our method is competitive to state-of-the-art adaptive importance sampling algorithms for rare event estimation, namely a sequential importance sampling method and the ensemble Kalman filter for rare event estimation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
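The baseline that such consensus-based methods improve on, importance sampling with a proposal shifted toward the rare region, can be sketched for a Gaussian tail probability. The fixed mean shift is the simplifying assumption here; adaptive schemes like the paper's choose the proposal automatically:

```python
import numpy as np
from scipy.stats import norm

def is_tail_prob(t, n=100_000, seed=0):
    """Importance-sampling estimate of P(X > t) for X ~ N(0, 1),
    drawing from the shifted proposal N(t, 1) and reweighting."""
    rng = np.random.default_rng(seed)
    x = rng.normal(loc=t, size=n)            # proposal concentrated on the rare event
    w = norm.pdf(x) / norm.pdf(x, loc=t)     # likelihood ratio (importance weights)
    return float(np.mean(w * (x > t)))

t = 4.0
est, exact = is_tail_prob(t), norm.sf(t)
print(est, exact)  # both close to 3.17e-5; plain Monte Carlo would need ~1e8 draws
```

With a well-placed proposal the relative error here is under a percent at 10⁵ draws; the hard part, which the ensemble of interacting particles addresses, is finding such a proposal when the failure region is unknown or multimodal.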
20. A multinomial generalized linear mixed model for clustered competing risks data.
- Author
-
Laureano, Henrique Aparecido, Petterle, Ricardo Rasmussen, Silva, Guilherme Parreira da, Ribeiro Junior, Paulo Justiniano, and Bonat, Wagner Hugo
- Subjects
-
COMPETING risks, AUTOMATIC differentiation, MULTINOMIAL distribution, LATENT structure analysis, GAUSSIAN distribution, C++
- Abstract
Clustered competing risks data are a complex failure time data scheme. Their main characteristics are the cluster structure, which implies a latent within-cluster dependence between its elements, and multiple variables competing to be the one responsible for the occurrence of an event, the failure. To handle this kind of data, we propose a full likelihood approach based on generalized linear mixed models instead of the usual complex frailty models. We model the competing causes on the probability scale, in terms of the cumulative incidence function (CIF). A multinomial distribution is assumed for the competing causes and censorship, conditioned on the latent effects that are accommodated by a multivariate Gaussian distribution. The CIF is specified as the product of an instantaneous risk level function with a failure time trajectory level function. The estimation procedure is performed through the R package Template Model Builder (TMB), a C++-based framework with efficient Laplace approximation and automatic differentiation routines. A large simulation study was performed, based on different latent structure formulations. The model fitting was challenging, and our results indicated that a latent structure where both risk and failure time trajectory levels are correlated is required to reach reasonable estimates. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
21. Generative models and Bayesian inversion using Laplace approximation.
- Author
-
Marschall, Manuel, Wübbeler, Gerd, Schmähling, Franko, and Elster, Clemens
- Subjects
- *
PROBABILISTIC generative models , *DATA distribution , *MAGNETIC resonance imaging , *PROBABILITY density function , *INVERSE problems - Abstract
The Bayesian approach to solving inverse problems relies on the choice of a prior. This critical ingredient allows expert knowledge or physical constraints to be formulated in a probabilistic fashion and plays an important role for the success of the inference. Recently, Bayesian inverse problems were solved using generative models as highly informative priors. Generative models are a popular tool in machine learning to generate data whose properties closely resemble those of a given database. Typically, the generated distribution of data is embedded in a low-dimensional manifold. For the inverse problem, a generative model is trained on a database that reflects the properties of the sought solution, such as typical structures of the tissue in the human brain in magnetic resonance imaging. The inference is carried out in the low-dimensional manifold determined by the generative model that strongly reduces the dimensionality of the inverse problem. However, this procedure produces a posterior that does not admit a Lebesgue density in the actual variables and the accuracy attained can strongly depend on the quality of the generative model. For linear Gaussian models, we explore an alternative Bayesian inference based on probabilistic generative models; this inference is carried out in the original high-dimensional space. A Laplace approximation is employed to analytically derive the prior probability density function required, which is induced by the generative model. Properties of the resulting inference are investigated. Specifically, we show that derived Bayes estimates are consistent, in contrast to the approach in which the low-dimensional manifold of the generative model is employed. The MNIST data set is used to design numerical experiments that confirm our theoretical findings. It is shown that the approach proposed can be advantageous when the information contained in the data is high and a simple heuristic is considered for the detection of this case. 
Finally, the pros and cons of both approaches are discussed. [ABSTRACT FROM AUTHOR]
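At its core, the Laplace approximation used here replaces an integrand by a Gaussian matched at its mode. A minimal generic sketch, deliberately unrelated to the authors' generative-model setting: recovering Stirling's formula by applying Laplace's method to the Gamma integral.

```python
import math

def laplace_log_integral(h, hpp, x0):
    """Laplace approximation to log ∫ exp(h(x)) dx around the mode x0:
    log ≈ h(x0) + 0.5 * log(2π / |h''(x0)|)."""
    return h(x0) + 0.5 * math.log(2 * math.pi / abs(hpp(x0)))

# Demo: n! = ∫_0^∞ exp(n log x - x) dx has its mode at x0 = n.
n = 100
h = lambda x: n * math.log(x) - x
hpp = lambda x: -n / x ** 2
log_approx = laplace_log_integral(h, hpp, n)   # Stirling's formula
log_exact = math.lgamma(n + 1)                  # log(100!)
```

For n = 100 the log-scale error is about 1/(12n) ≈ 8 × 10⁻⁴, the leading term of the Stirling series, illustrating why the approximation is so sharp when the integrand is concentrated.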
- Published
- 2024
- Full Text
- View/download PDF
22. Chi-Square Approximation for the Distribution of Individual Eigenvalues of a Singular Wishart Matrix.
- Author
-
Shimizu, Koki and Hashiguchi, Hiroki
- Subjects
- *
CHI-square distribution , *WISHART matrices , *EIGENVALUES , *DEGREES of freedom , *MATRIX functions - Abstract
This paper discusses the approximate distributions of eigenvalues of a singular Wishart matrix. We give the approximate joint density of eigenvalues by Laplace approximation for the hypergeometric functions of matrix arguments. Furthermore, we show that the distribution of each eigenvalue can be approximated by the chi-square distribution with varying degrees of freedom when the population eigenvalues are infinitely dispersed. The derived result is applied to testing the equality of eigenvalues in two populations. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
23. A Bayesian survival treed hazards model using latent Gaussian processes.
- Author
-
Payne, Richard D, Guha, Nilabja, and Mallick, Bani K
- Subjects
- *
GAUSSIAN processes , *MARKOV chain Monte Carlo , *PARTITION functions , *HAZARDS , *CIRRHOSIS of the liver - Abstract
Survival models are used to analyze time-to-event data in a variety of disciplines. Proportional hazard models provide interpretable parameter estimates, but proportional hazard assumptions are not always appropriate. Non-parametric models are more flexible but often lack a clear inferential framework. We propose a Bayesian treed hazards partition model that is both flexible and inferential. Inference is obtained through the posterior tree structure and flexibility is preserved by modeling the log-hazard function in each partition using a latent Gaussian process. An efficient reversible jump Markov chain Monte Carlo algorithm is accomplished by marginalizing the parameters in each partition element via a Laplace approximation. Consistency properties for the estimator are established. The method can be used to help determine subgroups as well as prognostic and/or predictive biomarkers in time-to-event data. The method is compared with some existing methods on simulated data and a liver cirrhosis dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. A fast method for fitting integrated species distribution models
- Author
-
Elliot Dovers, Gordana C. Popovic, and David I. Warton
- Subjects
data fusion ,data integration ,ecology ,Laplace approximation ,log‐Gaussian Cox process ,presence/absence data ,Ecology ,QH540-549.5 ,Evolution ,QH359-425 - Abstract
Abstract Integrated distribution models (IDMs) predict where species might occur using data from multiple sources, a technique thought to be especially useful when data from any individual source are scarce. Recent advances allow us to fit such models with latent terms to account for dependence within and between data sources, but they are computationally challenging to fit. We propose a fast new methodology for fitting integrated distribution models using presence/absence and presence‐only data, via a spatial random effects approach combined with automatic differentiation. We have written an R package (called scampr) for straightforward implementation of our approach. We use simulation to demonstrate that our approach has comparable performance to INLA—a common framework for fitting IDMs—but with computation times up to an order of magnitude faster. We also use simulation to look at when IDMs can be expected to outperform models fitted to a single data source, and find that the amount of benefit gained from using an IDM is a function of the relative amount of additional information available from incorporating a second data source into the model. We apply our method to predict 29 plant species in NSW, Australia, and find particular benefit in predictive performance when data from a single source are scarce and when compared to models for presence‐only data. Our faster methods of fitting IDMs make it feasible to more deeply explore the model space (e.g. comparing different ways to model latent terms), and in future work, to consider extensions to more complex models, for example the multi‐species setting.
- Published
- 2024
- Full Text
- View/download PDF
25. Sex and diel period influence patterns of resource selection in elk.
- Author
-
Padilla, Benjamin J., Banfield, Jeremiah E., and Larkin, Jeffery L.
- Subjects
- *
ELK , *WILDLIFE management , *ANIMAL ecology , *EXTRATERRESTRIAL resources , *HABITAT partitioning (Ecology) , *ECOLOGISTS - Abstract
Resource selection and space use are important aspects of an animal's ecology and understanding these behaviors is necessary for proper wildlife management. We used mixed‐effect integrated step‐selection models to evaluate seasonal variation in resource selection between male and female elk (Cervus canadensis) and diel periods in central Pennsylvania, USA. Resource selection varied seasonally, between sexes, and across diel periods. These results demonstrate strong seasonal sexual segregation in resource use, and movements between habitats throughout the day, highlighting the dynamic nature of resource selection by elk and underscoring the importance of considering sexual variation at multiple temporal scales when designing ungulate management strategies. Finally, we developed habitat suitability maps for male and female elk in the Pennsylvania Elk Management Area. Wildlife ecologists and managers must consider multiple sources of variation in habitat use and resource selection, particularly for large mobile species such as elk. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
26. A fast method for fitting integrated species distribution models.
- Author
-
Dovers, Elliot, Popovic, Gordana C., and Warton, David I.
- Subjects
SPECIES distribution ,AUTOMATIC differentiation ,PLANT species ,DATA modeling ,DATA integration - Abstract
Integrated distribution models (IDMs) predict where species might occur using data from multiple sources, a technique thought to be especially useful when data from any individual source are scarce. Recent advances allow us to fit such models with latent terms to account for dependence within and between data sources, but they are computationally challenging to fit. We propose a fast new methodology for fitting integrated distribution models using presence/absence and presence‐only data, via a spatial random effects approach combined with automatic differentiation. We have written an R package (called scampr) for straightforward implementation of our approach. We use simulation to demonstrate that our approach has comparable performance to INLA—a common framework for fitting IDMs—but with computation times up to an order of magnitude faster. We also use simulation to look at when IDMs can be expected to outperform models fitted to a single data source, and find that the amount of benefit gained from using an IDM is a function of the relative amount of additional information available from incorporating a second data source into the model. We apply our method to predict 29 plant species in NSW, Australia, and find particular benefit in predictive performance when data from a single source are scarce and when compared to models for presence‐only data. Our faster methods of fitting IDMs make it feasible to more deeply explore the model space (e.g. comparing different ways to model latent terms), and in future work, to consider extensions to more complex models, for example the multi‐species setting. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. Bayesian design of multi‐regional clinical trials with time‐to‐event endpoints.
- Author
-
Bean, Nathan William, Ibrahim, Joseph George, and Psioda, Matthew Austin
- Subjects
- *
EXPERIMENTAL design , *PROPORTIONAL hazards models , *LOG-rank test , *LAPLACE distribution - Abstract
Sponsors often rely on multi‐regional clinical trials (MRCTs) to introduce new treatments more rapidly into the global market. Many commonly used statistical methods do not account for regional differences, and small regional sample sizes frequently result in lower estimation quality of region‐specific treatment effects. The International Council for Harmonization E17 guidelines suggest consideration of methods that allow for information borrowing across regions to improve estimation. In response to these guidelines, we develop a novel methodology to estimate global and region‐specific treatment effects from MRCTs with time‐to‐event endpoints using Bayesian model averaging (BMA). This approach accounts for the possibility of heterogeneous treatment effects between regions, and we discuss how to assess the consistency of these effects using posterior model probabilities. We obtain posterior samples of the treatment effects using a Laplace approximation, and we show through simulation studies that the proposed modeling approach estimates region‐specific treatment effects with lower mean squared error than a Cox proportional hazards model while resulting in a similar rejection rate of the global treatment effect. We then apply the BMA approach to data from the LEADER trial, an MRCT designed to evaluate the cardiovascular safety of an anti‐diabetic treatment. [ABSTRACT FROM AUTHOR]
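The model-averaging step itself — weighting each candidate model by its posterior model probability — is generic and can be sketched in a few lines (the log marginal likelihoods, prior weights and effect estimates below are hypothetical, not values from the LEADER trial):

```python
import math

def bma_weights(log_marglik, prior=None):
    """Posterior model probabilities from log marginal likelihoods:
    w_k ∝ p(M_k) * p(y | M_k), normalized stably via the max trick."""
    K = len(log_marglik)
    prior = prior or [1.0 / K] * K
    logw = [lm + math.log(p) for lm, p in zip(log_marglik, prior)]
    m = max(logw)                       # subtract max before exponentiating
    unnorm = [math.exp(l - m) for l in logw]
    s = sum(unnorm)
    return [u / s for u in unnorm]

# Hypothetical example: two candidate models for a regional effect.
w = bma_weights([-104.2, -106.5])        # model 1 is favoured
theta_bma = w[0] * 0.30 + w[1] * 0.12    # model-averaged effect estimate
```

The posterior model probabilities also serve as the consistency assessment the abstract mentions: a weight near 1 on a single model indicates homogeneous regional effects.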
- Published
- 2023
- Full Text
- View/download PDF
28. Reduction of testing effort for fatigue tests: Application of Bayesian optimal experimental design.
- Author
-
Frie, Christian, Kolyshkin, Anton, Mordeja, Sven, Riza Durmaz, Ali, and Eberl, Chris
- Subjects
- *
OPTIMAL designs (Statistics) , *FATIGUE testing machines , *FATIGUE limit , *GAUSSIAN distribution - Abstract
S‐N curve parameter determination is a time‐ and cost‐intensive procedure. A standardized method for simultaneously determining all S‐N curve parameters with minimum testing effort is still missing. The Bayesian optimal experimental design (BOED) approach can reduce testing effort and accelerate uncertainty reduction during fatigue testing for S‐N curve parameters. The concept is applicable to all S‐N curve models and is illustrated here for a bilinear S‐N curve model as an example. We demonstrate the fatigue testing workflow for the bilinear S‐N curve in detail while discussing steps and challenges when generalizing to other S‐N curve models. Applying BOED to the bilinear S‐N curve model, small errors and uncertainties for all S‐N curve parameters are obtained after only 10 experiments for data scatter values below 1.1. In that regime, the relative error in fatigue limit estimation was less than 1% after five tests. When the S‐N data scatter exceeds 1.2, 17 tests were required for robust analysis. The BOED methodology should be applied to other S‐N curve models in the future. The high computational effort and the approximation of the posterior distribution with a normal distribution are the limitations of the presented BOED approach. [ABSTRACT FROM AUTHOR]
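The expected information gain that BOED maximizes can be illustrated with the standard nested Monte Carlo estimator on a toy linear Gaussian design, where the answer is known in closed form (a generic sketch with illustrative numbers, not the authors' S‐N curve model):

```python
import math
import random

def eig_nested_mc(d, sigma=1.0, tau=1.0, n_out=1000, n_in=1000, seed=1):
    """Nested Monte Carlo estimate of the expected information gain
    EIG(d) = E_y[log p(y|θ) - log p(y)] for y = d·θ + ε,
    with prior θ ~ N(0, τ²) and noise ε ~ N(0, σ²)."""
    rng = random.Random(seed)
    def loglik(y, th):
        r = y - d * th
        return -0.5 * math.log(2 * math.pi * sigma ** 2) - r * r / (2 * sigma ** 2)
    total = 0.0
    for _ in range(n_out):
        th = rng.gauss(0, tau)
        y = d * th + rng.gauss(0, sigma)
        # inner Monte Carlo estimate of the evidence p(y)
        inner = sum(math.exp(loglik(y, rng.gauss(0, tau))) for _ in range(n_in))
        total += loglik(y, th) - math.log(inner / n_in)
    return total / n_out

d = 2.0
eig_hat = eig_nested_mc(d)
eig_true = 0.5 * math.log(1 + d * d)   # analytic EIG for this model
```

The inner loop, which estimates the evidence p(y) for every outer sample, is the expensive part that Laplace-based shortcuts to the inner integral aim to avoid.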
- Published
- 2023
- Full Text
- View/download PDF
29. Comparing estimation approaches for generalized additive mixed models with binary outcomes.
- Author
-
Mullah, Muhammad Abu Shadeque, Hossain, Zakir, and Benedetti, Andrea
- Subjects
- *
MULTICOLLINEARITY , *CURVE fitting , *MARKOV chain Monte Carlo , *DEMOGRAPHIC surveys - Abstract
Generalized additive mixed models (GAMMs) extend generalized linear mixed models (GLMMs) to allow the covariates to be nonparametrically associated with the response. Estimation of such models for correlated binary data is challenging and estimation techniques often yield contrasting results. Via simulations, we compared the performance of the Bayesian and likelihood-based methods for estimating the components of GAMMs under a wide range of conditions. For the Bayesian method, we also assessed the sensitivity of the results to the choice of prior distributions of the variance components. In addition, we investigated the effect of multicollinearity among covariates on the estimation of the model components. We then applied the methods to the Bangladesh Demographic Health Survey data to identify the factors associated with the malnutrition of children in Bangladesh. While no method uniformly performed best in estimating all components of the model, the Bayesian method using half-Cauchy priors for variance components generally performed better, especially for small cluster size. The overall curve fitting performance was sensitive to the prior selection for the Bayesian methods and also to the extent of multicollinearity. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
30. Laplace based Bayesian inference for ordinary differential equation models using regularized artificial neural networks.
- Author
-
Kwok, Wai M., Streftaris, George, and Dass, Sarat C.
- Abstract
Parameter estimation and associated uncertainty quantification is an important problem in dynamical systems characterised by ordinary differential equation (ODE) models that are often nonlinear. Typically, such models have analytically intractable trajectories which result in likelihoods and posterior distributions that are similarly intractable. Bayesian inference for ODE systems via simulation methods require numerical approximations to produce inference with high accuracy at a cost of heavy computational power and slow convergence. At the same time, Artificial Neural Networks (ANN) offer tractability that can be utilized to construct an approximate but tractable likelihood and posterior distribution. In this paper we propose a hybrid approach, where Laplace-based Bayesian inference is combined with an ANN architecture for obtaining approximations to the ODE trajectories as a function of the unknown initial values and system parameters. Suitable choices of customized loss functions are proposed to fine tune the approximated ODE trajectories and the subsequent Laplace approximation procedure. The effectiveness of our proposed methods is demonstrated using an epidemiological system with non-analytical solutions—the Susceptible-Infectious-Removed (SIR) model for infectious diseases—based on simulated and real-life influenza datasets. The novelty and attractiveness of our proposed approach include (i) a new development of Bayesian inference using ANN architectures for ODE based dynamical systems, and (ii) a computationally fast posterior inference by avoiding convergence issues of benchmark Markov Chain Monte Carlo methods. These two features establish the developed approach as an accurate alternative to traditional Bayesian computational methods, with improved computational cost. [ABSTRACT FROM AUTHOR]
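The SIR system used as a test bed has no closed-form trajectory, but a fixed-step RK4 integrator gives a numerical reference solution in a few lines (a plain solver for the proportions form of the model; β, γ and the initial values are illustrative, and this is not the authors' ANN surrogate):

```python
def simulate_sir(beta, gamma, s0, i0, r0=0.0, days=100, dt=0.1):
    """Fixed-step RK4 integration of the SIR equations (proportions):
    dS/dt = -βSI,  dI/dt = βSI - γI,  dR/dt = γI."""
    def f(state):
        s, i, r = state
        return (-beta * s * i, beta * s * i - gamma * i, gamma * i)
    state = (s0, i0, r0)
    traj = [state]
    for _ in range(int(days / dt)):
        k1 = f(state)
        k2 = f(tuple(x + dt / 2 * k for x, k in zip(state, k1)))
        k3 = f(tuple(x + dt / 2 * k for x, k in zip(state, k2)))
        k4 = f(tuple(x + dt * k for x, k in zip(state, k3)))
        state = tuple(x + dt / 6 * (a + 2 * b + 2 * c + e)
                      for x, a, b, c, e in zip(state, k1, k2, k3, k4))
        traj.append(state)
    return traj

# Illustrative parameters: basic reproduction number R0 = β/γ = 2.
traj = simulate_sir(beta=0.5, gamma=0.25, s0=0.99, i0=0.01)
```

Because the right-hand sides sum to zero, S + I + R is conserved to machine precision — a useful sanity check on any such solver.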
- Published
- 2023
- Full Text
- View/download PDF
31. Introduction to Generalized Linear Mixed Models
- Author
-
Xia, Yinglin, Sun, Jun, Xia, Yinglin, and Sun, Jun
- Published
- 2023
- Full Text
- View/download PDF
32. Asymptotic behavior of the distributions of eigenvalues for beta-Wishart ensemble under the dispersed population eigenvalues.
- Author
-
Nasuda, Ryo, Shimizu, Koki, and Hashiguchi, Hiroki
- Subjects
- *
ASYMPTOTIC distribution , *EIGENVALUES , *MONTE Carlo method , *GAMMA distributions , *HYPERGEOMETRIC functions , *GAMMA functions - Abstract
We propose a Laplace approximation of the hypergeometric function with two matrix arguments expanded by Jack polynomials. This type of hypergeometric function appears in the joint density of eigenvalues of the beta-Wishart matrix for parameters β = 1 , 2 , 4 , where the matrix indicates the cases for reals, complexes, and quaternions, respectively. Using the Laplace approximations, we show that the joint density of the eigenvalues can be expressed using gamma density functions when population eigenvalues are infinitely dispersed. In general, for the parameter β > 0 , we also show that the distribution of the eigenvalue can be approximated by gamma distributions through broken arrow matrices. We compare approximated gamma distributions with empirical distributions by Monte Carlo simulation. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
33. Maximum Likelihood Algorithm for Spatial Generalized Linear Mixed Models without Numerical Evaluations of Intractable Integrals.
- Author
-
Zhang, Tonglin
- Subjects
- *
EXPECTATION-maximization algorithms , *RANDOM effects model , *INTEGRALS , *GAUSSIAN distribution , *ALGORITHMS - Abstract
Spatial generalized linear mixed effects models are popular in spatial or spatiotemporal data analysis when the responses are counts and the random effects are modeled by multivariate normal distributions. Direct computation of the MLEs of model parameters is impossible because the likelihood functions contain high-dimensional intractable integrals. To overcome the difficulty, a new method called the prediction-maximization algorithm is proposed. The method has a maximization step for the MLEs of spatial linear mixed effects models for normal responses and a prediction step for the prediction of the random effects. Neither involves high-dimensional intractable integrals. Because only algorithms for normal responses are needed, the derivation of the MLEs of a spatial generalized linear mixed effects model for count responses by the proposed method is not computationally harder than for a model with normal responses. The simulation study shows that the performance of the proposed method is comparable to that of the previous maximum likelihood algorithms formulated by high-order Laplace approximations and is better than that of Bayesian methods formulated by MCMC algorithms. Supplementary materials for this article are available online. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
34. Penalty parameter selection and asymmetry corrections to Laplace approximations in Bayesian P-splines models.
- Author
-
Lambert, Philippe and Gressani, Oswaldo
- Subjects
- *
LAPLACE distribution , *GAUSSIAN Markov random fields , *MARKOV chain Monte Carlo , *SPLINES - Abstract
Laplace P-splines (LPS) combine the P-splines smoother and the Laplace approximation in a unifying framework for fast and flexible inference under the Bayesian paradigm. The Gaussian Markov random field prior assumed for penalized parameters and the Bernstein-von Mises theorem typically ensure a razor-sharp accuracy of the Laplace approximation to the posterior distribution of these quantities. This accuracy can be seriously compromised for some unpenalized parameters, especially when the information synthesized by the prior and the likelihood is sparse. Therefore, we propose a refined version of the LPS methodology by splitting the parameter space in two subsets. The first set involves parameters for which the joint posterior distribution is approached from a non-Gaussian perspective with an approximation scheme tailored to capture asymmetric patterns, while the posterior distribution for the penalized parameters in the complementary set undergoes the LPS treatment with Laplace approximations. As such, the dichotomization of the parameter space provides the necessary structure for a separate treatment of model parameters, yielding improved estimation accuracy as compared to a setting where posterior quantities are uniformly handled with Laplace. In addition, the proposed enriched version of LPS remains entirely sampling-free, so that it operates at a computing speed that is far from reach to any existing Markov chain Monte Carlo approach. The methodology is illustrated on the additive proportional odds model with an application on ordinal survey data. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
35. Multivariate generalized linear mixed models for underdispersed count data.
- Author
-
da Silva, Guilherme Parreira, Laureano, Henrique Aparecido, Petterle, Ricardo Rasmussen, Ribeiro Jr, Paulo Justiniano, and Bonat, Wagner Hugo
- Subjects
- *
COVARIANCE matrices , *POISSON regression , *NATIONAL Health & Nutrition Examination Survey , *AUTOMATIC differentiation , *REGRESSION analysis , *AKAIKE information criterion , *MAXIMUM likelihood statistics - Abstract
Researchers are often interested in understanding the relationship between a set of covariates and a set of response variables. To achieve this goal, regression analysis, with either linear or generalized linear models, is widely applied. However, such models only allow users to model one response variable at a time. Moreover, it is not possible to directly calculate from the regression model a correlation measure between the response variables. In this article, we employed the Multivariate Generalized Linear Mixed Models framework, which allows the specification of a set of response variables and calculates the correlation between them through a random effect structure that follows a multivariate normal distribution. We used the maximum likelihood estimation framework to estimate all model parameters, using the Laplace approximation to integrate out the random effects. The derivatives are provided by automatic differentiation. The outer maximization was made using general-purpose algorithms such as PORT and the Broyden–Fletcher–Goldfarb–Shanno algorithm (BFGS). We delimited this problem by studying count response variables with the following distributions: Poisson, negative binomial, Conway–Maxwell–Poisson (COM-Poisson), and double Poisson. While the first distribution can model only equidispersed data, the second models equi- and overdispersed data, and the third and fourth model all types of dispersion (i.e. including underdispersion). The models were implemented in R with the package TMB, based on C++ templates. Besides the full specification, models with simpler structures in the covariance matrix were considered (fixed and common variance, and ρ set to 0), as well as fixed dispersion. These models were applied to a dataset from the National Health and Nutrition Examination Survey, in which two response variables are underdispersed and one can be considered equidispersed, measured on 1281 subjects.
The double Poisson full model specification outperformed the other three competitors on three goodness-of-fit measures: Akaike information criterion (AIC), Bayesian information criterion (BIC), and maximized log-likelihood. Consequently, it estimated parameters with smaller standard errors and a greater number of significant correlation coefficients. Therefore, the proposed model can deal with multivariate count responses and measure the correlation between them while taking into account the effects of the covariates. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
36. Profile Likelihood for Hierarchical Models Using Data Doubling.
- Author
-
Lele, Subhash R.
- Subjects
- *
MATHEMATICAL proofs , *MAXIMUM likelihood statistics , *FREQUENTIST statistics , *INFERENTIAL statistics , *CONSTRAINED optimization , *HIERARCHICAL Bayes model - Abstract
In scientific problems, an appropriate statistical model often involves a large number of canonical parameters. Often times, the quantities of scientific interest are real-valued functions of these canonical parameters. Statistical inference for a specified function of the canonical parameters can be carried out via the Bayesian approach by simply using the posterior distribution of the specified function of the parameter of interest. Frequentist inference is usually based on the profile likelihood for the parameter of interest. When the likelihood function is analytical, computing the profile likelihood is simply a constrained optimization problem with many numerical algorithms available. However, for hierarchical models, computing the likelihood function and hence the profile likelihood function is difficult because of the high-dimensional integration involved. We describe a simple computational method to compute profile likelihood for any specified function of the parameters of a general hierarchical model using data doubling. We provide a mathematical proof for the validity of the method under regularity conditions that assure that the distribution of the maximum likelihood estimator of the canonical parameters is non-singular, multivariate, and Gaussian. [ABSTRACT FROM AUTHOR]
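The notion of a profile likelihood is easiest to see in a model where the inner maximization over the nuisance parameter is analytic — a normal sample with the variance profiled out. This textbook illustration (with made-up data) shows the object that the data-doubling method computes for far less tractable hierarchical models:

```python
import math

def profile_loglik_mu(mu, data):
    """Profile log-likelihood for the mean of a normal sample:
    σ² is profiled out analytically, σ̂²(μ) = mean((x - μ)²)."""
    n = len(data)
    s2_hat = sum((x - mu) ** 2 for x in data) / n
    return -0.5 * n * (math.log(2 * math.pi * s2_hat) + 1)

data = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6]   # made-up observations
xbar = sum(data) / len(data)
# The profile is maximized at the MLE — here, the sample mean.
grid = [xbar - 0.5 + 0.01 * k for k in range(101)]
best = max(grid, key=lambda m: profile_loglik_mu(m, data))
```

For hierarchical models the inner maximization has no closed form, which is exactly the difficulty the data-doubling device addresses.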
- Published
- 2023
- Full Text
- View/download PDF
37. State‐space models for ecological time‐series data: Practical model‐fitting
- Author
-
Ken Newman, Ruth King, Víctor Elvira, Perry deValpine, Rachel S. McCrea, and Byron J. T. Morgan
- Subjects
hidden Markov model ,Kalman filter ,Laplace approximation ,likelihood‐free methods ,Markov chain Monte Carlo ,sampling‐based methods ,Ecology ,QH540-549.5 ,Evolution ,QH359-425 - Abstract
Abstract State‐space models are an increasingly common and important tool in the quantitative ecologists’ armoury, particularly for the analysis of time‐series data. This is due to both their flexibility and intuitive structure, describing the different individual processes of a complex system, thus simplifying the model specification step. State‐space models are composed of two processes (a) the system (or state) process that describes the dynamics of the true underlying state of the system over time; and (b) the observation process that links the observed data with the current true state of the system at that time. Specification of the general model structure consists of considering each distinct ecological process within the system and observation processes, which are then automatically combined within the state‐space structure. There is typically a trade‐off between the complexity of the model and the associated model‐fitting process. Simpler model specifications permit the application of simpler model‐fitting tools; whereas more complex model specifications, with nonlinear dynamics and/or non‐Gaussian stochasticity often require more sophisticated model‐fitting algorithms to be applied. We provide a brief overview of general state‐space models before focusing on the different model‐fitting tools available. In particular for different general state‐space model structures we discuss established model‐fitting tools that are available. We also offer practical guidance for choosing a specific fitting procedure.
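For the simplest case — linear dynamics with Gaussian noise — the Kalman filter mentioned above is the exact fitting tool. A one-dimensional local-level filter fits in a dozen lines (the observations and noise variances here are illustrative):

```python
def kalman_1d(ys, q, r, m0=0.0, p0=1.0):
    """Kalman filter for the local-level state-space model:
    state    x_t = x_{t-1} + w_t,  w_t ~ N(0, q)
    observed y_t = x_t + v_t,      v_t ~ N(0, r)."""
    m, p = m0, p0
    means = []
    for y in ys:
        p = p + q                    # predict: propagate uncertainty
        k = p / (p + r)              # Kalman gain
        m = m + k * (y - m)          # update mean toward the observation
        p = (1 - k) * p              # update (shrink) variance
        means.append(m)
    return means

# Noisy observations of a state near 10; diffuse prior via large p0.
ys = [9.8, 10.4, 10.1, 9.7, 10.2, 10.0, 9.9, 10.3]
means = kalman_1d(ys, q=0.01, r=0.5, p0=100.0)
```

Nonlinear or non-Gaussian system/observation processes break these closed-form updates, which is where the sampling-based and Laplace-based tools surveyed in the paper come in.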
- Published
- 2023
- Full Text
- View/download PDF
38. A Statistical Review of Template Model Builder: A Flexible Tool for Spatial Modelling.
- Author
-
Osgood‐Zimmerman, Aaron and Wakefield, Jon
- Subjects
- *
STOCHASTIC partial differential equations , *GAUSSIAN Markov random fields , *RANDOM effects model - Abstract
Summary: The integrated nested Laplace approximation (INLA) is a well‐known and popular technique for spatial modelling with a user‐friendly interface in the R‐INLA package. Unfortunately, only a certain class of latent Gaussian models are amenable to fitting with INLA. In this paper, we review template model builder (TMB), an existing technique and software package which is well‐suited to fitting complex spatio‐temporal models. TMB is relatively unknown to the spatial statistics community, but it is a flexible random effects modelling tool which allows users to define customizable and complex mixed effects models through C++ templates. After contrasting the methodology behind TMB with INLA, we provide a large‐scale simulation study assessing and comparing R‐INLA and TMB for continuous spatial models, fitted via the stochastic partial differential equations (SPDE) approximation. The results show that the predictive fields from both methods are comparable in most situations even though TMB estimates for fixed or random effects may have slightly larger bias than R‐INLA. We also present a smaller discrete spatial simulation study, in which both approaches perform well. We conclude with a joint analysis of breast cancer incidence and mortality data implemented in TMB which requires a model which cannot be fit with R‐INLA. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
39. An approximate Bayesian approach for estimation of the instantaneous reproduction number under misreported epidemic data.
- Author
-
Gressani, Oswaldo, Faes, Christel, and Hens, Niel
- Abstract
In epidemic models, the effective reproduction number is of central importance to assess the transmission dynamics of an infectious disease and to orient health intervention strategies. Publicly shared data during an outbreak often suffers from two sources of misreporting (underreporting and delay in reporting) that should not be overlooked when estimating epidemiological parameters. The main statistical challenge in models that intrinsically account for a misreporting process lies in the joint estimation of the time‐varying reproduction number and the delay/underreporting parameters. Existing Bayesian approaches typically rely on Markov chain Monte Carlo algorithms that are extremely costly from a computational perspective. We propose a much faster alternative based on Laplacian‐P‐splines (LPS) that combines Bayesian penalized B‐splines for flexible and smooth estimation of the instantaneous reproduction number and Laplace approximations to selected posterior distributions for fast computation. Assuming a known generation interval distribution, the incidence at a given calendar time is governed by the epidemic renewal equation and the delay structure is specified through a composite link framework. Laplace approximations to the conditional posterior of the spline vector are obtained from analytical versions of the gradient and Hessian of the log‐likelihood, implying a drastic speed‐up in the computation of posterior estimates. Furthermore, the proposed LPS approach can be used to obtain point estimates and approximate credible intervals for the delay and reporting probabilities. Simulation of epidemics with different combinations for the underreporting rate and delay structure (one‐day, two‐day, and weekend delays) show that the proposed LPS methodology delivers fast and accurate estimates outperforming existing methods that do not take into account underreporting and delay patterns. Finally, LPS is illustrated in two real case studies of epidemic outbreaks. 
[ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
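The renewal equation underpinning this abstract is easy to illustrate: the expected incidence at time t equals R_t times the convolution of past incidence with the generation-interval distribution, so a crude plug-in estimate of R_t is that ratio. A minimal sketch with made-up case counts and weights; the authors' LPS machinery (splines, Laplace approximations, and the misreporting model) is omitted entirely.

```python
# Plug-in renewal-equation estimate of the instantaneous reproduction
# number R_t: R_t = I_t / sum_s w_s * I_{t-s}.  Data are hypothetical.

def reproduction_numbers(incidence, gen_interval):
    """Return R_t for each t where the renewal denominator is positive."""
    R = {}
    for t in range(1, len(incidence)):
        # total infectiousness: past incidence convolved with the
        # generation-interval weights (gen_interval[0] is the 1-day lag)
        lam = sum(w * incidence[t - 1 - s]
                  for s, w in enumerate(gen_interval)
                  if t - 1 - s >= 0)
        if lam > 0:
            R[t] = incidence[t] / lam
    return R

# toy incidence series that grows and then shrinks
cases = [10, 12, 15, 18, 20, 18, 15, 12]
w = [0.3, 0.4, 0.2, 0.1]          # generation-interval weights, summing to 1
R_t = reproduction_numbers(cases, w)
```

With these numbers the estimate sits above 1 while cases grow and drops below 1 as they decline, which is the qualitative behaviour the reproduction number is meant to capture.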
40. Federated learning algorithms for generalized mixed-effects model (GLMM) on horizontally partitioned data from distributed sources
- Author
-
Wentao Li, Jiayi Tong, Md. Monowar Anjum, Noman Mohammed, Yong Chen, and Xiaoqian Jiang
- Subjects
GLMM ,Federated learning ,Mixed effects ,Laplace approximation ,Gauss–Hermite approximation ,Computer applications to medicine. Medical informatics ,R858-859.7 - Abstract
Abstract Objectives This paper developed federated solutions based on two approximation algorithms to achieve federated generalized linear mixed effect models (GLMM). The paper also proposed a solution for numerical errors and singularity issues, and showed that the two proposed methods perform well in revealing the significance of parameters in distributed datasets, compared to a centralized GLMM algorithm from the R package (‘lme4’) as the baseline model. Methods The log-likelihood function of GLMM is approximated by two numerical methods (Laplace approximation and Gauss–Hermite approximation, abbreviated as LA and GH), which support federated decomposition of GLMM to bring computation to the data. To address the numerical errors and singularity issues, the lossless log-sum-exponential trick and an adaptive regularization strategy were used to tackle the problems caused by federated settings. Results Our proposed method can handle GLMM to accommodate hierarchical data with multiple non-independent levels of observations in a federated setting. The experiment results demonstrate comparable (LA) and superior (GH) performance on simulated and real-world data. Conclusion We modified and compared federated GLMMs with different approximations, which can support researchers in analyzing versatile biomedical data to accommodate mixed effects and address non-independence due to hierarchical structures (i.e., institutes, regions, countries, etc.).
- Published
- 2022
- Full Text
- View/download PDF
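The log-sum-exponential trick that the abstract credits with fixing numerical errors can be shown in isolation: shifting by the maximum before exponentiating evaluates log Σ exp(x_i) without overflow, and the shift is mathematically lossless. A small self-contained sketch with illustrative values, not the federated GLMM code itself:

```python
# Log-sum-exp trick: subtract the maximum so every exponent is <= 0,
# then add the maximum back outside the log.  Exact up to rounding.
import math

def log_sum_exp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

# naive math.exp(1000.0) overflows to inf; the shifted form is fine
vals = [1000.0, 1000.5, 999.0]
lse = log_sum_exp(vals)
```

The result lies between max(vals) and max(vals) + log(n), which is a handy sanity check when the terms are large log-likelihood contributions.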
41. Unit gamma mixed regression models for continuous bounded data.
- Author
-
Petterle, Ricardo R., Taconeli, César A., da Silva, José L. P., da Silva, Guilherme P., Laureano, Henrique A., and Bonat, Wagner H.
- Subjects
- *
REGRESSION analysis , *DAM failures , *PERCENTILES , *AUTOMATIC differentiation , *MAXIMUM likelihood statistics , *DERIVATIVES (Mathematics) , *WATER quality - Abstract
We propose the unit gamma mixed regression model to deal with continuous bounded variables in the context of repeated measures and clustered data. The proposed model is based on the class of generalized linear mixed models and parameter estimates are obtained by the maximum likelihood method. The computational implementation combines automatic differentiation and the Laplace approximation (via Template Model Builder/C++) to compute the derivatives of the log-likelihood function with respect to fixed and random effects parameters. We carry out extensive simulations to check the computational implementation and to verify the properties of the maximum likelihood estimators. Our results suggest that the proposed maximum likelihood approach provides unbiased and consistent estimators for all model parameters. The proposed model was motivated by two data sets. The first concerns body fat percentage, where the goal was to investigate the effect of covariates measured on the same subjects. The second data set refers to water quality index data, where the main interest was to evaluate the effect of dams on the water quality measured on power plant reservoirs. The data sets and R code are provided as supplementary material. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
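The automatic differentiation that Template Model Builder supplies in the abstract above can be sketched in miniature with forward-mode dual numbers, where the second component of a + b·ε carries an exact derivative through the arithmetic. A toy implementation for illustration only (TMB itself differentiates C++ template code and also produces Hessians for the Laplace approximation):

```python
# Minimal forward-mode automatic differentiation via dual numbers.
# Supports + and * only; enough to differentiate polynomials exactly.

class Dual:
    """Number a + b*eps with eps**2 == 0; b carries the derivative."""
    def __init__(self, a, b=0.0):
        self.a, self.b = a, b
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.a + o.a, self.b + o.b)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        # product rule encoded in the dual part
        return Dual(self.a * o.a, self.a * o.b + self.b * o.a)
    __rmul__ = __mul__

def derivative(f, x):
    """Exact derivative of f at x, for f built from + and *."""
    return f(Dual(x, 1.0)).b

# d/dx (3*x*x + 2*x) at x = 2 is 6*x + 2 = 14
slope = derivative(lambda x: 3 * x * x + 2 * x, 2.0)
```

Unlike finite differences, the dual part is exact to machine precision, which is why AD-based Laplace approximations can rely on the resulting gradients and Hessians.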
42. Dynamic logistic state space prediction model for clinical decision making.
- Author
-
Jiang, Jiakun, Yang, Wei, Schnellinger, Erin M., Kimmel, Stephen E., and Guo, Wensheng
- Subjects
- *
DECISION making , *PREDICTION models , *LUNG transplantation , *LOGISTIC regression analysis , *SPLINES , *INFORMATION sharing - Abstract
Prediction modeling for clinical decision making is of great importance and needs to be updated frequently with changes in the patient population and clinical practice. Existing methods are either ad hoc, such as model recalibration, or focus on studying the relationship between predictors and outcome rather than on prediction. In this article, we propose a dynamic logistic state space model to continuously update the parameters whenever new information becomes available. The proposed model allows for both time‐varying and time‐invariant coefficients. The varying coefficients are modeled using smoothing splines to account for their smooth trends over time. The smoothing parameters are objectively chosen by maximum likelihood. The model is updated using batch data accumulated at prespecified time intervals, which allows for better approximation of the underlying binomial density function. In the simulation, we show that the new model has significantly higher prediction accuracy compared to existing methods. We apply the method to predict 1-year survival after lung transplantation using the United Network for Organ Sharing data. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
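The batch-updating idea, refreshing a logistic model whenever a new batch of outcomes accumulates, can be reduced to its simplest possible form: one Newton-Raphson step per batch for a single coefficient. A sketch with invented data; the paper's state-space dynamics, spline-smoothed time-varying coefficients, and likelihood-based smoothing-parameter choice are all omitted.

```python
# One Newton-Raphson refresh of a single-coefficient logistic model
# per batch of new (x, y) observations.  Data are made up.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def newton_update(beta, xs, ys):
    """One Newton step: beta + score / observed information."""
    grad = sum((y - sigmoid(beta * x)) * x for x, y in zip(xs, ys))
    info = sum(sigmoid(beta * x) * (1 - sigmoid(beta * x)) * x * x
               for x in xs)
    return beta + grad / info

beta = 0.0
for batch_x, batch_y in [([1.0, 2.0, -1.0], [1, 1, 0]),
                         ([0.5, -2.0, 1.5], [1, 0, 1])]:
    beta = newton_update(beta, batch_x, batch_y)
```

Each batch nudges the coefficient toward the current maximum-likelihood value, so predictions adapt as the population drifts; a full state-space treatment would additionally carry forward uncertainty about beta between batches.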
43. Extremal characteristics of conditional models.
- Author
-
Tendijck, Stan, Tawn, Jonathan, and Jonathan, Philip
- Subjects
OCEAN engineering - Abstract
Conditionally specified models are often used to describe complex multivariate data. Such models assume implicit structures on the extremes. So far, no methodology exists for calculating extremal characteristics of conditional models since the copula and marginals are not expressed in closed forms. We consider bivariate conditional models that specify the distribution of X and the distribution of Y conditional on X. We provide tools to quantify implicit assumptions on the extremes of this class of models. In particular, these tools allow us to approximate the distribution of the tail of Y and the coefficient of asymptotic independence η in closed forms. We apply these methods to a widely used conditional model for wave height and wave period. Moreover, we introduce a new condition on the parameter space for the conditional extremes model of Heffernan and Tawn (Journal of the Royal Statistical Society: Series B (Methodology) 66(3), 497-547, 2004), and prove that the conditional extremes model does not capture η when η < 1. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
44. MM ALGORITHMS FOR VARIANCE COMPONENT ESTIMATION AND SELECTION IN LOGISTIC LINEAR MIXED MODEL.
- Author
-
Hu, Liuyi, Lu, Wenbin, Zhou, Jin, and Zhou, Hua
- Subjects
Mathematical Sciences ,Statistics ,Generalized linear mixed model ,Laplace approximation ,MM algorithm ,variance components selection ,Other Mathematical Sciences ,Artificial Intelligence and Image Processing ,Statistics & Probability - Abstract
Logistic linear mixed models are widely used in experimental designs and genetic analyses of binary traits. Motivated by modern applications, we consider the case of many groups of random effects, where each group corresponds to a variance component. When the number of variance components is large, fitting a logistic linear mixed model is challenging. Thus, we develop two efficient and stable minorization-maximization (MM) algorithms for estimating variance components based on a Laplace approximation of the logistic model. One of these leads to a simple iterative soft-thresholding algorithm for variance component selection using the maximum penalized approximated likelihood. We demonstrate the variance component estimation and selection performance of our algorithms by means of simulation studies and an analysis of real data.
- Published
- 2019
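The iterative soft-thresholding algorithm mentioned in the abstract is built around the soft-thresholding operator, which sets small estimates exactly to zero (selection) and shrinks larger ones toward zero (estimation). A standalone sketch with an illustrative threshold value, not the paper's penalized approximated-likelihood algorithm:

```python
# Soft-thresholding operator S(x, lam) = sign(x) * max(|x| - lam, 0),
# applied coordinate-wise to a vector of toy estimates.

def soft_threshold(x, lam):
    """Shrink x toward zero by lam; values within [-lam, lam] become 0."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

# small estimates are zeroed out, large ones are shrunk
estimates = [2.5, 0.3, -1.1, 0.05]
selected = [soft_threshold(v, 0.5) for v in estimates]
```

In a variance-component context the operator is applied to nonnegative components, so the zeroed entries correspond to random-effect groups dropped from the model.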
45. Big data ordination towards intensive care event count cases using fast computing GLLVMS
- Author
-
Rezzy Eko Caraka, Rung-Ching Chen, Su-Wen Huang, Shyue-Yow Chiou, Prana Ugiana Gio, and Bens Pardamean
- Subjects
GLLVM ,Fast Computing ,Laplace Approximation ,Variational approximation ,Ordination ,Medicine (General) ,R5-920 - Abstract
Abstract Background In heart data mining and machine learning, dimension reduction is needed to remove multicollinearity. It has also been shown to improve the interpretability of model parameters and to reduce computing time for high-dimensional data. Methods In this paper, we perform high-dimensional ordination of event counts in hospital intensive care units: Emergency Department (ED 1), First Intensive Care Unit (ICU1), Second Intensive Care Unit (ICU2), Respiratory Care Intensive Care Unit (RICU), Surgical Intensive Care Unit (SICU), Subacute Respiratory Care Unit (RCC), Trauma and Neurosurgery Intensive Care Unit (TNCU), and Neonatal Intensive Care Unit (NICU), using Generalized Linear Latent Variable Models (GLLVMs). Results During the analysis, we measure the performance and computing time of GLLVMs fitted by variational approximation and by Laplace approximation, and compare different response distributions, namely Negative Binomial, Poisson, Gaussian, ZIP, and Tweedie. GLLVMs, an extended version of GLMs (Generalized Linear Models) with latent variables, have fast computing times. The major challenge in latent variable modelling is that the marginal likelihood $$f\left(\varTheta \right)=\int f\left(u,\varTheta \right)h\left(u\right)du$$ is not trivial to evaluate, since it involves integration over the latent variable u. Conclusions GLLVMs give the best performance, reaching 98% of variance explained in comparison with the other methods. The best model is the negative binomial fitted by variational approximation, which provides the best accuracy in terms of AIC, AICc, and BIC; our best fits are GLLVM-VA Negative Binomial with AIC 7144.07 and GLLVM-LA Negative Binomial with AIC 6955.922.
- Published
- 2022
- Full Text
- View/download PDF
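The intractable integral quoted in the abstract is exactly what a Laplace approximation targets: replace the log of the integrand by a quadratic about its mode u*, giving ∫ exp(h(u)) du ≈ exp(h(u*)) · sqrt(2π / −h''(u*)) in one dimension. A toy sketch using crude gradient ascent and finite differences; the GLLVM implementations in the abstract are far more elaborate.

```python
# One-dimensional Laplace approximation of int exp(h(u)) du.
# Mode found by fixed-step gradient ascent; second derivative by
# central finite differences.  A sketch, not production code.
import math

def laplace_integral(h, u0=0.0, step=1e-4, iters=200):
    u = u0
    for _ in range(iters):
        grad = (h(u + step) - h(u - step)) / (2 * step)
        u += 0.1 * grad                      # crude ascent to the mode
    hess = (h(u + step) - 2 * h(u) + h(u - step)) / step**2
    return math.exp(h(u)) * math.sqrt(2 * math.pi / -hess)

# for the Gaussian kernel h(u) = -u^2/2 the approximation is exact:
approx = laplace_integral(lambda u: -0.5 * u ** 2)
exact = math.sqrt(2 * math.pi)
```

The approximation is exact for Gaussian integrands and accurate whenever the integrand is sharply peaked, which is why it scales so well for latent-variable likelihoods.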
46. A double fixed rank kriging approach to spatial regression models with covariate measurement error.
- Author
-
Ning, Xu, Hui, Francis K. C., and Welsh, Alan H.
- Subjects
MEASUREMENT errors ,ERRORS-in-variables models ,KRIGING ,REGRESSION analysis ,BIRD breeding ,BIRD surveys - Abstract
In many applications of spatial regression modeling, the spatially indexed covariates are measured with error, and it is known that ignoring this measurement error can lead to attenuation of the estimated regression coefficients. Classical measurement error techniques may not be appropriate in the spatial setting, due to the lack of validation data and the presence of (residual) spatial correlation among the responses. In this article, we propose a double fixed rank kriging (FRK) approach to obtain bias‐corrected estimates of and inference on coefficients in spatial regression models, where the covariates are spatially indexed and subject to measurement error. Assuming they vary smoothly in space, the proposed method first fits an FRK model regressing the covariates against spatial basis functions to obtain predictions of the error‐free covariates. These are then passed into a second FRK model, where the response is regressed against the predicted covariates plus another set of spatial basis functions to account for spatial correlation. A simulation study and an application to presence–absence records of Carolina wren from the North American Breeding Bird Survey demonstrate that the proposed double FRK approach can be effective in adjusting for measurement error in spatially correlated data. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
47. Fast and universal estimation of latent variable models using extended variational approximations.
- Author
-
Korhonen, Pekka, Hui, Francis K. C., Niku, Jenni, and Taskinen, Sara
- Abstract
Generalized linear latent variable models (GLLVMs) are a class of methods for analyzing multi-response data which has gained considerable popularity in recent years, e.g., in the analysis of multivariate abundance data in ecology. One of the main features of GLLVMs is their capacity to handle a variety of response types, such as (overdispersed) counts, binomial and (semi-)continuous responses, and proportions data. On the other hand, the inclusion of unobserved latent variables poses a major computational challenge, as the resulting marginal likelihood function involves an intractable integral for non-normally distributed responses. This has spurred research into a number of approximation methods to overcome this integral, with a recent and particularly computationally scalable one being that of variational approximations (VA). However, research into the use of VA for GLLVMs has been hampered by the fact that fully closed-form variational lower bounds have only been obtained for certain combinations of response distributions and link functions. In this article, we propose an extended variational approximations (EVA) approach which widens the set of VA-applicable GLLVMs dramatically. EVA draws inspiration from the underlying idea behind the Laplace approximation: by replacing the complete-data likelihood function with its second order Taylor approximation about the mean of the variational distribution, we can obtain a fully closed-form approximation to the marginal likelihood of the GLLVM for any response type and link function.
Through simulation studies and an application to a species community of testate amoebae, we demonstrate how EVA results in a “universal” approach to fitting GLLVMs, which remains competitive in terms of estimation and inferential performance relative to both standard VA (where any intractable integrals are either overcome through reparametrization or quadrature) and a Laplace approximation approach, while being computationally more scalable than both methods in practice. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
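The device behind EVA, replacing the complete-data log-likelihood by its second-order Taylor expansion about the variational mean, can be demonstrated in one dimension on the binomial term log(1 + e^x): once the term is quadratic, Gaussian expectations of it become closed-form. A sketch using finite-difference derivatives, for illustration only (the gllvm software works with analytic expansions in many dimensions):

```python
# Second-order Taylor expansion of f about a point m (the "variational
# mean" in the EVA analogy), with derivatives by finite differences.
import math

def taylor2(f, m, step=1e-4):
    """Return the quadratic Taylor approximation of f about m."""
    f0 = f(m)
    g = (f(m + step) - f(m - step)) / (2 * step)
    h = (f(m + step) - 2 * f0 + f(m - step)) / step**2
    return lambda x: f0 + g * (x - m) + 0.5 * h * (x - m) ** 2

# expand the log-partition term log(1 + exp(x)) about m = 0
quad = taylor2(lambda x: math.log1p(math.exp(x)), 0.0)
near = quad(0.1)        # close to the true value near the expansion point
```

The quadratic matches the function closely near the expansion point and degrades further away, which is why the expansion is taken about the variational mean, where most of the posterior mass sits.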
48. Fast, Scalable Approximations to Posterior Distributions in Extended Latent Gaussian Models.
- Author
-
Stringer, Alex, Brown, Patrick, and Stafford, Jamie
- Subjects
- *
MARKOV chain Monte Carlo , *ERRORS-in-variables models , *STAR clusters , *MILKY Way , *STELLAR orbits , *SURVIVAL analysis (Biometry) - Abstract
We define a novel class of additive models, called Extended Latent Gaussian Models, that allow for a wide range of response distributions and flexible relationships between the additive predictor and mean response. The new class covers a broad range of interesting models including multi-resolution spatial processes, partial likelihood-based survival models, and multivariate measurement error models. Because computation of the exact posterior distribution is infeasible, we develop a fast, scalable approximate Bayesian inference methodology for this class based on nested Gaussian, Laplace, and adaptive quadrature approximations. We prove that the error in these approximate posteriors is $o_p(1)$ under standard conditions, and provide numerical evidence suggesting that our method runs faster and scales to larger datasets than methods based on Integrated Nested Laplace Approximations and Markov chain Monte Carlo, with comparable accuracy. We apply the new method to the mapping of malaria incidence rates in continuous space using aggregated data, mapping leukemia survival hazards using a Cox Proportional-Hazards model with a continuously-varying spatial process, and estimating the mass of the Milky Way Galaxy using noisy multivariate measurements of the positions and velocities of star clusters in its orbit. Supplementary materials for this article are available online. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
49. Fast Expectation Propagation for Heteroscedastic, Lasso-Penalized, and Quantile Regression.
- Author
-
Zhou, Jackson, Ormerod, John T., and Grazian, Clara
- Subjects
- *
QUANTILE regression , *HETEROSCEDASTICITY , *BAYESIAN field theory , *MACHINE learning , *BIG data - Abstract
Expectation propagation (EP) is an approximate Bayesian inference (ABI) method which has seen widespread use across machine learning and statistics, owing to its accuracy and speed. However, it is often difficult to apply EP to models with complex likelihoods, where the EP updates do not have a tractable form and need to be calculated using methods such as multivariate numerical quadrature. These methods increase run time and reduce the appeal of EP as a fast approximate method. In this paper, we demonstrate that EP can still be made fast for certain models in this category. We focus on various types of linear regression, for which fast Bayesian inference is becoming increasingly important in the transition to big data. Fast EP updates are achieved through analytic integral reductions in certain moment computations. EP is compared to other ABI methods across simulations and benchmark datasets, and is shown to offer a good balance between accuracy and speed. [ABSTRACT FROM AUTHOR]
- Published
- 2023
50. Thin plate spline model under skew-normal random errors: estimation and diagnostic analysis for spatial data.
- Author
-
Cavieres, Joaquin, Ibacache-Pulgar, German, and Contreras-Reyes, Javier E.
- Subjects
- *
DIAGNOSTIC errors , *AUTOMATIC differentiation , *DATA analysis , *EXPECTATION-maximization algorithms , *HESSIAN matrices , *SPLINES , *LAPLACE distribution - Abstract
The Expectation-Maximization (EM) algorithm is often used for estimation in semiparametric models with non-normal observations. However, the EM algorithm's main disadvantage is its slow convergence rate. In this paper, we propose the Laplace approximation to maximize the marginal likelihood, given a non-linear function assumed as a spline random effect for a skew-normal thin plate spline model. For this, we used automatic differentiation to obtain the derivatives and provide a numerical evaluation of the Hessian matrix. Comparative simulations and applications were carried out to illustrate the performance of the Laplace approximation against the EM algorithm in the spatial setting. We show that the Laplace approximation is an efficient method that has the flexibility to express the log-likelihood of a semiparametric model and yields a fast estimation process for non-normal models. In addition, a local influence analysis was carried out to evaluate the sensitivity of the estimates. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF