1,932 results for "information criteria"
Search Results
2. An exploratory penalized regression to identify combined effects of temporal variables—application to agri-environmental issues.
- Author
-
Fontez, Bénédicte, Loisel, Patrice, Simonneau, Thierry, and Hilgert, Nadine
- Subjects
*CROP quality, *CROPS, *CROP yields, *GRAPE quality, *REGRESSION analysis - Abstract
The development of sensors is opening new avenues in several fields of activity. Concerning agricultural crops, complex combinations of agri-environmental dynamics, such as soil and climate variables, are now commonly recorded. These new kinds of measurements are an opportunity to improve knowledge of the drivers of crop yield and crop quality at harvest. This involves renewing statistical approaches to account for the combined variations of these dynamic variables, here considered as temporal variables. The objective of the paper is to estimate an interpretable model to study the influence of the two combined inputs on a scalar output. A Sparse and Structured Procedure is proposed to Identify Combined Effects of Formatted temporal Predictors, hereafter denoted SpiceFP. The method is based on the transformation of both temporal variables into categorical variables by defining joint modalities, from which a collection of multiple regression models is then derived. The regressors are the frequencies associated with joint class intervals. The class intervals and related regression coefficients are determined using a generalized fused lasso. SpiceFP is a generic and exploratory approach. The simulations we performed show that it is flexible enough to select the non-null or influential modalities of values. A motivating example for grape quality is presented. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
3. A NEW ALGORITHM FOR MODELING ASYMMETRICAL DATA – AN EMPIRICAL STUDY.
- Author
-
Sakthivel, K. M. and Vidhya, G.
- Subjects
*DATA modeling, *STATISTICAL models, *EMPIRICAL research - Abstract
In the current era, it is quite challenging to find symmetric data, as most real-world data are asymmetric, meaning they tend to slant towards one side or the other. These types of data emerge from various fields, including finance, economics, medicine, and reliability. Traditional statistical models often fail to handle such data because most statistical procedures are developed under normality assumptions. Therefore, the usual way of modeling these data results in incorrect predictions or leads to wrong decisions. There is no familiar methodology available in the literature for modeling asymmetric data. Hence, there is a need to address this research gap as an emerging area of research in statistical modeling. In this paper, we propose a new systematic approach called the Model Selection Algorithm for modeling asymmetric data. In this algorithm, we incorporate various statistical tools and provide a guideline for a step-by-step procedure. Further, we apply maximum likelihood estimation for parameter estimation, and model selection criteria based on the Cramér-von Mises, Anderson-Darling, and Kolmogorov-Smirnov tests. We used real-time data to demonstrate the effectiveness of the algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2024
4. PanIC: Consistent information criteria for general model selection problems.
- Author
-
Nguyen, Hien Duy
- Subjects
*PRINCIPAL components analysis, *STATISTICAL learning, *MACHINE learning, *VECTOR analysis, *PANIC, *FINITE mixture models (Statistics) - Abstract
Summary: Model selection is a ubiquitous problem that arises in the application of many statistical and machine learning methods. In the likelihood and related settings, it is typical to use the method of information criteria (ICs) to choose the most parsimonious among competing models by penalizing the likelihood-based objective function. Theorems guaranteeing the consistency of ICs can often be difficult to verify and are often specific and bespoke. We present a set of results that guarantee consistency for a class of ICs, which we call PanIC (from the Greek root 'pan', meaning 'of everything'), with easily verifiable regularity conditions. PanICs are applicable in any loss-based learning problem and are not exclusive to likelihood problems. We illustrate the verification of regularity conditions for model selection problems regarding finite mixture models, least absolute deviation and support vector regression, and principal component analysis, and demonstrate the effectiveness of PanICs for such problems via numerical simulations. Furthermore, we present new sufficient conditions for the consistency of BIC-like estimators and provide comparisons of the BIC with PanIC. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
5. Model selection for mixture hidden Markov models: an application to clickstream data.
- Author
-
Urso, Furio, Abbruzzo, Antonino, Chiodi, Marcello, and Cracolici, Maria Francesca
- Subjects
MONTE Carlo method, MARKOV processes, CORPORATE websites, HOSPITALITY industry, ACQUISITION of data - Abstract
In a clickstream analysis setting, Mixture Hidden Markov Models (MHMMs) can be used to examine categorical sequences, assuming they evolve according to a mixture of latent Markov processes, each related to a different subpopulation. These models involve identifying both the number of subpopulations and the number of hidden states. This study proposes a model selection criterion based on an integrated completed likelihood approach that accounts for both types of latent classes in the model. We implemented a Monte Carlo simulation study to compare the performance of selection criteria. In scenarios characterised by short categorical sequences, our proposed measure outperforms the most commonly used model selection criteria in identifying components and states. The paper presents a case study on clickstream data collected from the website of a company operating in the hospitality industry and modelled by an MHMM selected with the proposed score. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
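The integrated completed likelihood (ICL) idea underlying the criterion in entry 5 can be illustrated on an ordinary Gaussian mixture, where ICL is BIC plus a classification-entropy penalty. A minimal sketch, assuming scikit-learn and synthetic data; the authors' criterion for mixture hidden Markov models is not reproduced here.

```python
# A minimal sketch of an integrated completed likelihood (ICL) score, shown for
# an ordinary Gaussian mixture via scikit-learn (an assumption of this sketch;
# the paper's criterion targets mixture hidden Markov models).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(4, 1, (100, 2))])

def icl(gm, X):
    # ICL = BIC + 2 * classification entropy: poorly separated components are
    # penalized harder than under BIC alone.
    tau = gm.predict_proba(X)
    entropy = -np.sum(tau * np.log(np.clip(tau, 1e-12, None)))
    return gm.bic(X) + 2.0 * entropy

for k in (1, 2, 3, 4):
    gm = GaussianMixture(n_components=k, random_state=0).fit(X)
    print(k, round(icl(gm, X), 1))  # the two-component model should score lowest
```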
6. Lag order selection for long-run variance estimation in econometrics.
- Author
-
Morales, Marco
- Subjects
*GENERALIZED method of moments, *SPECTRAL energy distribution, *DENSITY matrices, *HETEROSCEDASTICITY, *ECONOMETRICS - Abstract
Estimating the long-run variance (LRV) is crucial for several econometric issues. Constructing reliable heteroskedasticity autocorrelation consistent (HAC) variance-covariance matrices and implementing efficient generalized method of moments (GMM) estimation procedures require a consistent LRV estimate. A good VARHAC estimator (HAC matrix with the spectral density at frequency zero constructed using a VAR spectral estimation) requires accurately estimating the sum of autoregressive (AR) coefficients; however, a criterion that minimizes the innovation variance does not necessarily yield the best spectral estimate. This article implements an optimal VARHAC estimator using an alternative information criterion, considering the bias in the sum of the parameters for the AR estimator of the spectral density at frequency zero. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
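The VARHAC construction in entry 6 reduces, in the univariate case, to fitting an AR(p) model and evaluating its spectral density at frequency zero. A minimal sketch under that simplification, with the lag order picked by plain AIC rather than the article's bias-adjusted criterion:

```python
# Univariate sketch of an AR-spectral ("VARHAC"-style) long-run variance
# estimator: LRV = sigma^2 / (1 - sum of AR coefficients)^2. The lag order is
# picked here by plain AIC; the article's bias-adjusted criterion differs.
import numpy as np

def ar_lrv(x, p):
    x = np.asarray(x, float) - np.mean(x)
    n = len(x)
    Y = x[p:]                                                       # regressand
    Z = np.column_stack([x[p - j:n - j] for j in range(1, p + 1)])  # lag matrix
    a, *_ = np.linalg.lstsq(Z, Y, rcond=None)       # OLS AR(p) coefficients
    resid = Y - Z @ a
    sigma2 = resid @ resid / len(Y)                 # innovation variance
    lrv = sigma2 / (1.0 - a.sum()) ** 2             # 2*pi times spectrum at zero
    aic = len(Y) * np.log(sigma2) + 2 * p
    return lrv, aic

rng = np.random.default_rng(1)
x = np.zeros(500)
for t in range(1, 500):
    x[t] = 0.6 * x[t - 1] + rng.normal()            # AR(1); true LRV = 6.25

p_best = min(range(1, 9), key=lambda p: ar_lrv(x, p)[1])
print(p_best, round(ar_lrv(x, p_best)[0], 2))
```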
7. Adaptive detector design in non-homogeneous clutter backgrounds.
- Author
-
ZHU Dongsheng, SU Xiaojing, MA Zhixun, and SHI Yingjie
- Subjects
FALSE alarms, DETECTORS, PRIOR learning
- Published
- 2024
- Full Text
- View/download PDF
8. A hybridized consistent Akaike type information criterion for regression models in the presence of multicollinearity.
- Author
-
Dünder, Emre
- Subjects
*MONTE Carlo method, *AKAIKE information criterion, *REGRESSION analysis, *INFORMATION measurement, *MULTICOLLINEARITY - Abstract
The consistent Akaike information criterion (CAIC) is an adjusted form of the classical AIC, developed by modifying the penalty term. We propose a novel AIC-type criterion, called CAIC(nα). The proposed criterion includes a dynamic parameter for further controlling the penalty. The distinctive feature of CAIC(nα) is that it penalizes the multicollinearity level using information complexity measures. CAIC(nα) requires the α parameter, and a procedure is proposed to estimate α based on the information complexity of the regression model. Monte Carlo simulations and real data examples demonstrate that CAIC(nα) performs better than classical information criteria in the presence of potential multicollinearity problems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. Order estimation for autoregressive models using criteria based on stochastic complexity.
- Author
-
Hamzaoui, Hassania, Moussa, Freedath Djibril, and El Matouat, Abdelaziz
- Subjects
*AUTOREGRESSIVE models, *STOCHASTIC orders, *SAMPLE size (Statistics), *PERFORMANCE theory, *GENERALIZATION - Abstract
In this paper, we are interested in the order estimation of an autoregressive model using the information criterion developed by El Matouat and Hallin (1996), which is based on stochastic complexity. This criterion is a generalization of the Hannan and Quinn criterion and ensures convergence of the model order estimator, but it depends on a parameter that is sensitive to the sample size. In order to select the exact order of the candidate model, we propose a method for identifying the values of this parameter from the sample using the information contained in sub-samples of increasing size. To study the performance of the proposed method in comparison with the usual criteria, we simulated samples from autoregressive models and applied our procedure to them. Simulation results support the relevance of our procedure when compared to the Akaike criterion, the Hannan and Quinn criterion, and the Schwarz criterion. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
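The Hannan-Quinn-type criteria discussed in entry 9 penalize the fitted innovation variance by a log log n term. A sketch of order selection with such a criterion, where the constant c stands in for the paper's sample-size-sensitive parameter (its data-driven choice is not reproduced):

```python
# Sketch of AR order selection with a Hannan-Quinn-type criterion,
# n*log(sigma2) + 2*c*p*log(log(n)). The constant c stands in for the paper's
# sample-size-sensitive parameter; its data-driven choice is not reproduced.
import numpy as np

def ar_sigma2(x, p):
    # Conditional least-squares AR(p) fit; returns the innovation variance.
    x = np.asarray(x, float) - np.mean(x)
    Y = x[p:]
    Z = np.column_stack([x[p - j:len(x) - j] for j in range(1, p + 1)])
    a, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    r = Y - Z @ a
    return r @ r / len(Y)

def hq(x, p, c=1.0):
    n = len(x)
    return n * np.log(ar_sigma2(x, p)) + 2 * c * p * np.log(np.log(n))

rng = np.random.default_rng(2)
x = np.zeros(400)
for t in range(2, 400):
    x[t] = 0.5 * x[t - 1] - 0.3 * x[t - 2] + rng.normal()  # true order is 2
print(min(range(1, 9), key=lambda p: hq(x, p)))
```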
10. A MODIFIED AILAMUJIA DISTRIBUTION: PROPERTIES AND APPLICATION.
- Author
-
John, David Ikwuoche, Okeke, Evelyn Nkiru, and Franklin, Lilian
- Subjects
*DISTRIBUTION (Probability theory), *GENERATING functions - Abstract
This study introduces a modified one-parameter Ailamujia distribution, called the Entropy Transformed Ailamujia Distribution (ETAD), to handle both symmetric and asymmetric lifetime data sets. ETAD properties such as order and reliability statistics, entropy, the moment and moment generating functions, the quantile function, and its variability measures were derived. The maximum likelihood estimation (MLE) method was used to estimate the ETAD parameter, and simulations at different sample sizes showed the MLE to be consistent, efficient, and unbiased. The flexibility of ETAD was shown by fitting it to six different real lifetime data sets and comparing it with seven competing one-parameter distributions. The goodness-of-fit results from the Akaike information criterion, Bayesian information criterion, corrected Akaike information criterion, and Hannan-Quinn information criterion show that ETAD was the best fit among the competing distributions across all six data sets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
11. Is It Sufficient to Select the Optimal Class Number Based Only on Information Criteria in Fixed- and Random-Parameter Latent Class Discrete Choice Modeling Approaches?
- Author
-
Czine, Péter, Balogh, Péter, Blága, Zsanett, Szabó, Zoltán, Szekeres, Réka, Hess, Stephane, and Juhász, Béla
- Subjects
DISCRETE choice models, CONSUMER preferences, COVID-19 vaccines, HETEROGENEITY - Abstract
Heterogeneity in preferences can be addressed through various discrete choice modeling approaches. The random-parameter latent class (RLC) approach offers a desirable alternative for analysts due to its advantageous properties of separating classes with different preferences and capturing the remaining heterogeneity within classes by including random parameters. For latent class specifications, however, more empirical evidence on the optimal number of classes to consider is needed in order to develop a more objective set of criteria. To investigate this question, we tested cases with different class numbers (for both fixed- and random-parameter latent class modeling) by analyzing data from a discrete choice experiment conducted in 2021, which examined preferences regarding COVID-19 vaccines. We compared models using commonly used indicators such as the Bayesian information criterion, and we also took into account a seemingly simple but often overlooked indicator: the ratio of significant parameter estimates. Based on our results, it is not sufficient to decide on the optimal number of classes in latent class modeling based only on information criteria. We considered aspects such as the ratio of significant parameter estimates (it may be worth examining this both between and within specifications to find out which model type and class number has the most balanced ratio); the validity of the coefficients obtained (focusing on whether the conclusions are consistent with our theoretical model); whether including random parameters is justified (finding a balance between the complexity of the model and its information content, i.e., examining when, and to what extent, the introduction of within-class heterogeneity is relevant); and the distributions of MRS calculations (since these often function as a direct measure of preferences, it is necessary to test how consistent the distributions are across specifications with different class numbers; if they are relatively stable in explaining consumer preferences, it is probably worth putting more emphasis on the aspects mentioned above when choosing a model). The results of this research raise further questions that should be addressed by further model testing in the future. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
12. Agricultural Economic Water Productivity Differences across Counties in the Colorado River Basin.
- Author
-
Frisvold, George B. and Atla, Jyothsna
- Subjects
PRODUCTIVITY accounting, FARM size, SPECIALTY crops, AGRICULTURAL laborers, WATERSHEDS - Abstract
This study estimates the relative contribution of different factors to the wide variation in agricultural economic water productivity (EWP) across Colorado River Basin counties. It updates EWP measures for Basin counties using more detailed, localized data for the Colorado River mainstem. Using the Schwarz Bayesian Information Criterion for variable selection, regression analysis and productivity accounting methods identified factors contributing to EWP differences. The EWP was USD 1033 (USD 2023)/acre foot (af) for Lower Basin Counties on the U.S.–Mexico Border, USD 729 (USD 2023)/af for other Lower Basin Counties, and USD 168 (USD 2023)/af for Upper Basin Counties. Adoption rates for improved irrigation technologies showed little inter-county variation and so did not have a statistically significant impact on EWP. Counties with the lowest EWP consumed 25% of the Basin's agricultural water (>2.3 million af) to generate 3% of the Basin's crop revenue. Low populations/remoteness and more irrigated acreage per farm were negatively associated with EWP. Warmer winter temperatures and greater July humidity were positively associated with EWP. When controlling for other factors, being on the Border increased a county's EWP by USD 570 (2023 USD)/af. Border Counties have greater access to labor from Mexico, enabling greater production of high-value, labor-intensive specialty crops. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
13. Variable selection for ordered categorical data in regression analysis: Information criteria vs. lasso
- Author
-
Mototsugu Fukushige
- Subjects
Variable selection, ordered categorical data, information criteria, lasso, Statistics, HA1-4737 - Abstract
Variable selection in regression analysis with ordered categorical variables can be simplified by integrating some categories and introducing transformed dummy variables. This allows for the application of traditional variable selection criteria and lasso estimation. In this study, we compare the consistency of information criteria and lasso estimation through simulation studies and an empirical example. The results show that BIC and the adaptive lasso perform similarly in terms of whether the true number of explanatory variables is selected.
- Published
- 2024
- Full Text
- View/download PDF
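The transformation described in entry 13 can be illustrated with threshold dummies for an ordered regressor, which let a lasso merge adjacent categories by zeroing increments. A sketch with synthetic data; scikit-learn's cross-validated lasso stands in for the paper's BIC/adaptive-lasso comparison.

```python
# Sketch of the dummy-variable idea for an ordered categorical regressor:
# threshold indicators let a lasso merge adjacent categories by zeroing
# increments. Data are synthetic; LassoCV is a stand-in for the paper's
# BIC/adaptive-lasso comparison.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(3)
cat = rng.integers(0, 5, size=300)                       # ordered categories 0..4
D = np.column_stack([(cat >= k).astype(float) for k in range(1, 5)])
y = 2.0 * (cat >= 3) + rng.normal(scale=0.5, size=300)   # jump between 2 and 3 only

fit = LassoCV(cv=5).fit(D, y)
print(np.round(fit.coef_, 2))  # only the third increment should be clearly non-zero
```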
14. A novel bootstrap goodness-of-fit test for normal linear regression models
- Author
-
Koeneman, Scott H. and Cavanaugh, Joseph E.
- Published
- 2024
- Full Text
- View/download PDF
15. A New Type of LASSO Regression Model with Cauchy Noise.
- Author
-
Ghatari, Amir Hossein, Aminghafari, Mina, and Mohammadpour, Adel
- Subjects
*REGRESSION analysis, *REGULARIZATION parameter - Abstract
Many datasets exhibit heavy-tailed behavior, and classical penalized models are not appropriate for them. To address this problem, we propose a penalized regression that handles model selection and outlier issues simultaneously. We provide a LASSO regression for models with Cauchy-distributed noise using the negative log-likelihood loss function. To select the regularization parameter, we define AIC- and BIC-type criteria. We study the distribution of the regression coefficient estimator in simulation experiments. In addition, a simulation study and real data analyses confirm the superiority of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
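The estimator in entry 15 combines a Cauchy negative log-likelihood with an L1 penalty and selects the regularization parameter by an information criterion. A rough sketch of that idea, using a BIC-type score over a small lambda grid and generic numerical optimization; this is not the authors' implementation.

```python
# Rough sketch of lasso-type estimation under Cauchy noise: Cauchy negative
# log-likelihood plus an L1 penalty, with lambda picked by a BIC-type score.
# Generic numerical optimization; not the authors' implementation.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
n, p = 200, 6
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, 0.0, -1.5, 0.0, 0.0, 1.0])
y = X @ beta_true + rng.standard_cauchy(n)

def nll(beta):
    r = y - X @ beta
    return np.sum(np.log(1.0 + r ** 2))        # Cauchy NLL up to constants (scale 1)

def fit(lam):
    return minimize(lambda b: nll(b) + lam * np.abs(b).sum(),
                    np.zeros(p), method="Powell").x  # derivative-free, handles |.|

best = None
for lam in (0.5, 2.0, 8.0, 32.0):
    b = fit(lam)
    k = int(np.sum(np.abs(b) > 1e-2))          # crude count of active coefficients
    score = 2.0 * nll(b) + k * np.log(n)       # BIC-type score for this lambda
    if best is None or score < best[0]:
        best = (score, lam, b)
print(best[1], np.round(best[2], 2))
```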
16. Using cross‐validation methods to select time series models: Promises and pitfalls.
- Author
-
Liu, Siwei and Zhou, Di Jody
- Subjects
*VECTOR autoregression model, *STATISTICAL models, *PREDICTION models, *TIME series analysis - Abstract
Vector autoregressive (VAR) modelling is widely employed in psychology for time series analyses of dynamic processes. However, the typically short time series in psychological studies can lead to overfitting of VAR models, impairing their predictive ability on unseen samples. Cross-validation (CV) methods are commonly recommended for assessing the predictive ability of statistical models. However, it is unclear how the performance of CV is affected by characteristics of time series data and the fitted models. In this simulation study, we examine the ability of two CV methods, namely 10-fold CV and blocked CV, to estimate the prediction errors of three time series models of increasing complexity (person-mean, AR, and VAR), and evaluate how their performance is affected by data characteristics. We then compare these CV methods to the traditional methods using the Akaike (AIC) and Bayesian (BIC) information criteria in their accuracy of selecting the most predictive models. We find that CV methods tend to underestimate prediction errors of simpler models but overestimate prediction errors of VAR models, particularly when the number of observations is small. Nonetheless, CV methods, especially blocked CV, generally outperform the AIC and BIC. We conclude our study with a discussion of the implications of the findings and provide helpful guidelines for practice. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
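Blocked CV, one of the two methods compared in entry 16, holds out contiguous segments so that temporal ordering within folds is preserved. A minimal sketch for an AR(1) forecast model, standing in for the paper's person-mean/AR/VAR comparison:

```python
# Sketch of blocked cross-validation for a time series: contiguous folds keep
# temporal ordering intact, unlike random 10-fold splits. The AR(1) forecast
# model is a placeholder for the paper's person-mean/AR/VAR comparison.
import numpy as np

def blocked_cv_mse(x, n_folds=5):
    edges = np.linspace(0, len(x), n_folds + 1, dtype=int)
    errs = []
    for a, b in zip(edges[:-1], edges[1:]):
        train = np.concatenate([x[:a], x[b:]])  # every block except the held-out one
        phi = (train[1:] @ train[:-1]) / (train[:-1] @ train[:-1])  # AR(1) slope
        test = x[a:b]
        errs.append(np.mean((test[1:] - phi * test[:-1]) ** 2))     # 1-step errors
    return float(np.mean(errs))

rng = np.random.default_rng(5)
x = np.zeros(300)
for t in range(1, 300):
    x[t] = 0.7 * x[t - 1] + rng.normal()
print(round(blocked_cv_mse(x), 3))
```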
17. Prediction with data from designed experimentation.
- Author
-
D'Ottaviano, Fabio
- Subjects
*FORECASTING, *MODEL validation - Abstract
The intent of this study was to understand via simulation how data from designed experimentation for linear models can succeed in the prediction of individual values despite their relatively small size, which renders data splitting for validation purposes nonviable. Another intent was to emphasize why, for a given level of precision, designed experimentation requires far more runs for the prediction of individual values than for its more mundane use of mean prediction, and how this required number of runs can be determined via simulation as a function of the model validation method used. The results showed that prediction with designed data can be successful given its low tendency to overfitting, and that model reduction can be detrimental to prediction, which contrasts with the pursuit of a bias-variance tradeoff with undesigned data. As designed data increasingly resemble undesigned data, whether by containing factors that have zero influence on the response, having high correlation among factor levels, and/or having a small n to p ratio, model reduction becomes increasingly necessary for prediction. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. A simple portmanteau test with data-driven truncation point.
- Author
-
Baragona, Roberto, Battaglia, Francesco, and Cucina, Domenico
- Subjects
*TIME series analysis, *ABSOLUTE value, *AUTOCORRELATION (Statistics), *WHITE noise, *CHI-squared test, *FORECASTING - Abstract
Time series forecasting is an important application of many statistical methods. When it is appropriate to assume that the data may be projected towards the future based on the past history of the dataset, a preliminary examination is usually required to ensure that the data sequence is autocorrelated. This assumption can be the object of a formal test of hypotheses. The most widely used test is the portmanteau test, i.e., a sum of the squared standardized autocorrelations up to an appropriate maximum lag (the truncation point). The choice of the truncation point is not obvious and may be data-driven, exploiting supplementary information such as the largest autocorrelation and the lag at which that maximum is found. In this paper, we propose a portmanteau test with a truncation point equal to the lag of the largest (in absolute value) estimated autocorrelation. Theoretical and simulation-based comparisons of size and power are performed with competing portmanteau tests, and encouraging results are obtained. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
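The statistic proposed in entry 18 is a Ljung-Box-style sum truncated at the lag of the largest absolute autocorrelation. A sketch of the computation; the chi-square p-value below is only illustrative, since the paper derives the appropriate null distribution for the data-driven truncation point.

```python
# Sketch of the data-driven portmanteau statistic: a Ljung-Box-style sum of
# squared autocorrelations truncated at the lag of the largest |r_k|. The
# chi-square p-value is only illustrative; the paper derives the proper null
# distribution for the data-driven truncation point.
import numpy as np
from scipy.stats import chi2

def acf(x, max_lag):
    x = x - x.mean()
    c0 = x @ x
    return np.array([x[k:] @ x[:-k] / c0 for k in range(1, max_lag + 1)])

def portmanteau(x, max_lag=20):
    x = np.asarray(x, float)
    n = len(x)
    r = acf(x, max_lag)
    K = int(np.argmax(np.abs(r))) + 1           # truncation point: lag of max |r|
    Q = n * (n + 2) * np.sum(r[:K] ** 2 / (n - np.arange(1, K + 1)))
    return Q, K, chi2.sf(Q, K)

rng = np.random.default_rng(6)
print(portmanteau(rng.normal(size=200)))        # white noise: no autocorrelation
```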
19. Foundational basis for optimal climate change detection from energy-balance and cointegration models
- Author
-
Cummins, D., Stott, Peter, and Stephenson, David
- Subjects
climate, climate change, detection, attribution, D&A, global warming, optimal fingerprinting, optimal detection, time series, energy balance model, radiative forcing, climate sensitivity, detection and attribution, cointegration, co-integration, spurious regression, autoregressive moving average, ARMA, linear filtering, digital filter, maximum likelihood estimation, Kalman filter, information criteria, latent variables, time series analysis, EBM, stochastic processes, impulse response, climate model, global mean surface temperature, uncertainty quantification, recursive filter, recursive estimation, simple climate model, model calibration, anthropogenic climate change, regression, linear regression, least squares - Abstract
This thesis has critically examined the validity of optimal fingerprinting methods for the detection and attribution (D&A) of climate change trends. The validity is called into question because optimal fingerprinting involves a linear regression of non-stationary time series. Such non-stationary regressions are in general statistically inconsistent, meaning they are liable to produce spurious results. This thesis has investigated, using an idealized linear-response-model framework motivated by energy-balance considerations, whether the standard assumptions of optimal fingerprinting are sufficient to guarantee consistency, and hence whether detected climate trends are likely to be genuine or artefacts of spurious correlation. The principal reasoning tool in the thesis is the linear impulse-response model, familiar to many climatologists when parameterized as an energy-balance model (EBM), a simplified representation of global climate. A rigorous and efficient maximum likelihood method has been developed for estimating parameters of EBMs with any k > 0 number of boxes from CO2-quadrupling general circulation model (GCM) experiments and the method implemented as a free software package. It has been found that a three-box ocean is optimal for emulating the global mean surface temperature (GMST) impulse responses of GCMs in the Coupled Model Intercomparison Project Phase 5 (CMIP5). A new linear-filtering method has also been developed for estimating historical effective radiative forcing (ERF) from time series of GMST. It has been shown that the response of any k-box EBM can be represented as an ARMA(k, k-1) autoregressive moving-average filter and that, by inverting the ARMA filter, time series of surface temperature may be converted into radiative forcing. A comparison with an established method ("ERF_trans"), using historical simulations from HadGEM3-GC31-LL, found that the new method gives an ERF time series that closely matches published results (correlation of 0.83). Applying the new method to historical temperature observations, in combination with HadGEM3, produces evidence of a significant increase in ERF over the historical period, with an estimated forcing in 2018 of 1.45 ± 0.504 watts per square metre. It has been proved, using an idealized linear-response-model framework where forcing is represented as an integrated process, that if standard assumptions hold then the optimal fingerprinting estimator is consistent, and hence robust against spurious regression. Hypothesis tests, conducted using historical GMST observations and simulation output from 13 GCMs of the CMIP6 generation, have produced no evidence that these assumptions are violated in practice. The historical trends in GMST which are detected and attributed using these GCMs are therefore very likely not spurious. Consistency of the fingerprinting estimator was found to depend on "cointegration" between historical observations and GCM output. Detection of such a cointegration for the GMST variable indicates that the least-squares estimator is "superconsistent", with better convergence properties than might previously have been assumed. Furthermore, a new method has been developed for quantifying D&A uncertainty, which exploits the connection between cointegration and error-correction time series models to eliminate the need for pre-industrial control simulations.
- Published
- 2022
21. EXPONENTIATED GENERALIZED RAMOS-LOUZADA DISTRIBUTION WITH PROPERTIES AND APPLICATIONS.
- Author
-
ALTINISIK, Yasin and CANKAYA, Emel
- Subjects
*UNCERTAINTY (Information theory), *GENERATING functions, *HAZARD function (Statistics), *SURVIVAL rate, *ORDER statistics, *GENERALIZATION - Abstract
In this paper, we propose a new generalization of the Ramos-Louzada (RL) distribution based on two additional shape parameters. Along with the genesis of its distributional form, the derivations of the cumulative distribution function (cdf), survival and hazard rate functions, the quantile function (qf), moments, the moment generating function (mgf), Shannon and Rényi entropies, order statistics, and a linear representation of the proposed distribution are inspected. Several estimation methods for the model parameters are discussed through two comprehensive simulation studies conducted to compare its performance against some lifetime distributions. An application to a real dataset is presented to illustrate the potential of this distribution, in line with the simulation studies. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. Regularized Information Loss for Improved Model Selection
- Author
-
Kamalov, Firuz, Moussa, Sherif, Reyes, Jorge Avante, Xhafa, Fatos, Series Editor, Rajakumar, G., editor, Du, Ke-Lin, editor, and Rocha, Álvaro, editor
- Published
- 2023
- Full Text
- View/download PDF
23. Introduction to Generalized Linear Mixed Models
- Author
-
Xia, Yinglin and Sun, Jun
- Published
- 2023
- Full Text
- View/download PDF
24. On the Disagreement of Forecasting Model Selection Criteria
- Author
-
Evangelos Spiliotis, Fotios Petropoulos, and Vassilios Assimakopoulos
- Subjects
model selection ,information criteria ,time series ,exponential smoothing ,M4 competition ,Science (General) ,Q1-390 ,Mathematics ,QA1-939 - Abstract
Forecasters have been using various criteria to select the most appropriate model from a pool of candidate models. This includes measurements on the in-sample accuracy of the models, information criteria, and cross-validation, among others. Although the latter two options are generally preferred due to their ability to tackle overfitting, in univariate time-series forecasting settings, limited work has been conducted to confirm their superiority. In this study, we compared such popular criteria for the case of the exponential smoothing family of models using a large data set of real series. Our results suggest that there is significant disagreement between the suggestions of the examined criteria and that, depending on the approach used, models of different complexity may be favored, with possible negative effects on the forecasting accuracy. Moreover, we find that simple in-sample error measures can effectively select forecasting models, especially when focused on the most recent observations in the series.
- Published
- 2023
- Full Text
- View/download PDF
25. Adaptive information-based methods for determining the co-integration rank in heteroskedastic VAR models.
- Author
-
Peter Boswijk, H., Cavaliere, Giuseppe, De Angelis, Luca, and Taylor, A. M. Robert
- Subjects
*VECTOR autoregression model, *MONTE Carlo method, *ADAPTIVE testing, *COVARIANCE matrices, *TAX penalties - Abstract
Standard methods, such as sequential procedures based on Johansen's (pseudo-)likelihood ratio (PLR) test, for determining the co-integration rank of a vector autoregressive (VAR) system of variables integrated of order one can be significantly affected, even asymptotically, by unconditional heteroskedasticity (non-stationary volatility) in the data. Known solutions to this problem include wild bootstrap implementations of the PLR test or the use of an information criterion, such as the BIC, to select the co-integration rank. Although asymptotically valid in the presence of heteroskedasticity, these methods can display very low finite sample power under some patterns of non-stationary volatility. In particular, they do not exploit potential efficiency gains that could be realized in the presence of non-stationary volatility by using adaptive inference methods. Under the assumption of a known autoregressive lag length, Boswijk and Zu develop adaptive PLR-test-based methods using a non-parametric estimate of the covariance matrix process. It is well known, however, that selecting an incorrect lag length can significantly impact the efficacy of both information criteria and bootstrap PLR tests in determining the co-integration rank in finite samples. We show that adaptive information-criteria-based approaches can be used to estimate the autoregressive lag order for use in connection with bootstrap adaptive PLR tests, or to jointly determine the co-integration rank and the VAR lag length, and that in both cases they are weakly consistent for these parameters in the presence of non-stationary volatility, provided standard conditions hold on the penalty term. Monte Carlo simulations are used to demonstrate the potential gains from using adaptive methods, and an empirical application to the U.S. term structure is provided. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
26. Using information criteria to select smoothing parameters when analyzing survival data with time-varying coefficient hazard models.
- Author
-
Luo, Lingfeng, He, Kevin, Wu, Wenbo, and Taylor, Jeremy MG
- Subjects
*AKAIKE information criterion, *PANCREATIC cancer, *CONFIDENCE intervals, *HAZARDS - Abstract
Analyzing the large-scale survival data from the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) Program may help guide the management of cancer. Detecting and characterizing the time-varying effects of factors collected at the time of diagnosis could reveal important and useful patterns. However, fitting a time-varying effect model by maximizing the partial likelihood with such large-scale survival data is not feasible with most existing software. Moreover, estimating time-varying coefficients using spline-based approaches requires a moderate number of knots, which may lead to unstable estimation and over-fitting issues. To resolve these issues, adding a penalty term greatly aids estimation. The selection of penalty smoothing parameters is difficult in this time-varying setting, as traditional approaches such as the Akaike information criterion do not work, while cross-validation methods carry a heavy computational burden, leading to unstable selections. We propose modified information criteria to determine the smoothing parameter and a parallelized Newton-based algorithm for estimation. We conduct simulations to evaluate the performance of the proposed method. We find that penalization with the smoothing parameter chosen by the modified information criteria is effective at reducing the mean squared error of the estimated time-varying coefficients. Compared to a number of alternatives, we find that the estimates of the variance derived from Bayesian considerations have the best coverage rates of confidence intervals. We apply the method to SEER head-and-neck, colon, prostate, and pancreatic cancer data and detect the time-varying nature of various risk factors. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
27. Information criteria for model selection.
- Author
-
Zhang, Jiawei, Yang, Yuhong, and Ding, Jie
- Subjects
*INFORMATION theory, *COMMON misconceptions, *STATISTICAL learning, *DATA science, *GRAPHICAL modeling (Statistics), *AKAIKE information criterion - Abstract
The rapid development of modeling techniques has brought many opportunities for data-driven discovery and prediction. However, this also leads to the challenge of selecting the most appropriate model for any particular data task. Information criteria, such as the Akaike information criterion (AIC) and Bayesian information criterion (BIC), have been developed as a general class of model selection methods with profound connections to foundational ideas in statistics and information theory. Many perspectives and theoretical justifications have been developed to understand when and how to use information criteria, which often depend on particular data circumstances. This review article revisits information criteria by summarizing their key concepts, evaluation metrics, fundamental properties, interconnections, recent advancements, and common misconceptions to enrich the understanding of model selection in general. This article is categorized under: Data: Types and Structure > Traditional Statistical Data; Statistical Learning and Exploratory Methods of the Data Sciences > Modeling Methods; Statistical and Graphical Methods of Data Analysis > Information Theoretic Methods; Statistical Models > Model Selection. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
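For reference alongside the review in entry 27, the two most common criteria in their standard forms (log_lik is the maximized log-likelihood, k the number of free parameters, n the sample size):

```python
# The two criteria reviewed above, in their standard forms; log_lik is the
# maximized log-likelihood, k the number of free parameters, n the sample size.
import math

def aic(log_lik: float, k: int) -> float:
    return -2.0 * log_lik + 2.0 * k             # Akaike information criterion

def bic(log_lik: float, k: int, n: int) -> float:
    return -2.0 * log_lik + k * math.log(n)     # Bayesian information criterion

# Two hypothetical fitted models on n = 100 observations: BIC penalizes the
# larger model's extra parameters more heavily than AIC does.
print(aic(-120.3, 4), bic(-120.3, 4, 100))
print(aic(-118.9, 7), bic(-118.9, 7, 100))
```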
28. Explaining the service components of libraries and information centers based on the criteria of the information society and the dimensions of globalization: a grounded theory approach.
- Author
-
Mahnaz Mohseni, Saeed Ghaffari, Maryam Salemi, and Mahmoud Moradi
- Abstract
Objective: The purpose of this research was to identify and explain the service components of libraries and information centers based on information society criteria and the dimensions of globalization. Methods: This applied qualitative research was conducted using qualitative content analysis, grounded theory, and interview tools, with purposeful sampling. The research population consisted of 22 experts in library and information science specializing in the information society, globalization, and the evaluation of library services, with whom semi-structured interviews were conducted. Results: The findings of the interviews were presented in the form of axial and open coding in five sections: causal conditions, strategies, contextual conditions, intervening conditions, and consequences. The core codes for the causal conditions of the service components of libraries and information centers include economic factors, the production and organization of knowledge and information, and infrastructural factors. The core codes for strategies include coordination and integration, strategic planning, and analysis of internal and external environments. The core codes for the contextual conditions include legal and economic factors, cultural factors, and communication and interactive factors. The core codes for the intervening or moderating conditions include budget and income factors, socio-cultural diversity, and macro-level management factors. The core codes for the consequences consist of the development and exchange of communication, the development and strengthening of knowledge management, the strengthening of the library's position, the improvement of infrastructure, differentiated library services, and sustainable development; each of these core codes includes several subcategories (open codes). Conclusions: Several factors, including the causal, contextual, and intervening conditions and the strategies adopted, increase or decrease the efficiency of academic libraries. These will lead to positive consequences for the progress and development of these libraries if strengths are reinforced and weaknesses are eliminated. In this regard, libraries are expected to improve their performance according to the criteria set for the information society and, in line with global dimensions, to provide suitable ground for the research and educational activities of professors, students, and researchers. According to the obtained results, it is very important for academic libraries and information centers to provide comprehensive services using modern technologies and in accordance with the standards of the information society and global dimensions. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
29. Using information criteria to select averages in CCE.
- Author
-
Margaritella, Luca and Westerlund, Joakim
- Subjects
PANEL analysis - Abstract
In the interactive effects panel data literature, information criteria are commonly used to consistently determine which of the estimated principal components factors to include. The present paper shows that the same approach can be applied to factors estimated by taking the cross-sectional averages of the observables, as prescribed by the popular common correlated effects (CCE) approach. This should be useful to practitioners because at the moment there is no other theory that justifies the use of information criteria in CCE. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
30. Consistent Model Selection Procedure for Random Coefficient INAR Models.
- Author
-
Yu, Kaizhi and Tao, Tielai
- Subjects
*COMMUNICABLE diseases - Abstract
In the realm of time series data analysis, information criteria constructed on the basis of likelihood functions serve as crucial instruments for determining the appropriate lag order. However, the intricate structure of random coefficient integer-valued time series models, which are founded on thinning operators, complicates the establishment of likelihood functions. Consequently, employing information criteria such as AIC and BIC for model selection becomes problematic. This study introduces an innovative methodology that formulates a penalized criterion by utilizing the estimation equation within conditional least squares estimation, effectively addressing the aforementioned challenge. Initially, the asymptotic properties of the penalized criterion are derived, followed by a numerical simulation study and a comparative analysis. The findings from both theoretical examinations and simulation investigations reveal that this novel approach consistently selects variables under relatively relaxed conditions. Lastly, the applications of this method to infectious disease data and seismic frequency data produce satisfactory outcomes. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
31. Penalized leads-and-lags cointegrating regression: a simulation study and two empirical applications.
- Author
-
Neto, David
- Subjects
CARBON emissions, EMPIRICAL research, LEAST squares, BITCOIN, MARKET sentiment - Abstract
When leads and lags are added to a cointegrating regression to eliminate endogeneity bias, overfitting and multicollinearity problems can arise. For this purpose, we propose a regularized extension of the conventional dynamic ordinary least squares (DOLS) estimator which facilitates lead–lag selection and improves estimate accuracy. Simulation experiments show that the proposed approach outperforms traditional selection procedures, in terms of precision and accuracy. We propose two empirical applications to illustrate the outlined methodology. The first one revisits the effect of media attention on Bitcoin trading volume, which is highly exposed to endogeneity bias due to a two-way causal effect. Our results show that the proposed procedure leads to a lower mean absolute error than when one uses conventional procedures. In a second empirical illustration, we apply the methodology to carbon dioxide emissions forecasting. The case of France is examined. Our estimates show that the penalized leads-and-lags cointegrating regression outperforms DOLS for long horizons. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
32. Model Selection with Missing Data Embedded in Missing-at-Random Data
- Author
-
Keiji Takai and Kenichi Hayashi
- Subjects
information criteria ,missing at random ,missing data ,not missing at random ,Statistics ,HA1-4737 - Abstract
When models are built with missing data, an information criterion is needed to select the best model among the various candidates. Using a conventional information criterion for missing data may lead to the selection of the wrong model when data are not missing at random. Conventional information criteria implicitly assume that any subset of missing-at-random data is also missing at random, and thus the maximum likelihood estimator is assumed to be consistent; that is, it is assumed that the estimator will converge to the true value. However, this assumption may not be practical. In this paper, we develop an information criterion that works even for not-missing-at-random data, so long as the largest missing data set is missing at random. Simulations are performed to show the superiority of the proposed information criterion over conventional criteria.
- Published
- 2023
- Full Text
- View/download PDF
33. Quantifying model selection uncertainty via bootstrapping and Akaike weights.
- Author
-
Rigdon, Edward, Sarstedt, Marko, and Moisescu, Ovidiu‐Ioan
- Subjects
CUSTOMER loyalty, EMPIRICAL research, CONFIDENCE - Abstract
Picking one 'winner' model for researching a certain phenomenon while discarding the rest implies a confidence that may misrepresent the evidence. Multimodel inference allows researchers to more accurately represent their uncertainty about which model is 'best'. But multimodel inference, with Akaike weights—weights reflecting the relative probability of each candidate model—and bootstrapping, can also be used to quantify model selection uncertainty, in the form of empirical variation in parameter estimates across models, while minimizing bias from dubious assumptions. This paper describes this approach. Results from a simulation example and an empirical study on the impact of perceived brand environmental responsibility on customer loyalty illustrate and provide support for our proposed approach. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
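The Akaike weights used in entry 33 convert AIC differences into relative model probabilities. A minimal sketch; the paper's bootstrap layer, which propagates selection uncertainty into parameter estimates, is omitted here.

```python
# Minimal sketch of Akaike weights: AIC differences mapped to relative model
# probabilities. The paper's bootstrap layer is omitted.
import numpy as np

def akaike_weights(aics):
    aics = np.asarray(aics, float)
    delta = aics - aics.min()                   # differences from the best model
    w = np.exp(-0.5 * delta)                    # relative likelihoods of the models
    return w / w.sum()

print(np.round(akaike_weights([102.1, 103.4, 110.0]), 3))
# A model-averaged estimate then weights each model's parameter estimate by w.
```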
34. Bayesian model averaging in longitudinal studies using Bayesian variable selection methods.
- Author
-
Yimer, Belay Birlie, Otava, Martin, Degefa, Teshome, Yewhalaw, Delenasaw, and Shkedy, Ziv
- Subjects
*LONGITUDINAL method, *INFERENTIAL statistics, *ESTIMATION bias, *PANEL analysis, *AKAIKE information criterion, *PARAMETER estimation - Abstract
Parameter estimation is often treated as a post-model-selection problem, i.e., the parameters of interest are estimated based on "the best" model. However, this approach does not take into account that "the best" model was selected from a set of possible models. Ignoring this uncertainty may lead to bias in estimation. In this paper, we present a Bayesian variable selection (BVS) approach for model averaging that addresses this model uncertainty. Although averaging is the preferred approach, BVS can also be used for model selection if the interest is in selecting one model from the set of candidates. The performance of Bayesian variable selection is compared with information-criterion-based model averaging on real longitudinal data and through a simulation study. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
37. Variable selection using a smooth information criterion for distributional regression models.
- Author
-
O’Neill, Meadhbh and Burke, Kevin
- Abstract
Modern variable selection procedures make use of penalization methods to execute simultaneous model selection and estimation. A popular method is the least absolute shrinkage and selection operator, the use of which requires selecting the value of a tuning parameter. This parameter is typically tuned by minimizing the cross-validation error or Bayesian information criterion, but this can be computationally intensive as it involves fitting an array of different models and selecting the best one. In contrast with this standard approach, we have developed a procedure based on the so-called “smooth IC” (SIC) in which the tuning parameter is automatically selected in one step. We also extend this model selection procedure to the distributional regression framework, which is more flexible than classical regression modelling. Distributional regression, also known as multiparameter regression, introduces flexibility by taking account of the effect of covariates through multiple distributional parameters simultaneously, e.g., mean and variance. These models are useful in the context of normal linear regression when the process under study exhibits heteroscedastic behaviour. Reformulating the distributional regression estimation problem in terms of penalized likelihood enables us to take advantage of the close relationship between model selection criteria and penalization. Utilizing the SIC is computationally advantageous, as it obviates the issue of having to choose multiple tuning parameters. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
38. Forecast Selection and Representativeness.
- Author
-
Petropoulos, Fotios and Siemsen, Enno
- Subjects
DECISION making ,FORECASTING ,JUDGMENT (Psychology) ,STATISTICAL models ,BEHAVIORAL economics - Abstract
Effective approaches to forecast model selection are crucial to improve forecast accuracy and to facilitate the use of forecasts for decision-making processes. Information criteria and cross-validation are common approaches to forecast model selection. Both methods compare forecasts with the respective actual realizations. However, no existing selection method assesses out-of-sample forecasts before the actual values become available—a technique used in human judgment in this context. Research in judgmental model selection emphasizes that human judgment can be superior to statistical selection procedures in evaluating the quality of forecasting models. We, therefore, propose a new way of statistical model selection based on these insights from human judgment. Our approach relies on an asynchronous comparison of forecasts and actual values, allowing for an ex ante evaluation of forecasts via representativeness. We test this criterion on numerous time series. Results from our analyses provide evidence that forecast performance can be improved when models are selected based on their representativeness. This paper was accepted by Manel Baucells, behavioral economics and decision analysis. Supplemental Material: The online appendix and data are available at https://doi.org/10.1287/mnsc.2022.4485. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
39. A Simulation Study on Latent Transition Analysis for Examining Profiles and Trajectories in Education: Recommendations for Fit Statistics.
- Author
-
Edelsbrunner, Peter A., Flaig, Maja, and Schneider, Michael
- Subjects
MONTE Carlo method, AKAIKE information criterion, REAL numbers, STATISTICS, SAMPLE size (Statistics) - Abstract
Latent transition analysis is an informative statistical tool for depicting heterogeneity in learning as latent profiles. We present a Monte Carlo simulation study to guide researchers in selecting fit indices for identifying the correct number of profiles. We simulated data representing profiles of learners within a typical pre-, post-, and follow-up design with continuous indicators, varying sample size (N from 50 to 1,000), attrition rate (none/10% per wave), and profile separation (entropy from .73 to .87). Results indicate that the most commonly used fit index, the Bayesian information criterion (BIC), and the consistent Akaike information criterion (CAIC) consistently underestimate the real number of profiles. A combination of the AIC or the AIC3 with the adjusted Bayesian information criterion (aBIC) provides the most precise choice for selecting the number of profiles and is accurate with sample sizes of at least N = 200. The AIC3 excels starting from N = 500. Results were mostly robust toward differing numbers of time points, profiles, indicator variables, and alternative profiles. We provide an online tool for computing these fit indices and discuss implications for research. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
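The fit indices compared in entry 39 differ only in how they penalize the number of free parameters. A reference sketch of their usual "lower is better" forms, assuming the standard formulas (e.g., Sclove's sample-size-adjusted BIC):

```python
# The fit indices compared in the simulation, in their usual "lower is better"
# forms (log-likelihood ll, free parameters k, sample size n); a reference
# sketch assuming the standard formulas, e.g., Sclove's adjusted BIC.
import math

def aic(ll, k, n):  return -2 * ll + 2 * k
def aic3(ll, k, n): return -2 * ll + 3 * k                       # AIC with penalty 3
def bic(ll, k, n):  return -2 * ll + k * math.log(n)
def caic(ll, k, n): return -2 * ll + k * (math.log(n) + 1)       # consistent AIC
def abic(ll, k, n): return -2 * ll + k * math.log((n + 2) / 24)  # adjusted BIC

for f in (aic, aic3, bic, caic, abic):
    print(f.__name__, round(f(-850.0, 12, 200), 1))
```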
40. Improving the Performance and Stability of TIC and ICE.
- Author
-
Ward, Tyler
- Subjects
*AKAIKE information criterion - Abstract
Takeuchi's Information Criterion (TIC) was introduced as a generalization of Akaike's Information Criterion (AIC) in 1976. Though TIC avoids many of AIC's strict requirements and assumptions, it is only rarely used. One reason is that the trace term introduced in TIC is numerically unstable and computationally expensive to compute. An extension of TIC called ICE was published in 2021, which allows this trace term to be used for model fitting (where it was primarily compared to L2 regularization) rather than just model selection. That paper also examined numerically stable and computationally efficient approximations that could be applied to TIC or ICE, but these approximations were only examined on small synthetic models. This paper applies and extends these approximations to larger models on real datasets for both TIC and ICE. This work shows that practical models may use TIC and ICE in a numerically stable way to achieve superior results at a reasonable computational cost. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
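A minimal numerical sketch of the TIC trace term for a univariate Gaussian model, where the score covariance K and the Fisher information J have closed forms. This illustrates the quantity whose instability motivates the paper, not the paper's own approximations.

```python
import numpy as np

def tic_gaussian(x):
    # TIC = -2 log L(theta_hat) + 2 tr(J^{-1} K): J is the per-observation
    # Fisher information, K the covariance of per-observation scores.
    # Under a correctly specified model tr(J^{-1} K) ~ k, recovering AIC.
    n = x.size
    mu, var = x.mean(), x.var()                   # Gaussian MLEs
    loglik = -0.5 * n * (np.log(2 * np.pi * var) + 1)
    s_mu = (x - mu) / var                         # scores w.r.t. mu
    s_var = 0.5 * ((x - mu) ** 2 - var) / var**2  # scores w.r.t. sigma^2
    scores = np.stack([s_mu, s_var], axis=1)
    K = scores.T @ scores / n                     # outer-product estimate
    J = np.array([[1 / var, 0.0],                 # Fisher information
                  [0.0, 0.5 / var**2]])
    return -2 * loglik + 2 * np.trace(np.linalg.solve(J, K))

x = np.random.default_rng(0).normal(size=500)
print(tic_gaussian(x))   # close to AIC = -2 log L + 4 for Gaussian data
```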
41. Evidence of an Absence of Inbreeding Depression in a Wild Population of Weddell Seals (Leptonychotes weddellii).
- Author
-
Powell, John H., Kalinowski, Steven T., Taper, Mark L., Rotella, Jay J., Davis, Corey S., and Garrott, Robert A.
- Subjects
- *
INBREEDING , *BIOLOGICAL fitness , *MENTAL depression , *LABOR time , *INDEPENDENT variables , *MICROSATELLITE repeats - Abstract
Inbreeding depression can reduce the viability of wild populations. Detecting inbreeding depression in the wild is difficult; developing accurate estimates of inbreeding can be time and labor intensive. In this study, we used a two-step modeling procedure to incorporate uncertainty inherent in estimating individual inbreeding coefficients from multilocus genotypes into estimates of inbreeding depression in a population of Weddell seals (Leptonychotes weddellii). The two-step modeling procedure presented in this paper provides a method for estimating the magnitude of a known source of error, which is assumed absent in classic regression models, and incorporating this error into inferences about inbreeding depression. The method is essentially an errors-in-variables regression with non-normal errors in both the dependent and independent variables. These models, therefore, allow for a better evaluation of the uncertainty surrounding the biological importance of inbreeding depression in non-pedigreed wild populations. For this study we genotyped 154 adult female seals from the population in Erebus Bay, Antarctica, at 29 microsatellite loci, 12 of which are novel. We used a statistical evidence approach to inference rather than hypothesis testing because the discovery of both low and high levels of inbreeding are of scientific interest. We found evidence for an absence of inbreeding depression in lifetime reproductive success, adult survival, age at maturity, and the reproductive interval of female seals in this population. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
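The study's two-step procedure accommodates non-normal errors in both variables. As a much simpler illustration of why measurement error in inbreeding estimates matters, here is a sketch of the classical attenuation effect and its moment-based correction; all data are simulated, and the reliability ratio is known only because we simulated it.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 154                                        # matches the study's sample
f_true = rng.beta(1, 20, n)                    # true inbreeding coefficients
f_hat = f_true + rng.normal(0, 0.05, n)        # noisy genotype-based estimates
w = 2.0 - 5.0 * f_true + rng.normal(0, 0.5, n) # hypothetical fitness trait

naive = np.polyfit(f_hat, w, 1)[0]             # attenuated slope
reliability = f_true.var() / f_hat.var()       # known only because simulated
corrected = naive / reliability                # classical moment correction
print(naive, corrected)                        # corrected is closer to -5
```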
42. The Impact of Groundwater Model Parametrization on Calibration Fit and Prediction Accuracy—Assessment in the Form of a Post-Audit at the SLOVNAFT Oil Refinery Site, in Slovakia.
- Author
-
Zatlakovič, Martin, Krčmář, Dávid, Hodasová, Kamila, Sracek, Ondra, Marenčák, Štefan, Durdiaková, Ľubica, and Bugár, Alexander
- Subjects
PETROLEUM refineries , AKAIKE information criterion , GROUNDWATER flow , HYDRAULIC conductivity , GROUNDWATER , WELLHEAD protection , GROUNDWATER recharge - Abstract
The present work focuses on the effect of increasing model complexity on calibration fit and prediction accuracy. Groundwater flow was numerically simulated at a field site with a hydraulic groundwater protection system in operation, with many pumping and observation wells, at the Slovnaft refinery in southwestern Slovakia. The parameters adjusted during calibration included hydraulic conductivity as well as recharge, evapotranspiration, and riverbed conductance. Four model scenarios (V1–V4) were built within the model calibration for the conditions of the year 2008, with increasing complexity mainly in the artificial K-field zonation, which was created and stepwise upgraded based on the distribution of groundwater head residuals. After the models were calibrated, selected descriptive statistics were evaluated together with chosen information criteria. Subsequently, the real predictive accuracy of the individual calibrated scenarios was evaluated for the conditions of the year 2019 in the form of a post-audit. Overall, calibration fit improved with increasing parameterization complexity. However, the Akaike information criterion, corrected Akaike information criterion, and Bayesian information criterion indicated the opposite trend for model predictability. The post-audit of prediction accuracy revealed a significant improvement of the V2, V3, and V4 scenarios over the simplest V1 scenario. Among the V2–V4 scenarios, however, the improvement in prediction accuracy was almost negligible. The effort spent on V3 and V4 parameterization therefore seems disproportionate to the benefit of a marginal gain in prediction accuracy. Groundwater flow path analysis showed that scenarios of similar prediction accuracy can generate very different groundwater pathlines. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
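A sketch of the post-audit logic under Gaussian calibration errors: rank scenarios by a corrected information criterion at calibration time, then compare that ranking with later prediction errors. The residual sums of squares and parameter counts below are invented placeholders, not values from the study.

```python
import numpy as np

def aicc(n, rss, k):
    # Corrected AIC under Gaussian errors, from calibration residuals:
    # AICc = n*log(RSS/n) + 2k + 2k(k+1)/(n - k - 1).
    return n * np.log(rss / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)

# Invented placeholders for four scenarios of growing complexity: as k rises,
# the penalty can outweigh a better fit, flagging over-parameterization.
for name, rss, k in [("V1", 52.0, 6), ("V2", 31.0, 14),
                     ("V3", 24.0, 28), ("V4", 21.0, 40)]:
    print(name, aicc(120, rss, k))
```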
43. Nonlinear split-plot design modeling and analysis of rice varieties yield
- Author
-
I.J. David, O.E. Asiribo, and H.G. Dikko
- Subjects
Chapman-Richards function ,Intrinsically nonlinear ,Field experiments ,Restricted maximum likelihood estimation ,Median adequacy measures ,information criteria ,Science - Abstract
In this research, a class of nonlinear split-plot design models in which the mean function of the split-plot model is not linearizable is presented. This was done by fitting intrinsically nonlinear split-plot design (INSPD) models using the Chapman-Richards function. The fitted model parameters were estimated using estimated generalized least squares (EGLS) based on Gauss-Newton with Taylor series expansion, by minimizing the respective objective functions. The variance components for the whole-plot and subplot random effects were estimated using the restricted maximum likelihood (REML) technique. The adequacy of the fitted INSPD model was tested using four median adequacy measures: the resistant coefficient of determination, the resistant prediction coefficient of determination, the resistant modeling efficiency statistic, and the median square error prediction statistic, all based on the residuals of the fitted models, which are influenced by the parameter estimation technique applied. Akaike's information criterion, the corrected Akaike's information criterion, and the Bayesian information criterion were used to select the best parameter estimation technique. The results showed that the Chapman-Richards SPD model fitted via EGLS-REML is adequate, stable, and reliable for prediction compared to the EGLS-MLE and OLS fitted models.
- Published
- 2023
- Full Text
- View/download PDF
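A minimal sketch of fitting the Chapman-Richards mean function by nonlinear least squares; scipy's Levenberg-Marquardt routine refines Gauss-Newton steps. It ignores the split-plot variance components and uses synthetic data.

```python
import numpy as np
from scipy.optimize import curve_fit

def chapman_richards(t, a, b, c):
    # Chapman-Richards growth function: a * (1 - exp(-b*t))**c
    return a * (1.0 - np.exp(-b * t)) ** c

# Synthetic growth data standing in for plot-level yields.
t = np.linspace(1, 20, 40)
rng = np.random.default_rng(1)
y = chapman_richards(t, 8.0, 0.25, 1.8) + rng.normal(0, 0.3, t.size)

# Nonlinear least squares; scipy's default unbounded method is 'lm',
# a Levenberg-Marquardt refinement of Gauss-Newton.
popt, _ = curve_fit(chapman_richards, t, y, p0=[7.0, 0.2, 1.5])
rss = np.sum((y - chapman_richards(t, *popt)) ** 2)
n, k = t.size, len(popt) + 1                 # +1 for the error variance
print(popt, n * np.log(rss / n) + 2 * k)     # parameters and AIC
```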
44. Determining the number of factors in constrained factor models via Bayesian information criterion.
- Author
-
Xiang, Jingjie, Guo, Gangzheng, and Li, Jiaolong
- Subjects
- *
MONTE Carlo method , *HOME prices , *LEAST squares , *PARSIMONIOUS models - Abstract
This paper estimates the number of factors in constrained and partially constrained factor models (Tsai and Tsay, 2010) based on a constrained Bayesian information criterion (CBIC). Following Bai and Ng (2002), the estimation of the number of factors depends on the tradeoff between good fit and parsimony, so we first derive the convergence rate of the constrained factor estimates in the framework of large cross-sections (N) and large time dimensions (T). Furthermore, we demonstrate that the penalty for overfitting can be a function of N alone, so the BIC form, which does not work for (unconstrained) approximate factor models, consistently estimates the number of factors in constrained factor models. We then conduct Monte Carlo simulations to show that our proposed CBIC has good finite-sample performance and outperforms competing methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
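A rough sketch of the Bai and Ng (2002) style tradeoff for an unconstrained factor model: choose the number of factors k minimizing log residual variance plus k times a penalty. The paper's CBIC differs in using a penalty that depends on N alone; the penalty below is a standard unconstrained choice, shown only to illustrate the mechanics.

```python
import numpy as np

def select_num_factors(X, kmax=8):
    # Choose k minimizing log(V(k)) + k * g(N, T), where V(k) is the mean
    # squared residual after extracting k principal-component factors.
    # g below is a standard unconstrained (Bai-Ng) penalty, not the CBIC.
    T, N = X.shape
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    g = (N + T) / (N * T) * np.log(N * T / (N + T))
    ic = []
    for k in range(1, kmax + 1):
        resid = Xc - U[:, :k] @ np.diag(s[:k]) @ Vt[:k]
        ic.append(np.log(np.mean(resid ** 2)) + k * g)
    return int(np.argmin(ic)) + 1

rng = np.random.default_rng(0)
F, L = rng.normal(size=(200, 3)), rng.normal(size=(3, 100))
print(select_num_factors(F @ L + rng.normal(size=(200, 100))))  # typically 3
```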
45. On the safe use of prior densities for Bayesian model selection.
- Author
-
Llorente, Fernando, Martino, Luca, Curbelo, Ernesto, López‐Santiago, Javier, and Delgado, David
- Subjects
- *
PARAMETER estimation , *BAYESIAN field theory , *DENSITY - Abstract
The application of Bayesian inference for the purpose of model selection is very popular nowadays. In this framework, models are compared through their marginal likelihoods, or their ratios, called Bayes factors. However, marginal likelihoods depend on the prior choice. For model selection, even diffuse priors can actually be very informative, unlike in the parameter estimation problem. Furthermore, when the prior is improper, the marginal likelihood of the corresponding model is undetermined. In this work, we discuss the issue of prior sensitivity of the marginal likelihood and its role in model selection. We also comment on the use of uninformative priors, which are very common choices in practice. Several practical suggestions are given, and many possible solutions proposed in the literature for designing objective priors for model selection are described; some of them also allow the use of improper priors. The connection between the marginal likelihood approach and the well-known information criteria is also presented. We describe the main issues and possible solutions through illustrative numerical examples, and also provide some related code; one of them involves a real-world application to exoplanet detection. This article is categorized under: Statistical Models > Bayesian Models; Statistical Models > Fitting Models; Statistical Models > Model Selection. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
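A small conjugate example of the prior sensitivity discussed here: for Gaussian data with a N(0, tau^2) prior on the mean, the marginal likelihood has a closed form, and widening the prior steadily lowers the evidence even though the likelihood is unchanged. The values below are illustrative.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
x = rng.normal(1.0, 1.0, size=20)   # data from N(1, 1), sigma^2 known
n, sigma2 = x.size, 1.0

# Model: x_i ~ N(theta, sigma2) with prior theta ~ N(0, tau^2).
# Marginally x ~ N(0, sigma2*I + tau^2*J), so the evidence is closed-form.
for tau in [0.1, 1.0, 10.0, 100.0]:
    cov = sigma2 * np.eye(n) + tau**2 * np.ones((n, n))
    print(tau, multivariate_normal(np.zeros(n), cov).logpdf(x))
# The log evidence keeps dropping as tau grows: a "diffuse" prior is far
# from neutral for model comparison, though the likelihood never changes.
```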
46. A STUDY OF THE INFLUENCE OF ECONOMIC FACTORS ON WORLD SILVER PRODUCTION.
- Author
-
KUBESA, Jiří and ČERNÝ, Igor
- Subjects
ECONOMIC impact ,ECONOMETRIC models ,GROSS domestic product ,FACTORS of production ,ELECTRICAL energy ,SILVER - Abstract
Silver is a very important raw material for industry, especially the electrical and energy industries. With the rise of electromobility, its importance will grow, and any shortage of silver on the world market could threaten modern industry. This article examines the influence of economic factors (price, population, gross domestic product (GDP), and cumulative inflation) on silver production and builds econometric models that best express the relationship between production and these economic factors for the period 2000-2020. The influence of economic factors on world silver production is assessed using the coefficient of determination and information criteria. The authors apply four types of regression: linear, exponential, logarithmic, and power. The research shows that, by the coefficient of determination, the exponential functional form of regression fits best. Of the economic factors investigated, price proved unsuitable for building the econometric models, whereas the other factors were suitable. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
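A sketch of comparing functional forms on a synthetic production-versus-GDP series: fit linear and exponential (log-linear) regressions and compare Gaussian-error AICs on the original scale. All numbers are invented for illustration; they are not the study's data.

```python
import numpy as np

rng = np.random.default_rng(3)
gdp = np.linspace(33, 85, 21)                          # illustrative series
prod = 18.0 * np.exp(0.012 * gdp) * rng.lognormal(0, 0.02, 21)

n = prod.size
def aic(rss, k):                                       # Gaussian-error AIC
    return n * np.log(rss / n) + 2 * k

lin = np.polyfit(gdp, prod, 1)                         # y = a + b*x
rss_lin = np.sum((prod - np.polyval(lin, gdp)) ** 2)
expo = np.polyfit(gdp, np.log(prod), 1)                # y = a*exp(b*x)
rss_exp = np.sum((prod - np.exp(np.polyval(expo, gdp))) ** 2)
print(aic(rss_lin, 3), aic(rss_exp, 3))                # smaller is preferred
```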
47. Combination of a global-search method with model selection criteria for the ellipsometric data evaluation of DLC coatings.
- Author
-
Dorywalski, K., Lupicka, O., Grundmann, M., and Sturm, C.
- Subjects
PERMITTIVITY , OPTICAL materials , DIAMOND-like carbon , OPTICAL constants , SEARCH algorithms - Abstract
A method for the evaluation of experimental data from spectroscopic ellipsometry is proposed which combines a global-search optimization algorithm with statistical model selection criteria. The hybrid genetic-gradient search algorithm (HGGA) is applied to find the optical parameters and thickness of a diamond-like carbon (DLC) coating deposited on SW7M stainless steel. The Akaike and Bayesian information criteria are used to evaluate the different dielectric function models. The method is able to find optical model parameters even in the case of limited initial knowledge of the material's optical constants. At the same time, the optimal dielectric function model for describing the material's optical properties can be selected unambiguously from the set of candidate models. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
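A toy version of the hybrid strategy: a global evolutionary search followed by gradient refinement, with a BIC-type score for ranking candidate models. The Lorentzian "model" below stands in for a dielectric function fit; nothing here reproduces the paper's HGGA internals.

```python
import numpy as np
from scipy.optimize import differential_evolution, minimize

# Toy stand-in for an ellipsometric fit: a single Lorentzian feature.
E = np.linspace(1.0, 5.0, 200)                        # photon energy grid
data = 2.0 / ((E - 3.0) ** 2 + 0.1)
data += np.random.default_rng(0).normal(0, 0.05, E.size)

def chi2(p):
    A, E0, G = p
    return np.sum((data - A / ((E - E0) ** 2 + G)) ** 2)

bounds = [(0.1, 10.0), (1.0, 5.0), (0.01, 1.0)]
coarse = differential_evolution(chi2, bounds, seed=0)  # global (genetic-style)
fine = minimize(chi2, coarse.x, method="L-BFGS-B",     # gradient refinement
                bounds=bounds)

n, k = E.size, fine.x.size
print(fine.x, n * np.log(fine.fun / n) + k * np.log(n))  # params and BIC
```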
48. Application of minimum description length criterion to assess the complexity of models in mathematical immunology.
- Author
-
Grebennikov, Dmitry S., Zheltkova, Valerya V., and Bocharov, Gennady A.
- Subjects
- *
MATHEMATICAL models , *ORDINARY differential equations , *AKAIKE information criterion , *FISHER information , *IMMUNOLOGY , *GEOMETRIC modeling - Abstract
Mathematical models in immunology differ enormously in the dimensionality of the state space, the number of parameters, and the parameterizations used to describe the immune processes. The ongoing diversification of the models needs to be complemented by rigorous ways to evaluate their complexity and to select parsimonious ones in relation to the data available or used for their calibration. A broadly applied metric for ranking models in mathematical immunology with respect to their complexity/parsimony is the Akaike information criterion. In the present study, a computational framework is elaborated to characterize the complexity of mathematical models in immunology using a more general approach, namely, the Minimum Description Length criterion. It balances the model's goodness of fit against the dimensionality and geometric complexity of the model. Four representative models of the immune response to acute viral infection, formulated with either ordinary or delay differential equations, are studied. Essential numerical details enabling the assessment and ranking of the viral infection models include: (1) optimization of the likelihood function, (2) computation of the model sensitivity functions, (3) evaluation of the Fisher information matrix, and (4) estimation of multidimensional integrals over the model parameter space. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
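For a model simple enough to work by hand, the Fisher-information (geometric-complexity) term of the MDL criterion is analytic. A sketch for the Bernoulli model, where the integral of the square-root Fisher information equals pi; the data in the example are simulated.

```python
import numpy as np

def mdl_fia_bernoulli(x):
    # Fisher-information approximation of MDL:
    #   -log L(theta_hat) + (k/2) log(n / (2*pi)) + log INT sqrt(det I(t)) dt
    # For Bernoulli, I(t) = 1/(t(1-t)) and the integral equals pi exactly.
    n, k = x.size, 1
    th = np.clip(x.mean(), 1e-9, 1 - 1e-9)
    loglik = np.sum(x * np.log(th) + (1 - x) * np.log(1 - th))
    return -loglik + 0.5 * k * np.log(n / (2 * np.pi)) + np.log(np.pi)

x = (np.random.default_rng(0).random(100) < 0.3).astype(float)
print(mdl_fia_bernoulli(x))
```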
49. Gaussian Mixture Model-Based Clustering of Multivariate Data Using Soft Computing Hybrid Algorithm
- Author
-
Gögebakan, Maruf, Xhafa, Fatos, Series Editor, Hemanth, Jude, editor, Yigit, Tuncay, editor, Patrut, Bogdan, editor, and Angelopoulou, Anastassia, editor
- Published
- 2021
- Full Text
- View/download PDF
50. Model Selection for Explosive Models
- Author
-
Tao, Yubo and Yu, Jun
- Published
- 2020
- Full Text
- View/download PDF