Search Results: 16 results for "R software"
2. Bayesian Information Criterion for Fitting the Optimum Order of Markov Chain Models: Methodology and Application to Air Pollution Data.
- Authors: Alyousifi, Yousif; Ibrahim, Kamarulzaman; Othman, Mahmod; Zin, Wan Zawiah Wan; Vergne, Nicolas; and Al-Yaari, Abdullah
- Subjects: Air quality indexes; Markov processes; Air quality standards; Akaike information criterion; Air analysis; Air quality; Stochastic processes; Air pollution
- Abstract:
The analysis of air pollution behavior is becoming crucial, as information on this behavior is vital for managing air quality events. Many studies have described the stochastic behavior of air pollution based on Markov chain (MC) models. Fitting the optimum order of MC models is essential for describing the stochastic process. However, uncertainty remains concerning the optimum order of such models for representing and characterizing air pollution index (API) data. In this study, the optimum order of the MC models for hourly and daily API sequences from seven stations in the central region of Peninsular Malaysia is identified, based on the Bayesian information criterion (BIC), contributing to an adequate explanation of the probabilistic dependence of air pollution. Summary statistics for the API were calculated prior to the analysis. The Markov property and the divergence for the empirically estimated transition matrix of an MC sequence are also investigated. The analysis shows that the optimum order varies from one station to another. At most stations, for both observed and simulated API data, the second and third orders of the MC models are found to be optimum for hourly API occurrences, while the first-order MC is found to be most fitting for describing the dynamics of the daily API. Overall, fitting the optimum order of the MC model for the API data sequence captured the delay effect of air pollution. Accordingly, we conclude that the air quality standard lies within controllable limits, except for some infrequent occurrences of API values exceeding the unhealthy level. [ABSTRACT FROM AUTHOR]
- Published: 2022
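The order-selection idea in this abstract (fit Markov chains of increasing order and keep the one minimizing the BIC) can be sketched in a few lines. This is a generic illustration under standard definitions, with the maximized log-likelihood built from transition counts and m^k(m-1) free parameters for order k over an m-letter alphabet; it is not the authors' procedure:

```python
# Hedged sketch: selecting the order of a Markov chain by BIC,
# in the spirit of the approach described in the abstract.
import math
from collections import Counter

def mc_log_likelihood(seq, k):
    """Maximized log-likelihood of an order-k Markov chain fit to seq."""
    ctx = Counter()      # counts of k-length contexts
    trans = Counter()    # counts of (context, next-symbol) pairs
    for i in range(k, len(seq)):
        c = tuple(seq[i - k:i])
        ctx[c] += 1
        trans[(c, seq[i])] += 1
    return sum(n * math.log(n / ctx[c]) for (c, _), n in trans.items())

def bic_order(seq, max_order=3):
    """Return the order in 1..max_order minimizing BIC = -2 logL + p log N."""
    m = len(set(seq))                  # alphabet size
    best = None
    for k in range(1, max_order + 1):
        n_obs = len(seq) - k           # number of modeled transitions
        p = (m ** k) * (m - 1)         # free transition probabilities
        bic = -2 * mc_log_likelihood(seq, k) + p * math.log(n_obs)
        if best is None or bic < best[1]:
            best = (k, bic)
    return best[0]
```

For an alternating 0/1 sequence, for example, the first order already explains every transition, so higher orders only add parameters and the BIC selects order 1.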
3. Vasicek Quantile and Mean Regression Models for Bounded Data: New Formulation, Mathematical Derivations, and Numerical Applications.
- Authors: Mazucheli, Josmar; Alves, Bruna; Korkmaz, Mustafa Ç.; and Leiva, Víctor
- Subjects: Quantile regression; Regression analysis; Monte Carlo method; Maximum likelihood statistics; Beta distribution; Data modeling; Weibull distribution
- Abstract:
The Vasicek distribution is a two-parameter probability model with bounded support on the open unit interval. This distribution allows for different and flexible shapes and plays an important role in many statistical applications, especially for modeling default rates in the field of finance. Although its probability density function resembles some well-known distributions, such as the beta and Kumaraswamy models, the Vasicek distribution has not been considered for analyzing data on the unit interval, especially when we have, in addition to a response variable, one or more covariates. In this paper, we propose to estimate quantiles or means, conditional on covariates, assuming that the response variable is Vasicek distributed. Through appropriate link functions, two Vasicek regression models for data on the unit interval are formulated: one considers a quantile parameterization and the other its original parameterization. Monte Carlo simulations are provided to assess the statistical properties of the maximum likelihood estimators, as well as the coverage probability. An R package developed by the authors, named vasicekreg, makes available the results of the present investigation. Applications with two real data sets are conducted for illustrative purposes: in one of them, the unit Vasicek quantile regression outperforms the models based on the Johnson-SB, Kumaraswamy, unit-logistic, and unit-Weibull distributions, whereas in the second one, the unit Vasicek mean regression outperforms the fits obtained by the beta and simplex distributions. Our investigation suggests that unit Vasicek quantile and mean regressions can be of practical usage as alternatives to some well-known models for analyzing data on the unit interval. [ABSTRACT FROM AUTHOR]
- Published: 2022
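For readers unfamiliar with the Vasicek distribution mentioned above, its CDF and quantile function have closed forms in the usual (p, rho) parameterization. The sketch below uses only the Python standard library and illustrates the distribution itself; it is not code from the vasicekreg package:

```python
# Hedged sketch: CDF and quantile function of the Vasicek distribution
# in the standard (p, rho) parameterization.
import math
from statistics import NormalDist

_N = NormalDist()  # standard normal

def vasicek_cdf(y, p, rho):
    """P(Y <= y) for Y ~ Vasicek(p, rho), with 0 < y < 1."""
    z = (math.sqrt(1 - rho) * _N.inv_cdf(y) - _N.inv_cdf(p)) / math.sqrt(rho)
    return _N.cdf(z)

def vasicek_quantile(tau, p, rho):
    """tau-th quantile of Vasicek(p, rho); inverts vasicek_cdf."""
    u = (_N.inv_cdf(p) + math.sqrt(rho) * _N.inv_cdf(tau)) / math.sqrt(1 - rho)
    return _N.cdf(u)
```

A quantile-parameterized regression of the kind the paper formulates would then link this quantile to covariates through an appropriate link function.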
4. A Study on Computational Algorithms in the Estimation of Parameters for a Class of Beta Regression Models.
- Authors: Couri, Lucas; Ospina, Raydonal; Silva, Geiza da; Leiva, Víctor; and Figueroa-Zúñiga, Jorge
- Subjects: Parameter estimation; Monte Carlo method; Regression analysis; Algorithms; Computational statistics; Simulated annealing
- Abstract:
Beta regressions describe the relationship between a response that assumes values in the zero-one range and covariates. These regressions are used for modeling rates, ratios, and proportions. We study computational aspects related to parameter estimation of a class of beta regressions for the mean with fixed precision by maximizing the log-likelihood function with heuristics and other optimization methods. Through Monte Carlo simulations, we analyze the behavior of ten algorithms, four of which present satisfactory results. These are the differential evolutionary, simulated annealing, stochastic ranking evolutionary, and controlled random search algorithms, with the latter having the best performance. Using the four algorithms and the optim function of R, we study sets of parameters that are hard to estimate. We detect that this function fails in most cases, but when it is successful, it is more accurate and faster than the others. The annealing algorithm obtains satisfactory estimates in viable time with few failures, so we recommend its use when the optim function fails. [ABSTRACT FROM AUTHOR]
- Published: 2022
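Simulated annealing, one of the heuristics the study compares, can be illustrated as a generic likelihood maximizer. The sketch below fits a plain beta density rather than a full beta regression, and the cooling schedule and step size are arbitrary choices, not the authors' settings:

```python
# Hedged sketch: simulated annealing maximizing a beta log-likelihood.
import math
import random

def beta_loglik(a, b, data):
    """Log-likelihood of Beta(a, b) for data in (0, 1)."""
    n = len(data)
    return (sum((a - 1) * math.log(x) + (b - 1) * math.log(1 - x) for x in data)
            + n * (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)))

def anneal(data, steps=20000, t0=1.0, seed=1):
    rng = random.Random(seed)
    cur = [1.0, 1.0]                      # start at (alpha, beta) = (1, 1)
    cur_ll = beta_loglik(*cur, data)
    best, best_ll = list(cur), cur_ll
    for i in range(steps):
        t = t0 * (1 - i / steps) + 1e-6   # linear cooling schedule
        cand = [max(1e-3, c + rng.gauss(0, 0.1)) for c in cur]
        ll = beta_loglik(*cand, data)
        # accept uphill moves always, downhill with Metropolis probability
        if ll > cur_ll or rng.random() < math.exp((ll - cur_ll) / t):
            cur, cur_ll = cand, ll
            if ll > best_ll:
                best, best_ll = list(cand), ll
    return best
```

On data simulated from Beta(2, 5), the returned pair should land near the maximum likelihood estimate, which is the behavior the paper's comparison evaluates at scale.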
5. A New Quantile Regression Model and Its Diagnostic Analytics for a Weibull Distributed Response with Applications.
- Authors: Sánchez, Luis; Leiva, Víctor; Saulo, Helton; Marchant, Carolina; and Sarabia, José M.
- Subjects: Quantile regression; Monte Carlo method; Regression analysis; Maximum likelihood statistics; Weibull distribution; Conditioned response
- Abstract:
Standard regression models focus on the mean response based on covariates. Quantile regression describes the quantile of a response conditioned on values of covariates. The relevance of quantile regression is even greater when the response follows an asymmetrical distribution, because the mean is not a good centrality measure for summarizing asymmetrically distributed data. In such a scenario, the median is a better measure of central tendency, and quantile regression, which includes median modeling, is a better alternative for describing asymmetrically distributed data. The Weibull distribution is asymmetrical, has positive support, and has been extensively studied. In this work, we propose a new approach to quantile regression based on the Weibull distribution parameterized by its quantiles. We estimate the model parameters using the maximum likelihood method, discuss their asymptotic properties, and develop hypothesis tests. Two types of residuals are presented to evaluate the model fit to data. We conduct Monte Carlo simulations to assess the performance of the maximum likelihood estimators and residuals. Local influence techniques are also derived to analyze the impact of perturbations on the estimated parameters, allowing us to detect potentially influential observations. We apply the obtained results to a real-world data set to show how helpful this type of quantile regression model is. [ABSTRACT FROM AUTHOR]
- Published: 2021
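The reparameterization underlying such a model is simple to state: the Weibull tau-th quantile is scale * (-log(1 - tau))^(1/shape), so the scale can be rewritten in terms of a chosen quantile. A minimal sketch, illustrative rather than the authors' model code:

```python
# Hedged sketch: expressing the Weibull scale through a chosen quantile,
# the kind of reparameterization on which such quantile regressions rest.
import math

def weibull_quantile(tau, shape, scale):
    """tau-th quantile of Weibull(shape, scale)."""
    return scale * (-math.log(1 - tau)) ** (1 / shape)

def scale_from_quantile(q, tau, shape):
    """Invert: the scale implied by 'the tau-th quantile equals q'."""
    return q / (-math.log(1 - tau)) ** (1 / shape)
```

In the regression itself, q would be linked to covariates (for example, log q = x'beta), so covariate effects act directly on the modeled quantile.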
6. A New Algorithm for Computing Disjoint Orthogonal Components in the Parallel Factor Analysis Model with Simulations and Applications to Real-World Data.
- Authors: Martin-Barreiro, Carlos; Ramirez-Figueroa, John A.; Cabezas, Xavier; Leiva, Victor; Martin-Casado, Ana; and Galindo-Villardón, M. Purificación
- Subjects: Factor analysis; Algorithms; Heuristic algorithms; Simulation methods & models; Parallel algorithms; Matrices (mathematics)
- Abstract:
In this paper, we extend the use of disjoint orthogonal components to three-way table analysis with the parallel factor analysis model. Traditional methods, such as scaling, orthogonality constraints, non-negativity constraints, and sparse techniques, do not guarantee that interpretable loading matrices are obtained in this model. We propose a novel heuristic algorithm that allows simple structure loading matrices to be obtained by calculating disjoint orthogonal components. This algorithm is also an alternative approach for solving the well-known degeneracy problem. We carry out computational experiments by utilizing simulated and real-world data to illustrate the benefits of the proposed algorithm. [ABSTRACT FROM AUTHOR]
- Published: 2021
7. Logistic Biplot by Conjugate Gradient Algorithms and Iterated SVD.
- Authors: Babativa-Márquez, Jose Giovany; and Vicente-Villardón, José Luis
- Subjects: Algorithms; Principal components analysis; Singular value decomposition
- Abstract:
Multivariate binary data are increasingly frequent in practice. Although some adaptations of principal component analysis are used to reduce dimensionality for this kind of data, none of them provide a simultaneous representation of rows and columns (biplot). Recently, a technique named logistic biplot (LB) has been developed to represent the rows and columns of a binary data matrix simultaneously, even though the algorithm used to fit the parameters is too computationally demanding to be useful in the presence of sparsity or when the matrix is large. We propose the fitting of an LB model using nonlinear conjugate gradient (CG) or majorization–minimization (MM) algorithms, and a cross-validation procedure is introduced to select the hyperparameter that represents the number of dimensions in the model. A Monte Carlo study that considers scenarios with several sparsity levels and different dimensions of the binary data set shows that the procedure based on cross-validation is successful in the selection of the model for all algorithms studied. The comparison of the running times shows that the CG algorithm is more efficient in the presence of sparsity and when the matrix is not very large, while the performance of the MM algorithm is better when the binary matrix is balanced or large. As a complement to the proposed methods and to give practical support, a package has been written in the R language called BiplotML. To complete the study, real binary data on gene expression methylation are used to illustrate the proposed methods. [ABSTRACT FROM AUTHOR]
- Published: 2021
8. A New Birnbaum–Saunders Distribution and Its Mathematical Features Applied to Bimodal Real-World Data from Environment and Medicine.
- Authors: Reyes, Jimmy; Arrué, Jaime; Leiva, Víctor; and Martin-Barreiro, Carlos
- Subjects: Monte Carlo method; Algorithms; Distribution (probability theory); Random variables; Maximum likelihood statistics
- Abstract:
In this paper, we propose and derive a Birnbaum–Saunders distribution to model bimodal data. This new distribution is obtained using the product of the standard Birnbaum–Saunders distribution and a polynomial function of the fourth degree. We study the mathematical and statistical properties of the bimodal Birnbaum–Saunders distribution, including probabilistic features and moments. Inference on its parameters is conducted using the estimation methods of moments and maximum likelihood. Based on the acceptance–rejection criterion, an algorithm is proposed to generate values of a random variable that follows the new bimodal Birnbaum–Saunders distribution. We carry out a simulation study using the Monte Carlo method to assess the statistical performance of the parameter estimators. Illustrations with real-world data sets from environmental and medical sciences are provided to show applications that can be of potential use in real problems. [ABSTRACT FROM AUTHOR]
- Published: 2021
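The acceptance-rejection principle the abstract mentions is general: propose from an easy density g, accept with probability f(x) / (M g(x)) where f <= M g everywhere. The sketch below demonstrates it with a bimodal normal mixture as a stand-in target, not the bimodal Birnbaum-Saunders density itself; the envelope constant M = 3 is a conservative bound for this particular pair:

```python
# Hedged sketch: a generic acceptance-rejection sampler for a bimodal
# target (stand-in for the paper's bimodal Birnbaum-Saunders density).
import math
import random

def norm_pdf(x, mu=0.0, sd=1.0):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def target(x):
    """Bimodal stand-in density: equal mixture of N(-2, 1) and N(2, 1)."""
    return 0.5 * norm_pdf(x, -2) + 0.5 * norm_pdf(x, 2)

def ar_sample(n, seed=0):
    """Draw n values from target() using proposal N(0, 3) and envelope M = 3."""
    rng = random.Random(seed)
    M, out = 3.0, []
    while len(out) < n:
        x = rng.gauss(0, 3)                        # proposal draw
        if rng.random() < target(x) / (M * norm_pdf(x, 0, 3)):
            out.append(x)                          # accept
    return out
```

The acceptance rate is roughly 1/M, so a tight envelope matters in practice; the paper's algorithm applies the same criterion to its new density.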
9. Sparse HJ Biplot: A New Methodology via Elastic Net.
- Authors: Cubilla-Montilla, Mitzi; Nieto-Librero, Ana Belén; Galindo-Villardón, M. Purificación; and Torres-Cubilla, Carlos A.
- Subjects: Multivariate analysis; Breast cancer; Petri nets; High-dimensional model representation
- Abstract:
The HJ biplot is a multivariate analysis technique that allows us to represent both individuals and variables in a space of reduced dimensions. To adapt this approach to massive datasets, it is necessary to implement new techniques that are capable of reducing the dimensionality of the data and improving interpretation. Because of this, we propose a modern approach to obtaining the HJ biplot called the elastic net HJ biplot, which applies the elastic net penalty to improve the interpretation of the results. It is a novel algorithm in the sense that it is the first attempt within the biplot family in which regularisation methods are used to obtain modified loadings to optimise the results. As a complement to the proposed method, and to give practical support to it, a package has been developed in the R language called SparseBiplots. This package fills a gap that exists in the context of the HJ biplot through penalized techniques since in addition to the elastic net, it also includes the ridge and lasso to obtain the HJ biplot. To complete the study, a practical comparison is made with the standard HJ biplot and the disjoint biplot, and some results common to these methods are analysed. [ABSTRACT FROM AUTHOR]
- Published: 2021
10. Predicting PM2.5 and PM10 Levels during Critical Episodes Management in Santiago, Chile, with a Bivariate Birnbaum-Saunders Log-Linear Model.
- Authors: Puentes, Rodrigo; Marchant, Carolina; Leiva, Víctor; Figueroa-Zúñiga, Jorge I.; Ruggeri, Fabrizio; and D'Urso, Pierpaolo
- Subjects: Log-linear models; Air quality; Particulate matter; Natural resources; Outlier detection; Prediction models; Forecasting; Bivariate analysis
- Abstract:
Improving air quality is an important environmental challenge of our time. Chile currently has one of the most stable and emerging economies in Latin America, where the human impact on natural resources and air quality does not go unnoticed. Santiago, the capital of Chile, is one of the cities in which particulate matter (PM) levels exceed national and international limits. Its location and climate cause critical conditions for human health when interaction with anthropogenic emissions is present. In this paper, we propose a predictive model based on bivariate regression to estimate PM levels, related to PM2.5 and PM10, simultaneously. Birnbaum-Saunders distributions are used in the joint modeling of real-world PM2.5 and PM10 data, considering as covariates some relevant meteorological variables employed in similar studies. The Mahalanobis distance is utilized to assess bivariate outliers and to check the suitability of the distributional assumption. In addition, we use the local influence technique for analyzing the impact of a perturbation on the overall estimation of model parameters. For the predictions, we check how the observed and predicted cases are categorized according to the primary air quality regulations for PM. [ABSTRACT FROM AUTHOR]
- Published: 2021
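The Mahalanobis-distance screening described above is straightforward in the bivariate case: compute the squared distance of each point from the sample mean under the sample covariance and compare it with a chi-square(2) quantile. A pure-Python sketch, generic and not tied to the paper's data:

```python
# Hedged sketch: squared Mahalanobis distance for bivariate outlier
# screening (2-D case only, with the 2x2 covariance inverted by hand).
def mean(v):
    return sum(v) / len(v)

def mahalanobis2(x, y, xs, ys):
    """Squared Mahalanobis distance of (x, y) from the sample (xs, ys)."""
    mx, my = mean(xs), mean(ys)
    n = len(xs)
    sxx = sum((a - mx) ** 2 for a in xs) / (n - 1)
    syy = sum((b - my) ** 2 for b in ys) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / (n - 1)
    det = sxx * syy - sxy ** 2          # determinant of the 2x2 covariance
    dx, dy = x - mx, y - my
    # quadratic form d' S^{-1} d, written out for the 2x2 inverse
    return (syy * dx ** 2 - 2 * sxy * dx * dy + sxx * dy ** 2) / det
```

Points with squared distance above, say, 5.99 (the 95th percentile of the chi-square distribution with 2 degrees of freedom) would be flagged as potential bivariate outliers.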
11. A New Algorithm for Computing Disjoint Orthogonal Components in the Three-Way Tucker Model.
- Authors: Martin-Barreiro, Carlos; Ramirez-Figueroa, John A.; Nieto-Librero, Ana B.; Leiva, Víctor; Martin-Casado, Ana; and Galindo-Villardón, M. Purificación
- Subjects: Algorithms; Singular value decomposition; Greedy algorithms
- Abstract:
One of the main drawbacks of the traditional methods for computing components in the three-way Tucker model is the complex structure of the final loading matrices, which prevents an easy interpretation of the obtained results. In this paper, we propose a heuristic algorithm for computing disjoint orthogonal components, facilitating the analysis of three-way data and the interpretation of results. In the computational experiments carried out, we observe that our novel algorithm ameliorates this drawback, generating final loading matrices with a simple structure that are therefore easier to interpret. Illustrations with real data are provided to show potential applications of the algorithm. [ABSTRACT FROM AUTHOR]
- Published: 2021
12. Sign, Wilcoxon and Mann-Whitney Tests for Functional Data: An Approach Based on Random Projections.
- Authors: Meléndez, Rafael; Giraldo, Ramón; and Leiva, Víctor
- Subjects: Mann-Whitney U test; Databases; Gaussian processes; Stochastic processes; Functional analysis; Monte Carlo method
- Abstract:
The sign, Wilcoxon, and Mann-Whitney tests are nonparametric methods for one- or two-sample problems. These nonparametric methods are alternatives for testing hypotheses when the standard methods based on the Gaussianity assumption are not suitable. Recently, functional data analysis (FDA) has gained relevance in statistical modeling. In FDA, each observation is a curve or function, usually a realization of a stochastic process. In the FDA literature, several methods have been proposed for testing hypotheses with samples coming from Gaussian processes. However, when this assumption is not realistic, it is necessary to utilize other approaches. Clustering and regression methods, among others, have recently been proposed for non-Gaussian functional data. In this paper, we propose extensions of the sign, Wilcoxon, and Mann-Whitney tests to the functional data context as methods for testing hypotheses with one or two samples of non-Gaussian functional data. We use random projections to transform the functional problem into a scalar one, and then proceed as in the standard case. Based on a simulation study, we show that the proposed tests have a good performance. We illustrate the methodology by applying it to a real data set. [ABSTRACT FROM AUTHOR]
- Published: 2021
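The random-projection recipe is easy to sketch: reduce each curve to a scalar via an inner product with a random direction, then apply a classical test to the scalars. The illustration below pairs it with an exact sign test; details such as the number of projections and how p-values are combined differ from the authors' tests:

```python
# Hedged sketch: one random projection of discretized curves followed by
# an exact two-sided sign test on the projected scores.
import math
import random

def project(curve, direction):
    """Inner product of a discretized curve with a direction vector."""
    return sum(c * d for c, d in zip(curve, direction))

def sign_test_pvalue(scores, median0=0.0):
    """Exact two-sided sign test for H0: median of scores equals median0."""
    signs = [s - median0 for s in scores if s != median0]
    n = len(signs)
    k = sum(1 for s in signs if s > 0)
    tail = min(k, n - k)
    p = sum(math.comb(n, j) for j in range(tail + 1)) / 2 ** (n - 1)
    return min(1.0, p)

def functional_sign_test(curves, seed=0):
    """One-sample functional sign test via a single random projection."""
    rng = random.Random(seed)
    m = len(curves[0])
    direction = [rng.gauss(0, 1) for _ in range(m)]
    return sign_test_pvalue([project(c, direction) for c in curves])
```

A sample of curves all lying on one side of zero yields a tiny p-value, while a sign-balanced sample does not reject, which is the qualitative behavior one expects from a sign test.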
13. Data-Influence Analytics in Predictive Models Applied to Asthma Disease.
- Authors: Tapia, Alejandra; Giampaoli, Viviana; Leiva, Víctor; and Lio, Yuhlong
- Subjects: Prediction models; Asthma; Asthma in children; Public hospitals; Regression analysis; Medical sciences
- Abstract:
Asthma is one of the most common chronic diseases around the world and represents a serious problem for human health. Predictive models have become important in the medical sciences because they provide valuable information for data-driven decision-making. In this work, a methodology of data-influence analytics based on mixed-effects logistic regression models is proposed for detecting potentially influential observations which can affect the quality of these models. Global and local influence diagnostic techniques, which are often used separately, are applied simultaneously in this detection. In addition, predictive performance measures are considered for this analytics. A study with real data on childhood and adolescent asthma, collected from a public hospital in São Paulo, Brazil, is conducted to illustrate the proposed methodology. The results show that the influence diagnostic methodology is helpful for obtaining an accurate predictive model that provides scientific evidence for data-driven medical decision-making. [ABSTRACT FROM AUTHOR]
- Published: 2020
14. Cokriging Prediction Using as Secondary Variable a Functional Random Field with Application in Environmental Pollution.
- Authors: Giraldo, Ramón; Herrera, Luis; and Leiva, Víctor
- Subjects: Forecasting; Pollution; Random variables; Particulate matter; Wind speed; Random fields; Markov random fields
- Abstract:
Cokriging is a geostatistical technique that is used for spatial prediction when realizations of a random field are available. If a secondary variable is cross-correlated with the primary variable, both variables may be employed for prediction by means of cokriging. In this work, we propose a predictive model that is based on cokriging when the secondary variable is functional. As in ordinary cokriging, a co-regionalized linear model is needed in order to estimate the corresponding auto-correlations and cross-correlations. The proposed model is utilized for predicting the environmental pollution of particulate matter when considering wind speed curves as the functional secondary variable. [ABSTRACT FROM AUTHOR]
- Published: 2020
15. Robust Three-Step Regression Based on Comedian and Its Performance in Cell-Wise and Case-Wise Outliers.
- Authors: Velasco, Henry; Laniado, Henry; Toro, Mauricio; Leiva, Víctor; and Lio, Yuhlong
- Subjects: Comedians; Outliers (statistics); Monte Carlo method; Regression analysis
- Abstract:
Both cell-wise and case-wise outliers may appear in a real data set at the same time. Few methods have been developed in order to deal with both types of outliers when formulating a regression model. In this work, a robust estimator is proposed based on a three-step method named 3S-regression, which uses the comedian as a highly robust scatter estimate. An intensive simulation study is conducted in order to evaluate the performance of the proposed comedian 3S-regression estimator in the presence of cell-wise and case-wise outliers. In addition, a comparison of this estimator with recently developed robust methods is carried out. The proposed method is also extended to the model with continuous and dummy covariates. Finally, a real data set is analyzed for illustration in order to show potential applications. [ABSTRACT FROM AUTHOR]
- Published: 2020
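The comedian, the robust scatter estimate named in this abstract, is the median of cross-products of deviations from the coordinatewise medians, a robust analogue of the covariance. A minimal sketch:

```python
# Hedged sketch: the comedian COM(X, Y) = med((X - med X) * (Y - med Y)),
# the robust covariance analogue used as the scatter estimate in the paper.
from statistics import median

def comedian(xs, ys):
    """Median of cross-products of deviations from coordinatewise medians."""
    mx, my = median(xs), median(ys)
    return median((x - mx) * (y - my) for x, y in zip(xs, ys))
```

Unlike the sample covariance, a single extreme pair barely moves the estimate, which is what makes it attractive against both cell-wise and case-wise contamination.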
16. Birnbaum-Saunders Quantile Regression Models with Application to Spatial Data.
- Authors: Sánchez, Luis; Leiva, Víctor; Galea, Manuel; and Saulo, Helton
- Subjects: Quantile regression; Regression analysis; Marginal distributions; Multivariate analysis; Maximum likelihood statistics; Data
- Abstract:
In the present paper, a novel spatial quantile regression model based on the Birnbaum–Saunders distribution is formulated. This distribution has been widely studied and applied in many fields. To formulate such a spatial model, a parameterization of the multivariate Birnbaum–Saunders distribution, where one of its parameters is associated with the quantile of the respective marginal distribution, is established. The model parameters are estimated by the maximum likelihood method. Finally, the model is illustrated with a real data set. [ABSTRACT FROM AUTHOR]
- Published: 2020
Discovery Service for Jio Institute Digital Library