Automatic aggregation of categories in multivariate contingency tables using information theory

Authors :
A. Gallego Segador
J.M. Caridad y Ocerín
R. Espejo Mohedano
Source :
Computational Statistics & Data Analysis. 29:285-294
Publication Year :
1999
Publisher :
Elsevier BV, 1999.

Abstract

This paper addresses low expected frequencies in the tests used when building log-linear models, with the aim of providing a methodology that nonstatisticians can use to analyse multivariate contingency tables. It presents a procedure that reproduces the decisions of a statistical analyst who is studying a multivariate contingency table and is confronted with low expected frequencies: the Bayesian information criterion selects the variable over which aggregation should be done, and Shannon entropy decides which categories should be aggregated. Prior opinions and knowledge about the feasibility of aggregating categories, within the context where the data were collected, are included in the system. The procedure offers some user-friendly techniques oriented to nonstatisticians, and it allows greater efficiency when several multivariate tables must be analysed using variables that can be included in different log-linear models.
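
The abstract does not state the exact form of the entropy rule, so the following is only a minimal sketch of how Shannon entropy could guide a merge, not the authors' procedure. It assumes, purely for illustration, that the categories are ordered, that only adjacent categories may be merged, and that the preferred merge is the one that leaves the entropy of the marginal distribution highest (loses the least information). The function names `shannon_entropy` and `best_adjacent_merge` are hypothetical, and the variable-selection step based on the Bayesian information criterion is not covered.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (in nats) of the distribution given by raw counts."""
    total = sum(counts)
    # Zero counts contribute nothing, following the convention 0 * log(0) = 0.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def best_adjacent_merge(counts):
    """Assumed rule: merge the adjacent pair that keeps entropy highest.

    Returns (index of the left category in the pair, merged counts).
    """
    best = None
    for i in range(len(counts) - 1):
        merged = counts[:i] + [counts[i] + counts[i + 1]] + counts[i + 2:]
        h = shannon_entropy(merged)
        if best is None or h > best[0]:
            best = (h, i, merged)
    return best[1], best[2]


# Hypothetical marginal counts for one variable with two sparse categories.
index, merged_counts = best_adjacent_merge([50, 3, 2, 45])
print(index, merged_counts)
```

Under these assumptions the two sparse middle categories are combined, because merging them removes the least entropy from the marginal distribution. A fuller treatment would also have to encode the prior knowledge about which aggregations are feasible, which the abstract mentions but does not specify.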

Details

ISSN :
0167-9473
Volume :
29
Database :
OpenAIRE
Journal :
Computational Statistics & Data Analysis
Accession number :
edsair.doi...........9d238ed2a24f8884123c66d795e7f52e