Taxicab Correspondence Analysis of Sparse Contingency Tables

Authors :
Choulakian, Vartan
Publication Year :
2015

Abstract

Visualization and interpretation of contingency tables by correspondence analysis (CA), as developed by Benzecri, has a rich structure based on Euclidean geometry. However, it is well established that CA is often very sensitive to sparse contingency tables, where we characterize sparsity as the existence of relatively high-valued counts, rare observations discussed by Rao (1995), and zero-block structure emphasized by Novak and Bar-Hen (2005) and Greenacre (2013). In this paper, we aim to emphasize the important roles played by L1 and L2 geometries. This is done by comparing the maps obtained by CA with the maps obtained by taxicab correspondence analysis (TCA), a robust L1 variant of correspondence analysis. If the views given by the two maps are quite different, we refer to this phenomenon as parallax. In astronomy, parallax means the apparent change in the position of an object as seen from two different points. In our case, the two points correspond to the two geometries, Euclidean and taxicab. The existence of parallax highlights the important but hidden role of the underlying geometry in the interpretation of maps obtained in multivariate data analysis. We emphasize the following fact: only by comparing CA and TCA graphical displays are we able to reveal the phenomenon of parallax. Examples are provided.

Comment: 23 pages, 4 figures

Subjects

Subjects :
Statistics - Applications
62H25

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.1508.06885
Document Type :
Working Paper