1. Overcoming the Spurious Groups Problem in Between-Group PCA.
- Author
Thioulouse, Jean, Renaud, Sabrina, Dufour, Anne-Béatrice, and Dray, Stéphane
- Abstract
Several recent papers have reported problems with between-group Principal Component Analysis (bgPCA). This method inflates the differences between groups and can even display completely artificial differences when none exist, for example when applied to tables of random numbers with many variables (columns) and few individuals (rows). Cross-validation has recently been proposed as a way to circumvent this problem. Here we present tools and several functions of the ade4 package for the R statistical software that compute a bgPCA, test for the presence of statistically significant groups, cross-validate the analysis, and compute associated statistics. We also describe how to use these functions to avoid the spurious groups problem. Several examples, including a real data set and tables of random numbers, are used to validate this approach under various experimental and numerical conditions. The integrated framework of the duality diagram, as implemented in ade4, allows this approach to be extended to multivariate analysis methods beyond principal component analysis, which could prove useful for other types of variables. The R code and the real data table used for the computations and graphs in this paper are available as supplementary material. [ABSTRACT FROM AUTHOR]
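The spurious groups effect described in the abstract can be reproduced in a few lines. The sketch below is not the ade4 code from the paper: it is a plain-Python illustration, and it assumes the simplest case of two groups, where the only between-group axis is the normalized difference of the two group means. All function names (`between_axis`, `score`, `group_gap`) and the table dimensions are hypothetical choices made for this illustration. It compares the apparent gap between groups on that axis with the gap obtained under leave-one-out cross-validation, using a table of pure random numbers.

```python
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def between_axis(rows, labels):
    """Return (center, unit axis) for a two-group between-group analysis.

    With exactly two groups, the group means span a single direction,
    so the axis is the normalized difference of the two group means.
    The center is the grand mean of the rows that were passed in.
    """
    first, second = sorted(set(labels))
    mean_a = mean_vector([r for r, g in zip(rows, labels) if g == first])
    mean_b = mean_vector([r for r, g in zip(rows, labels) if g == second])
    diff = [a - b for a, b in zip(mean_a, mean_b)]
    norm = sum(v * v for v in diff) ** 0.5
    return mean_vector(rows), [v / norm for v in diff]


def score(row, center, axis):
    """Coordinate of one centered individual on the axis."""
    return sum((x - c) * a for x, c, a in zip(row, center, axis))


def group_gap(scores, labels):
    """Difference between the mean scores of the two groups."""
    first, second = sorted(set(labels))
    a = [s for s, g in zip(scores, labels) if g == first]
    b = [s for s, g in zip(scores, labels) if g == second]
    return sum(a) / len(a) - sum(b) / len(b)


# Pure noise: few individuals (20 rows), many variables (500 columns),
# and group labels that carry no information at all.
rng = random.Random(1)
n_per_group, p = 10, 500
labels = ["A"] * n_per_group + ["B"] * n_per_group
rows = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in labels]

# Naive analysis: the axis is estimated from the same rows it displays.
center, axis = between_axis(rows, labels)
naive_scores = [score(r, center, axis) for r in rows]

# Leave-one-out: each individual is projected on an axis estimated
# from all the other individuals.
cv_scores = []
for i, row in enumerate(rows):
    train_rows = rows[:i] + rows[i + 1:]
    train_labels = labels[:i] + labels[i + 1:]
    cv_center, cv_axis = between_axis(train_rows, train_labels)
    cv_scores.append(score(row, cv_center, cv_axis))

naive_gap = group_gap(naive_scores, labels)
cv_gap = group_gap(cv_scores, labels)
print(f"gap without cross-validation: {naive_gap:.2f}")
print(f"gap with leave-one-out:       {cv_gap:.2f}")
```

In this setting the naive gap should be large even though the groups are meaningless, while the cross-validated gap should stay close to zero. With more than two groups the between-group axes come from a PCA of the table of group means, which this sketch does not cover.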
- Published
- 2021