Estimating the number of components and detecting outliers using Angle Distribution of Loading Subspaces (ADLS) in PCA analysis.

Authors :
Liu, Y.J.
Tran, T.
Postma, G.
Buydens, L.M.C.
Jansen, J.
Source :
Analytica Chimica Acta. Aug 2018, Vol. 1020, p. 17-29. 13 p.
Publication Year :
2018

Abstract

Principal Component Analysis (PCA) is widely used in analytical chemistry to reduce the dimensionality of a multivariate data set to a few Principal Components (PCs) that summarize the predominant patterns in the data. An accurate estimate of the number of PCs is indispensable to provide meaningful interpretations and extract useful information. We show how existing estimates for the number of PCs may fall short for datasets with considerable coherence, noise or outlier presence. We present here how Angle Distribution of the Loading Subspaces (ADLS) can be used to estimate the number of PCs based on the variability of the loading subspace across bootstrap resamples. Based on comprehensive comparisons with other well-known methods applied to simulated datasets, we show that ADLS (1) may quantify the stability of a PCA model with several numbers of PCs simultaneously; (2) better estimates the appropriate number of PCs when compared with the cross-validation and scree plot methods, specifically for coherent data; and (3) facilitates integrated outlier detection, which we introduce in this manuscript. In addition, we demonstrate how the analysis of different types of real-life spectroscopic datasets may benefit from these advantages of ADLS. [ABSTRACT FROM AUTHOR]
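
The idea summarized in the abstract, judging the number of PCs by how much the loading subspace varies across bootstrap resamples, can be illustrated with a minimal sketch. This is not the authors' ADLS implementation: the abstract does not specify the angle definition, the reference subspace, or the decision rule. The sketch assumes the largest principal angle between the full-data and bootstrap loading subspaces as the stability measure. The function names (`pca_loadings`, `largest_principal_angle`, `bootstrap_subspace_angles`) and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np


def pca_loadings(X):
    """Return all PCA loading vectors of X as columns (after mean-centering)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt.T


def largest_principal_angle(A, B):
    """Largest principal angle (radians) between the column spaces of A and B.

    A and B must have orthonormal columns; the singular values of A.T @ B
    are then the cosines of the principal angles.
    """
    cosines = np.linalg.svd(A.T @ B, compute_uv=False)
    return float(np.arccos(np.clip(cosines.min(), 0.0, 1.0)))


def bootstrap_subspace_angles(X, max_k, n_boot=100, seed=0):
    """Angles between full-data and bootstrap loading subspaces for k = 1..max_k.

    Returns an array of shape (n_boot, max_k). Assumed stability measure,
    not necessarily the one used in the paper.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    V_full = pca_loadings(X)
    angles = np.empty((n_boot, max_k))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        V_boot = pca_loadings(X[idx])
        for k in range(1, max_k + 1):
            angles[b, k - 1] = largest_principal_angle(V_full[:, :k], V_boot[:, :k])
    return angles


# Invented simulated data: 3 true components plus isotropic noise.
rng = np.random.default_rng(42)
n_samples, n_vars, n_true = 100, 20, 3
true_loadings, _ = np.linalg.qr(rng.normal(size=(n_vars, n_true)))
scores = rng.normal(size=(n_samples, n_true)) * np.array([10.0, 6.0, 4.0])
X = scores @ true_loadings.T + 0.5 * rng.normal(size=(n_samples, n_vars))

angles = bootstrap_subspace_angles(X, max_k=5, n_boot=100, seed=1)
median_angles = np.median(angles, axis=0)
for k, angle in enumerate(median_angles, start=1):
    print(f"k={k}: median largest principal angle = {np.degrees(angle):.1f} deg")
```

Under these assumptions, the angle distribution should stay narrow and close to zero while the model contains only structured components, then widen sharply once a noise direction enters the subspace. How that transition is turned into an estimate, and how the same distributions support outlier detection, is described in the full text.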

Details

Language :
English
ISSN :
00032670
Volume :
1020
Database :
Academic Search Index
Journal :
Analytica Chimica Acta
Publication Type :
Academic Journal
Accession number :
129008038
Full Text :
https://doi.org/10.1016/j.aca.2018.03.044