
Estimating the number of components and detecting outliers using Angle Distribution of Loading Subspaces (ADLS) in PCA analysis.

Authors :
Liu YJ
Tran T
Postma G
Buydens LMC
Jansen J
Source :
Analytica chimica acta [Anal Chim Acta] 2018 Aug 22; Vol. 1020, pp. 17-29. Date of Electronic Publication: 2018 Mar 29.
Publication Year :
2018

Abstract

Principal Component Analysis (PCA) is widely used in analytical chemistry to reduce the dimensionality of a multivariate data set into a few Principal Components (PCs) that summarize the predominant patterns in the data. An accurate estimate of the number of PCs is indispensable to provide meaningful interpretations and extract useful information. We show how existing estimates for the number of PCs may fall short for datasets with considerable coherence, noise or outlier presence. We present here how Angle Distribution of the Loading Subspaces (ADLS) can be used to estimate the number of PCs based on the variability of the loading subspace across bootstrap resamples. Based on comprehensive comparisons with other well-known methods applied to simulated datasets, we show that ADLS (1) may quantify the stability of a PCA model with several numbers of PCs simultaneously; (2) better estimate the appropriate number of PCs when compared with the cross-validation and scree plot methods, specifically for coherent data; and (3) facilitate integrated outlier detection, which we introduce in this manuscript. In addition, we demonstrate how the analysis of different types of real-life spectroscopic datasets may benefit from these advantages of ADLS. (Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.)
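
The core idea in the abstract, measuring how much the loading subspace moves across bootstrap resamples, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the angle definition or the decision rule, so the sketch assumes the largest principal angle between the full-data loading subspace and each bootstrap loading subspace. The function names, the simulated data, and all parameter values are hypothetical choices made for illustration only.

```python
import numpy as np

def loading_subspace(X, k):
    """Return the first k PCA loading vectors (as columns) of mean-centered X."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the right singular vectors, i.e. the PCA loadings.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k].T

def largest_principal_angle(A, B):
    """Largest principal angle in degrees between the column spaces of A and B.

    A and B must have orthonormal columns. The singular values of A.T @ B
    are the cosines of the principal angles; the smallest cosine gives the
    largest angle.
    """
    cosines = np.linalg.svd(A.T @ B, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)  # guard against rounding error
    return np.degrees(np.arccos(cosines.min()))

def bootstrap_angles(X, k, n_boot=100, seed=0):
    """Angles between the full-data k-PC subspace and each bootstrap subspace."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    reference = loading_subspace(X, k)
    angles = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        angles[b] = largest_principal_angle(reference, loading_subspace(X[idx], k))
    return angles

# Hypothetical simulated data: 3 true components plus isotropic noise.
rng = np.random.default_rng(42)
n_samples, n_vars, n_true = 100, 50, 3
loadings = np.linalg.qr(rng.normal(size=(n_vars, n_true)))[0]
scores = rng.normal(size=(n_samples, n_true)) * np.array([10.0, 7.0, 5.0])
X = scores @ loadings.T + 0.5 * rng.normal(size=(n_samples, n_vars))

# Median bootstrap angle for each candidate number of PCs.
medians = {k: float(np.median(bootstrap_angles(X, k))) for k in range(1, 6)}
for k, angle in medians.items():
    print(f"k={k}: median largest principal angle = {angle:.1f} degrees")
```

Under these assumptions, the subspace is expected to stay stable (small angles) up to the true number of components and to become unstable (large angles) once a noise direction is included, which is the kind of behaviour an angle-distribution criterion could exploit. The abstract reports that ADLS handles coherent data and outliers; this sketch does not attempt either.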

Details

Language :
English
ISSN :
1873-4324
Volume :
1020
Database :
MEDLINE
Journal :
Analytica chimica acta
Publication Type :
Academic Journal
Accession number :
29655425
Full Text :
https://doi.org/10.1016/j.aca.2018.03.044