1. Multivariate classification of constrained data: problems and alternatives
- Author
- Roberto Aruga
- Subjects
Multivariate statistics, Trace (linear algebra), Multivariate analysis, Chemistry, Minor (linear algebra), Biochemistry, Analytical Chemistry, Set (abstract data type), Statistics, Principal component analysis, Environmental Chemistry, Compositional data, Raw data, Algorithm, Spectroscopy
- Abstract
The problems relating to multivariate classifications carried out on matrices of constrained data are examined with reference both to row-sum constraints (closed, or compositional data) and to constraints concerning the ratio between variables (radial, or V-shaped data). As regards the use of principal component analysis (PCA) with closed data, the two opposite drawbacks observed previously with raw data and after a log row centering (or Aitchison's transform) are confirmed. It is demonstrated, in particular, that classifications based on raw closed data give too much weight to major variables, while those based on log row centered data to minor and trace variables. In consideration of this, a ‘unified’ procedure is proposed, which simultaneously processes with principal component analysis the two kinds of data above. Such a procedure seems to obviate the cited drawbacks and to give correct classifications. These results have been obtained using both simulated and real data, the latter referring to a set of archaeological glass finds. The problem of the influence of responses below the detection limit on the classifications is also examined, together with some aspects relating to the classification of radial data.
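The transforms named in the abstract can be illustrated with a minimal Python sketch. It shows row closure, log row centering (Aitchison's centered log-ratio), and PCA applied to the raw closed data and the log row centered data together. The abstract does not state how the 'unified' procedure combines the two kinds of data, so the autoscaling and column-wise concatenation below are assumptions made for illustration only, as are all function names and the simulated data.

```python
import numpy as np

def close(X):
    """Rescale each row to sum to 1 (row-sum constraint, i.e. closed data)."""
    X = np.asarray(X, dtype=float)
    return X / X.sum(axis=1, keepdims=True)

def log_row_center(X):
    """Log row centering: log of each value minus the mean log of its row.
    Requires strictly positive values, so zeros (for example, responses
    below the detection limit) must be handled before calling this."""
    logs = np.log(X)
    return logs - logs.mean(axis=1, keepdims=True)

def autoscale(X):
    """Center each column and scale it to unit standard deviation."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def pca_scores(X, n_components=2):
    """PCA scores from the SVD of the column-centered matrix."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

# Simulated compositions: one major, one intermediate, two minor/trace variables.
rng = np.random.default_rng(0)
raw = rng.lognormal(mean=[4.0, 2.0, 0.0, -2.0], sigma=0.3, size=(20, 4))

closed = close(raw)
centered = log_row_center(closed)

# ASSUMPTION: both blocks are autoscaled and placed side by side before PCA.
# This is one plausible reading of "simultaneously processes", not a
# confirmed description of the procedure in the paper.
combined = np.hstack([autoscale(closed), autoscale(centered)])
scores = pca_scores(combined, n_components=2)
```

Autoscaling each block here is a design choice meant to keep either kind of data from dominating through scale alone; other weightings are possible and may match the original procedure more closely.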
- Published
- 2004