
Agglomerative joint clustering of metabolic data with spike at zero: A Bayesian perspective

Authors :
Vahid Partovi Nia
Mostafa Ghannad-Rezaie
Source :
Biometrical Journal. 58:387-396
Publication Year :
2015
Publisher :
Wiley, 2015.

Abstract

In many biological applications, such as high-dimensional metabolic studies, the data consist of continuous measurements taken on subjects or tissues across multiple attributes or metabolites. The measurements are arranged in a matrix with subjects in rows and attributes in columns. Analyzing such data requires grouping subjects and attributes as a preliminary guide to data modeling. A common approach is to group subjects and attributes separately and build a two-dimensional dendrogram, with one tree on the rows and another on the columns. This simple approach visualizes the grouping through two separate trees, which are difficult to interpret jointly. When a joint grouping of rows and columns is of interest, it is more natural to partition the data matrix directly. We suggest building a dendrogram on the matrix itself, thereby generalizing the two-dimensional dendrogram tree to a three-dimensional forest. The contribution of this research to the statistical analysis of metabolic data is threefold. First, a novel spike-and-slab model in various hierarchies is proposed to identify discriminant rows and columns. Second, an agglomerative approach is suggested to organize joint clusters. Third, a new visualization tool is introduced to display the collection of joint clusters. The new method is motivated by gas chromatography mass spectrometry (GCMS) metabolic data, but it can be applied to other continuous measurements with a spike at zero.
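To make the baseline in the abstract concrete, the following is a minimal sketch of the "common approach" it describes: rows and columns of a data matrix with a spike at zero are clustered separately, giving two independent merge histories (two trees). It is not the joint method proposed in the paper. The function names, the log-normal distribution for the nonzero values, the Euclidean distance, and the average-linkage rule are all assumptions chosen for illustration.

```python
import random


def simulate_spike_at_zero(n_rows, n_cols, p_zero, seed=0):
    """Hypothetical simulator: each entry is exactly 0 with probability
    p_zero (the spike at zero); otherwise it is drawn from a log-normal
    distribution (an assumed stand-in for the continuous part)."""
    rng = random.Random(seed)
    return [
        [0.0 if rng.random() < p_zero else rng.lognormvariate(0.0, 1.0)
         for _ in range(n_cols)]
        for _ in range(n_rows)
    ]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


def agglomerate(vectors):
    """Naive average-linkage agglomerative clustering.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a tuple of original vector indices."""
    clusters = [(i,) for i in range(len(vectors))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between members of the two clusters.
                d = sum(euclidean(vectors[a], vectors[b])
                        for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Subjects in rows, attributes in columns, as in the abstract.
data = simulate_spike_at_zero(n_rows=6, n_cols=4, p_zero=0.4)

# One tree on the rows...
row_merges = agglomerate(data)

# ...and a second, independent tree on the columns (the transposed matrix).
columns = [list(col) for col in zip(*data)]
col_merges = agglomerate(columns)
```

Because the two merge histories are computed independently, nothing links a row cluster to a column cluster, which is the interpretability problem the abstract raises as motivation for partitioning the matrix directly.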

Details

ISSN :
03233847
Volume :
58
Database :
OpenAIRE
Journal :
Biometrical Journal
Accession number :
edsair.doi...........6603ffe725617d950eb41bf09b3b7c6a
Full Text :
https://doi.org/10.1002/bimj.201400110