Agglomerative joint clustering of metabolic data with spike at zero: A Bayesian perspective
- Source :
- Biometrical Journal. 58:387-396
- Publication Year :
- 2015
- Publisher :
- Wiley, 2015.
Abstract
- In many biological applications, for example high-dimensional metabolic data, the data consist of several continuous measurements of subjects or tissues over multiple attributes or metabolites. Measurement values are arranged in a matrix with subjects in rows and attributes in columns. The analysis of such data requires grouping subjects and attributes to provide a preliminary guide toward data modeling. A common approach is to group subjects and attributes separately and to construct a two-dimensional dendrogram, with one tree on the rows and another on the columns. This simple approach visualizes the grouping through two separate trees, which are difficult to interpret jointly. When a joint grouping of rows and columns is of interest, it is more natural to partition the data matrix directly. Our suggestion is to build a dendrogram on the matrix directly, thus generalizing the two-dimensional dendrogram tree to a three-dimensional forest. The contribution of this research to the statistical analysis of metabolic data is threefold. First, a novel spike-and-slab model in various hierarchies is proposed to identify discriminant rows and columns. Second, an agglomerative approach is suggested to organize joint clusters. Third, a new visualization tool is introduced to display the collection of joint clusters. The new method is motivated by gas chromatography mass spectrometry (GCMS) metabolic data, but can be applied to other continuous measurements with a spike at zero.
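For illustration only, the conventional approach that the abstract contrasts itself with (one tree on the rows, then a separate tree on the columns) can be sketched as follows. This is not the authors' joint method or their spike-and-slab model. The toy matrix, the Euclidean distance, the complete-linkage rule, and all function names (`toy_matrix`, `complete_linkage`, `euclidean`) are assumptions made for this sketch.

```python
import random
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


def complete_linkage(vectors):
    """Naive agglomerative clustering with complete linkage.

    Returns the merge history as a list of
    (members_a, members_b, height) tuples, one per merge.
    """
    clusters = [(i,) for i in range(len(vectors))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Complete linkage: distance between the farthest pair.
            d = max(euclidean(vectors[i], vectors[j]) for i in a for j in b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
        merges.append((a, b, d))
    return merges


def toy_matrix(n_rows=6, n_cols=4, p_zero=0.4, seed=0):
    """Hypothetical data with a spike at zero: each entry is exactly 0
    with probability p_zero, otherwise a positive log-normal draw."""
    rng = random.Random(seed)
    return [
        [0.0 if rng.random() < p_zero else rng.lognormvariate(0.0, 1.0)
         for _ in range(n_cols)]
        for _ in range(n_rows)
    ]


data = toy_matrix()

# Tree 1: cluster subjects (rows).
row_merges = complete_linkage(data)

# Tree 2: cluster attributes (columns) by transposing the matrix.
col_merges = complete_linkage([list(col) for col in zip(*data)])
```

The two merge histories are computed independently of each other, which is the limitation the abstract points out: nothing ties a row cluster to a column cluster, so joint blocks of the matrix have to be read off by eye.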
- Subjects :
- 0301 basic medicine
Statistics and Probability
Dendrogram
General Medicine
computer.software_genre
01 natural sciences
Complete-linkage clustering
Data matrix (multivariate statistics)
Data modeling
Hierarchical clustering
010104 statistics & probability
03 medical and health sciences
Tree (data structure)
030104 developmental biology
Data mining
0101 mathematics
Statistics, Probability and Uncertainty
Cluster analysis
computer
Row
Mathematics
Subjects
Details
- ISSN :
- 03233847
- Volume :
- 58
- Database :
- OpenAIRE
- Journal :
- Biometrical Journal
- Accession number :
- edsair.doi...........6603ffe725617d950eb41bf09b3b7c6a
- Full Text :
- https://doi.org/10.1002/bimj.201400110