1. Evaluation of integrative clustering methods for the analysis of multi-omics data
- Author
- Cécile Chauvel, Frédéric Reynier, Alexei Novoloaca, Pierre Veyre, and Jérémie Becker
- Subjects
Proteomics, Computer science, Medical engineering, Bayesian probability, Breast Neoplasms, Engineering and technology, Matrix decomposition, Set (abstract data type), Medical and health sciences, Cluster analysis, Humans, Molecular Biology, Developmental biology, Health sciences, Experimental data, Bayes Theorem, Genomics, Data set, Benchmark (computing), Data mining, Bioinformatics, Unsupervised Machine Learning, Information Systems, Data integration
- Abstract
Recent advances in sequencing, mass spectrometry and cytometry technologies have enabled researchers to collect large-scale omics data from the same set of biological samples. The joint analysis of multiple omics offers the opportunity to uncover coordinated cellular processes acting across different omic layers. In this work, we present a thorough comparison of a selection of recent integrative clustering approaches, including Bayesian approaches (BCC and MDI) and matrix factorization approaches (iCluster, moCluster, JIVE and iNMF). Based on simulations, the methods were evaluated on their sensitivity and their ability to recover both the correct number of clusters and the simulated clustering at the common and data-specific levels. Standard non-integrative approaches were also included to quantify the added value of integrative methods. For most matrix factorization methods and one Bayesian approach (BCC), the shared and specific structures were successfully recovered with high and moderate accuracy, respectively. The opposite behavior was observed for non-integrative approaches, which performed well on specific structures only. Finally, we applied the methods to The Cancer Genome Atlas (TCGA) breast cancer data set to check whether results based on experimental data were consistent with those obtained in the simulations.
- Published
- 2019