A review on machine learning principles for multi-view biological data integration
- Source :
- Briefings in bioinformatics. 19(2)
- Publication Year :
- 2016
Abstract
- Driven by high-throughput sequencing techniques, modern genomic and clinical studies are in strong need of integrative machine learning models that make better use of vast volumes of heterogeneous information, both for a deeper understanding of biological systems and for the development of predictive models. How data from multiple sources (called multi-view data) are incorporated into a learning system is a key step for successful analysis. In this article, we provide a comprehensive review of omics and clinical data integration techniques, from a machine learning perspective, for various analyses such as prediction, clustering, dimension reduction and association. We show that Bayesian models are able to use prior information and model measurements with various distributions; tree-based methods can either build a tree with all features or collectively make a final decision based on trees learned from each view; kernel methods fuse the similarity matrices learned from individual views into a final similarity matrix or learning model; network-based fusion methods are capable of inferring direct and indirect associations in a heterogeneous network; matrix factorization models have the potential to learn interactions among features from different views; and a range of deep neural networks can be integrated in multi-modal learning to capture the complex mechanisms of biological systems.
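The kernel-fusion idea mentioned in the abstract (one similarity matrix per view, combined into a single matrix) can be illustrated with a minimal sketch. This is not the method of the reviewed article: the RBF kernel, the uniform weighted average, the function names `rbf_kernel` and `fuse_kernels`, and the random "expression" and "methylation" views are all assumptions chosen for illustration only.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gaussian (RBF) similarity matrix between the rows (samples) of X."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-gamma * sq_dists)


def fuse_kernels(kernels, weights=None):
    """Weighted average of per-view kernel matrices (uniform weights by default)."""
    kernels = [np.asarray(K, dtype=float) for K in kernels]
    if weights is None:
        weights = np.full(len(kernels), 1.0 / len(kernels))
    weights = np.asarray(weights, dtype=float)
    if weights.shape != (len(kernels),) or np.any(weights < 0):
        raise ValueError("need one non-negative weight per kernel")
    weights = weights / weights.sum()  # normalise so the weights sum to 1
    return sum(w * K for w, K in zip(weights, kernels))


# Hypothetical data: the same 6 samples measured in two views.
rng = np.random.default_rng(0)
expression = rng.normal(size=(6, 20))   # assumed view 1 (illustrative only)
methylation = rng.normal(size=(6, 15))  # assumed view 2 (illustrative only)

K_fused = fuse_kernels([
    rbf_kernel(expression, gamma=1.0 / 20),
    rbf_kernel(methylation, gamma=1.0 / 15),
])
```

Because a non-negative weighted sum of positive semidefinite matrices is itself positive semidefinite, `K_fused` could be passed to any learner that accepts a precomputed kernel. Multiple kernel learning methods would learn the weights from data rather than fixing them as done here.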
- Subjects :
- 0301 basic medicine
Computer science
multi-omics data
02 engineering and technology
computer.software_genre
Machine learning
matrix factorization
Models, Biological
Matrix decomposition
03 medical and health sciences
0202 electrical engineering, electronic engineering, information engineering
Animals
Humans
Gene Regulatory Networks
Cluster analysis
multiple kernel learning
Molecular Biology
data integration
Biological data
business.industry
Dimensionality reduction
Systems Biology
deep learning
Tree (data structure)
030104 developmental biology
Kernel method
network fusion
020201 artificial intelligence & image processing
Artificial intelligence
Data mining
business
computer
Heterogeneous network
random forest
Information Systems
Details
- ISSN :
- 1477-4054
- Volume :
- 19
- Issue :
- 2
- Database :
- OpenAIRE
- Journal :
- Briefings in bioinformatics
- Accession number :
- edsair.doi.dedup.....4b8b16b6fe7244ae5c202fedd9c7d4a3