A comparison of graph- and kernel-based –omics data integration algorithms for classifying complex traits

Authors :: Kang K. Yan
Hongyu Zhao
Herbert Pang
Source :: BMC Bioinformatics, Vol 18, Iss 1, Pp 1-13 (2017)
Publication Year :: 2017
Publisher :: BMC, 2017.
Abstract:
Background: High-throughput sequencing data are widely collected and analyzed in the study of complex diseases, with the aim of improving human health. Well-studied algorithms mostly deal with a single data source and cannot fully exploit the potential of multi-omics data. To provide a holistic understanding of human health and disease, it is necessary to integrate multiple data sources. Several integration algorithms have been proposed; however, a comprehensive comparison of data integration algorithms for the classification of binary traits is currently lacking.
Results: In this paper, we focus on two common classes of integration algorithms: graph-based algorithms, which represent subjects as nodes and the relationships between them as edges, and kernel-based algorithms, which generate a classifier in feature space. We provide a comprehensive comparison of their performance in terms of several measures of classification accuracy and computation time. Seven integration algorithms are compared and evaluated on hypertension and two cancer data sets: graph-based semi-supervised learning, graph sharpening integration, composite association network, Bayesian network, semi-definite programming-support vector machine (SDP-SVM), relevance vector machine (RVM), and Ada-boost relevance vector machine. In general, kernel-based algorithms create more complex models and require longer computation time, but they tend to perform better than graph-based algorithms. Graph-based algorithms, in turn, have the advantage of being computationally faster.
Conclusions: The empirical results demonstrate that composite association network, relevance vector machine, and Ada-boost RVM are the better performers. We provide recommendations on how to choose an appropriate algorithm for integrating data from multiple sources.

Subjects :: Bayesian network
Relevance vector machine
Graph-based semi-supervised learning
Semi-definite programming (SDP)-support vector machine
Multiple data sources
Classification
Computer applications to medicine. Medical informatics
R858-859.7
Biology (General)
QH301-705.5

Details

Language :: English
ISSN :: 14712105
Volume :: 18
Issue :: 1
Database :: Directory of Open Access Journals
Journal :: BMC Bioinformatics
Publication Type :: Academic Journal
Accession number :: edsdoj.846269b1a4d6c83a374ff9031a82a
Document Type :: article
Full Text :: https://doi.org/10.1186/s12859-017-1982-4
