Toward Scalable Machine Learning and Data Mining: the Bioinformatics Case

Authors :
Faghri, Faraz
Hashemi, Sayed Hadi
Babaeizadeh, Mohammad
Nalls, Mike A.
Sinha, Saurabh
Campbell, Roy H.
Publication Year :
2017

Abstract

In an effort to overcome the data deluge in computational biology and bioinformatics and to facilitate bioinformatics research in the era of big data, we identify some of the most influential algorithms that have been widely used in the bioinformatics community. These top data mining and machine learning algorithms cover classification, clustering, regression, graphical model-based learning, and dimensionality reduction. The goal of this study is to guide the focus of scalable computing experts in the endeavor of applying new storage and scalable computation designs to bioinformatics algorithms that merit their attention most, following the engineering maxim of "optimize the common case".

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.1710.00112
Document Type :
Working Paper