Mining features for biomedical data using clustering tree ensembles.

Authors :
Pliakos K
Vens C
Source :
Journal of biomedical informatics [J Biomed Inform] 2018 Sep; Vol. 85, pp. 40-48. Date of Electronic Publication: 2018 Jul 29.
Publication Year :
2018

Abstract

The volume of biomedical data available to the machine learning community grows very rapidly. A reasonable question is how informative these data really are, or how discriminant the features describing the data instances are. Several biomedical datasets suffer from a lack of variance in the instance representation or, even worse, contain instances with identical features and different class labels. This directly affects the performance of machine learning algorithms, as well as the ability to interpret their results. In this article, we emphasize the aforementioned problem and propose a target-informed feature induction method based on tree ensemble learning. The method brings more variance into the data representation, thereby potentially increasing the predictive performance of a learner applied to the induced features. The contribution of this article is twofold. First, a problem affecting the quality of biomedical data is highlighted; second, a method to handle that problem is proposed. The efficiency of the presented approach is validated on multi-target prediction tasks. The obtained results indicate that the proposed approach is able to boost the discrimination between the data instances and increase the predictive performance.
(Copyright © 2018 Elsevier Inc. All rights reserved.)

Details

Language :
English
ISSN :
1532-0480
Volume :
85
Database :
MEDLINE
Journal :
Journal of biomedical informatics
Publication Type :
Academic Journal
Accession number :
30012356
Full Text :
https://doi.org/10.1016/j.jbi.2018.07.012