Interpretable Ensembles of Classifiers for Uncertain Data With Bioinformatics Applications.

Authors :
Maia MRH
Plastino A
Freitas A
de Magalhaes JP
Source :
IEEE/ACM transactions on computational biology and bioinformatics [IEEE/ACM Trans Comput Biol Bioinform] 2023 May-Jun; Vol. 20 (3), pp. 1829-1841. Date of Electronic Publication: 2023 Jun 05.
Publication Year :
2023

Abstract

Data uncertainty remains a challenging issue in many applications, but few classification algorithms can effectively cope with it. An ensemble approach for uncertain categorical features has recently been proposed, achieving promising results. It biases the sampling of features for each model in an ensemble so that less uncertain features are more likely to be sampled. Here we extend this idea of biased sampling and propose two new approaches: one for selecting training instances for each model in an ensemble and another for sampling the features to be considered when splitting a node during Random Forest training. We applied these approaches to classify ageing-related genes and to predict drugs' side effects based on uncertain features representing protein-protein and protein-chemical interactions. We show that ensembles based on our proposed approaches achieve better predictive performance. In particular, our proposed approaches improved the performance of a Random Forest based on the most sophisticated approach for handling uncertain data in ensembles of this kind. Furthermore, we propose two new approaches for interpreting an ensemble of Naive Bayes classifiers and analyse their results on our datasets of ageing-related genes and drugs' side effects.
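
The biased feature sampling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the per-feature uncertainty scores in [0, 1], the weighting `1 - uncertainty`, and the function name `biased_feature_sample` are all assumptions made for illustration, since the abstract does not state how uncertainty is measured or how sampling probabilities are derived from it.

```python
import random


def biased_feature_sample(uncertainty, n_select, rng):
    """Draw n_select distinct feature indices, favouring less uncertain features.

    uncertainty: hypothetical per-feature scores in [0, 1] (1 = most uncertain).
    The weight 1 - uncertainty is an assumed choice, not taken from the paper.
    """
    if not 0 < n_select <= len(uncertainty):
        raise ValueError("n_select must be between 1 and the number of features")
    weights = [1.0 - u for u in uncertainty]
    remaining = list(range(len(uncertainty)))
    chosen = []
    for _ in range(n_select):
        current = [weights[i] for i in remaining]
        if sum(current) <= 0:
            raise ValueError("not enough features with non-zero weight")
        # Weighted draw of one index, then remove it so it cannot be drawn again.
        pick = rng.choices(remaining, weights=current, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


# Example: four features, the first two far less uncertain than the last two.
rng = random.Random(42)
feature_subset = biased_feature_sample([0.05, 0.10, 0.90, 0.95], 2, rng)
print(feature_subset)
```

Under the same assumptions, the weighted draw could be reused with per-instance scores to select training instances, or called at each node to pick candidate split features, which are the two extensions the abstract names.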

Details

Language :
English
ISSN :
1557-9964
Volume :
20
Issue :
3
Database :
MEDLINE
Journal :
IEEE/ACM transactions on computational biology and bioinformatics
Publication Type :
Academic Journal
Accession number :
36318566
Full Text :
https://doi.org/10.1109/TCBB.2022.3218588