An improved ensemble learning machine for biological activity prediction of tyrosine kinase inhibitors.
- Source :
- Journal of Chemometrics. Apr 2015, Vol. 29 Issue 4, p213-223. 11p.
- Publication Year :
- 2015
Abstract
- Boosting is one of the most important strategies in ensemble learning because of its ability to improve the stability and performance of weak learners. It is nonparametric, multivariate, fast and interpretable but is not robust against outliers. To enhance its prediction accuracy as well as immunize it against outliers, a modified version of a boosting algorithm (AdaBoost R2) was developed and called AdaBoost R3. In the sampling step, extremum samples were added to the boosting set. In the robustness step, a modified Huber loss function was applied to overcome the outlier problem. In the output step, a deterministic threshold was used to guarantee that bad predictions do not participate in the final output. The performance of the modified algorithm was investigated with two anticancer data sets of tyrosine kinase inhibitors, and the mechanism of inhibition was studied using the relative weighted variable importance procedure. Investigating the effect of the base learner's strength reveals that boosting is only successful using the classification and regression tree method (a weak to moderate learner) and does not have a significant effect using the radial basis functions partial least squares method (a strong base learner). Copyright © 2015 John Wiley & Sons, Ltd. [ABSTRACT FROM AUTHOR]
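
The abstract names two standard ingredients, the AdaBoost R2 reweighting step and a Huber-type loss, without giving the details of the AdaBoost R3 modifications. The sketch below illustrates only the textbook versions, as an assumption-laden aid to reading the abstract: `adaboost_r2_update` is one reweighting round of the original AdaBoost.R2 with a linear loss, and `huber_loss` is the unmodified Huber loss. The function names and the `delta` parameter are hypothetical, and nothing here reproduces the authors' modified loss, sampling step, or output threshold.

```python
def huber_loss(residual, delta=1.0):
    """Standard Huber loss (not the paper's modified version).

    Quadratic for |residual| <= delta, linear beyond it, so large
    residuals (possible outliers) are penalized less than with squared error.
    """
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def adaboost_r2_update(weights, abs_errors):
    """One reweighting round of textbook AdaBoost.R2 with the linear loss.

    weights    : current sample weights, assumed to sum to 1
    abs_errors : absolute prediction errors of the current base learner
    Returns (new_weights, beta); a smaller beta means a better learner.
    """
    max_err = max(abs_errors)
    if max_err == 0:
        # Perfect fit: nothing to reweight.
        return list(weights), 0.0

    # Scale each error to [0, 1] by the largest error in this round.
    losses = [e / max_err for e in abs_errors]
    avg_loss = sum(w * l for w, l in zip(weights, losses))
    if avg_loss == 0:
        return list(weights), 0.0
    if avg_loss >= 0.5:
        raise ValueError("average loss >= 0.5: AdaBoost.R2 stops boosting here")

    beta = avg_loss / (1.0 - avg_loss)
    # Well-predicted samples (loss near 0) are multiplied by beta < 1,
    # so poorly predicted samples gain relative weight.
    raw = [w * beta ** (1.0 - l) for w, l in zip(weights, losses)]
    total = sum(raw)
    return [r / total for r in raw], beta


if __name__ == "__main__":
    # Four samples, one of which is badly predicted (a possible outlier).
    new_w, beta = adaboost_r2_update([0.25] * 4, [0.0, 0.0, 0.0, 4.0])
    print(new_w, beta)
    print(huber_loss(0.5), huber_loss(3.0))
```

In the toy run, the single badly predicted sample ends up carrying half of the total weight after one round, which illustrates why plain boosting is sensitive to outliers; a loss that grows only linearly for large residuals, such as the Huber loss, is one common way to dampen that effect.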
Details
- Language :
- English
- ISSN :
- 08869383
- Volume :
- 29
- Issue :
- 4
- Database :
- Academic Search Index
- Journal :
- Journal of Chemometrics
- Publication Type :
- Academic Journal
- Accession number :
- 101965617
- Full Text :
- https://doi.org/10.1002/cem.2698