BoostVHT : Boosting distributed streaming decision trees

Authors :
Vasiloudis, Theodore
Beligianni, Foteini
De Francisci Morales, G.
Publication Year :
2017

Abstract

Online boosting improves the accuracy of classifiers for unbounded streams of data by chaining them into an ensemble. Due to its sequential nature, boosting has proven hard to parallelize, even more so in the online setting. This paper introduces BoostVHT, a technique to parallelize online boosting algorithms. Our proposal leverages a recently developed model-parallel learning algorithm for streaming decision trees as a base learner. This design neatly separates boosting the model from training it. As a result, BoostVHT provides a flexible learning framework that can employ any existing online boosting algorithm while leveraging the computing power of modern parallel and distributed cluster environments. We implement our technique on Apache SAMOA, an open-source platform for mining big data streams that can be run on several distributed execution engines, and demonstrate order-of-magnitude speedups compared to the state of the art.

Details

Database :
OAIster
Notes :
English
Publication Type :
Electronic Resource
Accession number :
edsoai.on1234945781
Document Type :
Electronic Resource
Full Text :
https://doi.org/10.1145/3132847.3132974