BLB-gcForest: A High-Performance Distributed Deep Forest With Adaptive Sub-Forest Splitting.

Authors :
Chen, Zexi
Wang, Ting
Cai, Haibin
Mondal, Subrota Kumar
Sahoo, Jyoti Prakash
Source :
IEEE Transactions on Parallel & Distributed Systems; Nov2022, Vol. 33 Issue 11, p3141-3152, 12p
Publication Year :
2022

Abstract

As an emulous alternative to deep neural networks, Deep Forest emerges with features like low complexity, fewer hyper-parameters, and good robustness, which are predominantly desired in distributed computing applications and ecosystems. Recently, an efficient distributed Deep Forest system, named ForestLayer, was proposed, designing a fine-grained sub-Forest-based task-parallel algorithm to improve the parallel computing efficiency of Deep Forest. However, the sub-Forest splitting of ForestLayer is static and one-off, with no adaptability to the computing environment, even though the splitting granularity has a significant impact on system performance. To further improve the computing efficiency and scalability of the distributed Deep Forest, in this paper, we propose a novel distributed Deep Forest algorithm, named BLB-gcForest (Bag of Little Bootstraps-gcForest), which augments the gcForest (multi-Grained Cascade Forest) approach for constructing Deep Forest. BLB-gcForest carries out parallel computation for each tree in sub-Forests at a finer parallel granularity and integrates with the Bag of Little Bootstraps (BLB) mechanism to reduce the large number of feature instances transmitted to Cascade Forest Layers, improving both computation efficiency and communication efficiency. Moreover, to solve the problem of the forest splitting granularity, we further design an adaptive sub-Forest splitting algorithm to maximize resource utilization for parallel computation of each sub-Forest. Experimental results on four well-known large-scale datasets, namely YEAST, LETTER, MNIST, and CIFAR10, show that the training efficiency of BLB-gcForest achieves up to 20.3x and 1.64x speedups compared with the state-of-the-art gcForest and ForestLayer, respectively, while guaranteeing higher accuracy and better robustness. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
1045-9219
Volume :
33
Issue :
11
Database :
Complementary Index
Journal :
IEEE Transactions on Parallel & Distributed Systems
Publication Type :
Academic Journal
Accession number :
157073365
Full Text :
https://doi.org/10.1109/TPDS.2021.3133544