
Resource efficiency optimization of big data mining algorithms under multi-MapReduce job collaboration.

Authors :
廖彬
张陶
于炯
黄静莱
国冰磊
刘炎
Source :
Application Research of Computers / Jisuanji Yingyong Yanjiu. May 2020, Vol. 37 Issue 5, p1321-1325. 5p.
Publication Year :
2020

Abstract

Every MapReduce job independently performs a series of complex operations such as task scheduling and resource allocation. As a result, when a single algorithm is implemented as multiple cooperating MapReduce jobs, those jobs incur a large amount of redundant disk I/O and repeated resource requests, which lowers resource utilization during computation. Big data mining algorithms are usually split into several MapReduce jobs. Taking the ItemBased algorithm as an example, this paper analyzes the resource efficiency of mining algorithms in the multi-MapReduce-job collaboration scenario. It proposes an ItemBased algorithm built on DistributedCache, which caches the I/O data exchanged between multiple MapReduce jobs, overcomes the limitation that jobs run independently of one another, and reduces the waiting delay between Map and Reduce tasks. Experimental results show that DistributedCache improves the data reading speed of MapReduce jobs, and that the algorithm restructured with DistributedCache greatly reduces the waiting delay between Map and Reduce tasks and improves resource efficiency by more than three times. [ABSTRACT FROM AUTHOR]

Details

Language :
Chinese
ISSN :
10013695
Volume :
37
Issue :
5
Database :
Academic Search Index
Journal :
Application Research of Computers / Jisuanji Yingyong Yanjiu
Publication Type :
Academic Journal
Accession number :
143238095
Full Text :
https://doi.org/10.19734/j.issn.1001-3695.2018.11.079