Resource Efficiency Optimization of Big Data Mining Algorithms under Multi-MapReduce Job Collaboration.
- Source :
- Application Research of Computers / Jisuanji Yingyong Yanjiu. May 2020, Vol. 37 Issue 5, p1321-1325. 5p.
- Publication Year :
- 2020
Abstract
- Because every MapReduce job independently performs a series of complex operations such as task scheduling and resource allocation, multiple MapReduce jobs coordinated by the same algorithm incur a large amount of redundant disk I/O and repeated resource requests, which leads to inefficient resource utilization during job execution. Big data mining algorithms are usually divided into several MapReduce jobs. Taking the ItemBased algorithm as an example, this paper analyzes the resource efficiency of mining algorithms in the multi-MapReduce-job collaboration scenario. It proposes an ItemBased algorithm based on DistributedCache, which uses DistributedCache to cache the I/O data passed between MapReduce jobs, overcomes the isolation between jobs, and reduces the waiting delay between Map and Reduce tasks. The experimental results show that DistributedCache improves the data reading speed of MapReduce jobs, and that the algorithm restructured with DistributedCache greatly reduces the waiting delay between Map and Reduce tasks and improves resource efficiency by more than three times. [ABSTRACT FROM AUTHOR]
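- Illustrative sketch (not from the paper): the abstract's central idea, handing intermediate data from one job to the next through a cache instead of a disk round trip, can be mimicked in a few lines of plain Python. This is a toy simulation under stated assumptions, not Hadoop code and not the authors' implementation: the two "jobs", the `RATINGS` data, and every function name are hypothetical, and an in-memory dict stands in for DistributedCache.

```python
import json
import os
import tempfile
from collections import defaultdict
from itertools import combinations

# Hypothetical toy input: (user, item) interaction pairs.
RATINGS = [
    ("u1", "a"), ("u1", "b"),
    ("u2", "a"), ("u2", "b"), ("u2", "c"),
    ("u3", "b"), ("u3", "c"),
]


def job1_group_by_user(ratings):
    """Job 1 (assumed split): collect the items each user interacted with."""
    grouped = defaultdict(list)
    for user, item in ratings:
        grouped[user].append(item)
    return {user: sorted(items) for user, items in grouped.items()}


def job2_cooccurrence(user_items):
    """Job 2 (assumed split): count item pairs that share a user."""
    counts = defaultdict(int)
    for items in user_items.values():
        for a, b in combinations(items, 2):
            counts[(a, b)] += 1
    return dict(counts)


def run_via_disk(ratings):
    """Baseline: job 1 writes its output to a file and job 2 reads it back."""
    intermediate = job1_group_by_user(ratings)
    fd, path = tempfile.mkstemp(suffix=".json")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(intermediate, f)
        with open(path) as f:
            reloaded = json.load(f)
    finally:
        os.remove(path)
    return job2_cooccurrence(reloaded)


def run_via_cache(ratings, cache):
    """Cached variant: job 2 reads job 1's output from a shared cache."""
    cache["user_items"] = job1_group_by_user(ratings)
    return job2_cooccurrence(cache["user_items"])


if __name__ == "__main__":
    print(run_via_disk(RATINGS))
    print(run_via_cache(RATINGS, {}))
```

  Both variants compute the same item co-occurrence counts; only the hand-off between the two stages differs. The sketch says nothing about the speedup reported in the abstract, which depends on the cluster and workload.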
Details
- Language :
- Chinese
- ISSN :
- 10013695
- Volume :
- 37
- Issue :
- 5
- Database :
- Academic Search Index
- Journal :
- Application Research of Computers / Jisuanji Yingyong Yanjiu
- Publication Type :
- Academic Journal
- Accession number :
- 143238095
- Full Text :
- https://doi.org/10.19734/j.issn.1001-3695.2018.11.079