X10-enabled MapReduce

Authors :
Han Dong
David Grove
Shujia Zhou
Source :
PGAS
Publication Year :
2010
Publisher :
ACM, 2010.

Abstract

The MapReduce framework has become a popular and powerful tool for processing large datasets in parallel across a cluster of computing nodes [1]. Many implementations of MapReduce exist, the most popular being Hadoop, which is written in Java [5]. However, these implementations either rely on third-party file systems for communication between compute nodes or are difficult to implement with socket programming or communication libraries such as MPI. To address these challenges, we investigated using the X10 language to implement MapReduce and tested it with the word-count use case. The key performance factor in implementing MapReduce is data movement between compute nodes. Because X10 has built-in constructs for inter-node communication, such as distributed arrays [2], this major challenge of MapReduce implementations is easily addressed. We tested two main implementations: the first uses the HashMap data structure, and the second uses a Rail whose elements are string and integer pairs. The performance of these two implementations is analyzed and discussed.
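
As an illustration of the two data-structure choices the abstract mentions, the sketch below runs a word count in a single process using Python rather than X10. It is not the authors' code: the function names (`map_words`, `reduce_with_hashmap`, `reduce_with_pair_array`, `word_count`) are hypothetical, a Python `dict` stands in for X10's HashMap, a list of `(word, count)` tuples stands in for a Rail of string and integer pairs, and the inter-node data movement the paper focuses on is not modeled at all.

```python
from collections import defaultdict


def map_words(chunk):
    """Map step: emit a (word, 1) pair for every word in one text chunk."""
    return [(word, 1) for word in chunk.lower().split()]


def reduce_with_hashmap(pairs):
    """Variant 1 (assumed analogue of the HashMap version):
    accumulate counts in a hash map keyed by word."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


def reduce_with_pair_array(pairs):
    """Variant 2 (assumed analogue of the Rail version):
    sort a flat array of (word, count) pairs, then merge equal neighbours."""
    merged = []
    for word, n in sorted(pairs):
        if merged and merged[-1][0] == word:
            merged[-1] = (word, merged[-1][1] + n)
        else:
            merged.append((word, n))
    return merged


def word_count(chunks, reducer):
    """Run the map step over every chunk, then hand all pairs to one reducer."""
    pairs = [pair for chunk in chunks for pair in map_words(chunk)]
    return reducer(pairs)


if __name__ == "__main__":
    chunks = ["the quick fox", "the lazy dog the end"]
    print(word_count(chunks, reduce_with_hashmap))
    print(word_count(chunks, reduce_with_pair_array))
```

Both reducers produce the same counts. The hash-map variant does one lookup per pair, while the pair-array variant pays for a sort but keeps the data in a flat, contiguous layout, which is one plausible reason such a layout might be compared when pairs have to be shipped between nodes.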

Details

Database :
OpenAIRE
Journal :
Proceedings of the Fourth Conference on Partitioned Global Address Space Programming Model
Accession number :
edsair.doi...........c61d11a89be311c12adec17ea6f4be16