Improving the energy efficiency of data-intensive applications running on clusters.

Authors :
Liu, Weifeng
Zhou, Jie
Gong, Bin
Dai, Hongjun
Guo, Meng
Source :
International Journal of Parallel, Emergent & Distributed Systems; May2020, Vol. 35 Issue 3, p246-259, 14p
Publication Year :
2020

Abstract

Cloud computing is growing rapidly as an alternative to traditional computing architectures, although it is generally built on models such as cluster computing. Supercomputers are becoming increasingly powerful, giving scientists a more in-depth understanding of the world. At the same time, clusters of commodity servers have become mainstream in the IT industry, powering not only large Internet services but also a growing number of data-intensive scientific applications, such as MPI-based deep learning applications. To reduce energy costs, increasing effort is being devoted to improving the energy efficiency of HPC systems. Because I/O accesses account for a large portion of the execution time of data-intensive applications, it is critical to design energy-aware parallel I/O functions to address the challenges of HPC energy efficiency. The Message Passing Interface (MPI), the de facto standard for designing parallel applications in cluster environments, is widely used in high-performance computing; obtaining energy consumption information for MPI applications is therefore critical for improving the energy efficiency of HPC systems. In this work we first present our energy measurement tool, a software framework that eases energy data collection in cluster environments. We then present an approach for optimising the energy efficiency of parallel I/O operations. The energy scheduling algorithm is evaluated on a cluster, where we try to optimise the energy efficiency of BTIO. Before the optimisation, we first modelled execution time and energy consumption. In BTIO, a three-dimensional array is partitioned in a block-tridiagonal pattern and distributed across a square number of processes. Each process is responsible for several subsets of the entire data set; the number of subsets equals the square root of the number of participating processes. Together, the parameters of the elemental operations define a multidimensional configuration space.
We first modelled the elemental operations. The configuration space of an elemental operation is the cross product of the values of its individual parameters. We sampled points evenly distributed across the configuration space and measured their execution time and energy consumption with MEMT. We divided the sampled points into a training set and a test set: the majority (75%) were used to iteratively train the execution time and energy models, and the remaining points were used to test them. The model of a collective I/O operation is the combination of these elemental models. The two figures show the modelled versus measured execution time and energy consumption of 250 BTIO operations. We can see that the models behave well. [ABSTRACT FROM AUTHOR]
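
The sampling procedure described in the abstract (a configuration space formed as a cross product of parameter values, evenly distributed sample points, and a 75%/25% training/test split) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the parameter names and values in `PARAMS` are hypothetical because the abstract does not list the real ones, the even sampling is done by thinning each axis to a few evenly spaced values, and the split uses a seeded random shuffle because the abstract does not say how points were assigned to each set. Measuring each point with MEMT and fitting the models are not shown.

```python
import itertools
import random

# Hypothetical parameters of one elemental I/O operation. The abstract does
# not name the real parameters, so these names and values are placeholders.
PARAMS = {
    "request_bytes": [2**k for k in range(10, 18)],  # 1 KiB .. 128 KiB
    "num_processes": [1, 4, 9, 16],                  # square counts, as in BTIO
    "cpu_freq_mhz": [1200, 1600, 2000, 2400],
}


def configuration_space(params):
    """Return the cross product of all parameter values as a list of dicts."""
    names = list(params)
    return [dict(zip(names, values))
            for values in itertools.product(*(params[n] for n in names))]


def thin(values, k):
    """Keep at most k evenly spaced values from a list, including both ends."""
    n = len(values)
    if k >= n:
        return list(values)
    if k == 1:
        return [values[0]]
    return [values[round(i * (n - 1) / (k - 1))] for i in range(k)]


def sample_evenly(params, per_axis):
    """Sample a regular grid: thin every axis, then take the cross product."""
    thinned = {name: thin(values, per_axis) for name, values in params.items()}
    return configuration_space(thinned)


def split_train_test(points, train_fraction=0.75, seed=0):
    """Shuffle the points (assumed strategy) and split them into two sets."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


space = configuration_space(PARAMS)          # every possible configuration
samples = sample_evenly(PARAMS, per_axis=4)  # evenly spaced subset to measure
train, test = split_train_test(samples)      # 75% for training, 25% for testing
```

With these placeholder values the full space holds 8 x 4 x 4 = 128 configurations, and thinning each axis to four values leaves 64 points to measure. In practice each sampled point would be run and measured before the split, so that both sets carry observed execution time and energy.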

Details

Language :
English
ISSN :
17445760
Volume :
35
Issue :
3
Database :
Complementary Index
Journal :
International Journal of Parallel, Emergent & Distributed Systems
Publication Type :
Academic Journal
Accession number :
143635850
Full Text :
https://doi.org/10.1080/17445760.2018.1455835