Leveraging Machine Learning for Anticipatory Data Delivery in Extreme Scale In-situ Workflows

Authors :
Manish Parashar
Philip E. Davis
Pradeep Subedi
Source :
CLUSTER
Publication Year :
2019
Publisher :
IEEE, 2019.

Abstract

Extreme-scale scientific workflows are composed of multiple applications that exchange data at runtime. Several data-related challenges limit the potential impact of such workflows. While data staging and in-situ models of execution have emerged as approaches to address data-related costs at extreme scales, increasing data volumes and complex data exchange patterns reduce the effectiveness of such approaches. In this paper, we design and implement DESTINY, an autonomic data delivery mechanism for staging-based in-situ workflows. DESTINY dynamically learns the data access patterns of scientific workflow applications and leverages these patterns to decrease data access costs. Specifically, DESTINY uses machine learning techniques to anticipate future data accesses, then proactively packages and delivers the data necessary to satisfy these requests as close to the consumer as possible. When data staging processes and consumer processes are colocated, it removes the need for inter-process communication by making these data available to the consumer as shared-memory objects. When consumer processes reside on nodes other than staging nodes, the data is packaged and stored in the format the client is likely to request in the future. This amortizes the expensive data discovery and assembly operations typically associated with data staging. We experimentally evaluate the performance and scalability of DESTINY on leadership-class platforms using synthetic applications and the S3D combustion workflow. We demonstrate that DESTINY is scalable and can achieve a reduction of up to 75% in read response time compared to an in-memory staging service for production scientific workflows.
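
The abstract does not say which machine learning technique DESTINY uses, so the following is only an illustrative sketch of the general idea of anticipatory data delivery, not the authors' method. It assumes a first-order Markov model over access keys as a stand-in predictor. The names `AccessPredictor` and `simulate` and the variable names in the trace are hypothetical.

```python
from collections import Counter, defaultdict


class AccessPredictor:
    """First-order Markov model over a sequence of data-access keys.

    Assumption: this simple model stands in for whatever learning
    technique the paper actually uses, which the abstract does not name.
    """

    def __init__(self):
        # transitions[prev][nxt] counts how often `nxt` followed `prev`.
        self.transitions = defaultdict(Counter)
        self.last = None

    def observe(self, key):
        """Record one access and update the transition counts."""
        if self.last is not None:
            self.transitions[self.last][key] += 1
        self.last = key

    def predict(self):
        """Return the most likely next key, or None if nothing is known."""
        followers = self.transitions.get(self.last)
        if not followers:
            return None
        return followers.most_common(1)[0][0]


def simulate(trace):
    """Replay a trace and count reads that a prediction had already staged."""
    predictor = AccessPredictor()
    staged = None  # key prepared ahead of time for the consumer
    hits = 0
    for key in trace:
        if key == staged:
            hits += 1  # the data was packaged before the request arrived
        predictor.observe(key)
        staged = predictor.predict()
    return hits


if __name__ == "__main__":
    # Hypothetical consumer that reads three variables every time step.
    trace = ["temp", "pressure", "velocity"] * 4
    print(simulate(trace), "of", len(trace), "reads were staged in advance")
```

In this toy trace the predictor needs one full pass over the three variables before it can anticipate anything, after which every read finds its data already staged. A real staging service would also have to decide where to place the data and in what layout, which this sketch leaves out.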

Details

Database :
OpenAIRE
Journal :
2019 IEEE International Conference on Cluster Computing (CLUSTER)
Accession number :
edsair.doi...........bd904117de3d00199f1b8da7d3b73db0
Full Text :
https://doi.org/10.1109/cluster.2019.8891003