1. A structure-aware algorithm for fault-tolerant scheduling of scientific workflows.
- Author
- Masoumi, Maryam and Motallebi, Hassan
- Subjects
- *WORKFLOW, *COST control, *ALGORITHMS, *SCHEDULING, *PETRI nets, *FAULT tolerance (Engineering)
- Abstract
Here, we propose a fault-tolerant workflow scheduling algorithm that combines basic redundancy techniques to reduce execution time by minimizing redundancy overhead. We propose a graph-theory-based divide-and-conquer approach for selecting fault-tolerance strategies for workflow tasks. The appropriate strategy for each task is determined according to the runtime situation and the position of the task in the graph. The main idea of the proposed algorithm is that resources are apportioned among concurrently executing tasks such that more replicas are assigned to the tasks that benefit more from extra replicas. We use the concept of a concurrency graph to find the idle periods of resources, which are then used to process additional task replicas. We also propose an opportunistic method for executing extra replicas of tasks when some resources become idle. Furthermore, we propose a new mapping order scheme for ordering task replicas on resources. The proposed approach achieves a significant performance improvement over existing approaches, especially when few resources are enrolled in order to reduce cost. [ABSTRACT FROM AUTHOR]
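The idea of giving more replicas to the tasks that benefit most can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify a benefit measure, so the sketch assumes each task has a hypothetical weight and an independent per-replica failure probability, and the name `allocate_replicas` is invented for illustration. Every task first receives one replica, and each remaining resource goes greedily to the task with the largest marginal gain.

```python
import heapq


def allocate_replicas(tasks, num_resources):
    """Greedily apportion resources among concurrently executing tasks.

    tasks: dict mapping task name -> (weight, fail_prob). Both values are
           assumptions of this sketch, not quantities defined in the abstract.
    Returns a dict mapping task name -> number of replicas.
    """
    if not tasks:
        return {}
    if num_resources < len(tasks):
        raise ValueError("need at least one resource per task")

    # Every task gets one replica to start with.
    replicas = {name: 1 for name in tasks}

    def marginal_gain(name):
        # Assumed benefit of k replicas: weight * (1 - fail_prob ** k).
        # Adding one more replica therefore gains weight * p**k * (1 - p).
        weight, p = tasks[name]
        k = replicas[name]
        return weight * (p ** k) * (1 - p)

    # heapq is a min-heap, so gains are stored negated.
    heap = [(-marginal_gain(name), name) for name in tasks]
    heapq.heapify(heap)

    # Hand out the remaining resources one at a time.
    for _ in range(num_resources - len(tasks)):
        _, name = heapq.heappop(heap)
        replicas[name] += 1
        heapq.heappush(heap, (-marginal_gain(name), name))

    return replicas


if __name__ == "__main__":
    demo = {"a": (10.0, 0.5), "b": (1.0, 0.1)}
    print(allocate_replicas(demo, 4))
```

Because the assumed gain shrinks with every added replica, the greedy choice spreads resources only until another task's gain becomes larger. The structure-aware aspects described in the abstract (task position in the graph, concurrency graph, idle periods) are not modeled here.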
- Published
- 2022