Back to Search Start Over

Towards Optimal Multi-Level Checkpointing.

Authors :
Benoit, Anne
Cavelan, Aurelien
Le Fevre, Valentin
Robert, Yves
Sun, Hongyang
Source :
IEEE Transactions on Computers. Jul2017, Vol. 66 Issue 7, p1212-1226. 15p.
Publication Year :
2017

Abstract

We provide a framework to analyze multi-level checkpointing protocols, by formally defining a k<alternatives><inline-graphic xlink:href="benoit-ieq1-2643660.gif"/> </alternatives>-level checkpointing pattern. We provide a first-order approximation to the optimal checkpointing period, and show that the corresponding overhead is in the order of \sum _{\ell =1}^{k}\sqrt{2\lambda _\ell C_\ell}<alternatives> <inline-graphic xlink:href="benoit-ieq2-2643660.gif"/></alternatives>, where \lambda _\ell$<alternatives> <inline-graphic xlink:href="benoit-ieq3-2643660.gif"/></alternatives> is the error rate at level  $\ell$<alternatives> <inline-graphic xlink:href="benoit-ieq4-2643660.gif"/></alternatives>, and $C_\ell$<alternatives><inline-graphic xlink:href="benoit-ieq5-2643660.gif"/> </alternatives> the checkpointing cost at level $\ell$ <alternatives><inline-graphic xlink:href="benoit-ieq6-2643660.gif"/></alternatives>. This nicely extends the classical Young/Daly formula on single-level checkpointing. Furthermore, we are able to fully characterize the shape of the optimal pattern (number and positions of checkpoints), and we provide a dynamic programming algorithm to determine the optimal subset of levels to be used. Finally, we perform simulations to check the accuracy of the theoretical study and to confirm the optimality of the subset of levels returned by the dynamic programming algorithm. The results nicely corroborate the theoretical study, and demonstrate the usefulness of multi-level checkpointing with the optimal subset of levels. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
00189340
Volume :
66
Issue :
7
Database :
Academic Search Index
Journal :
IEEE Transactions on Computers
Publication Type :
Academic Journal
Accession number :
123544223
Full Text :
https://doi.org/10.1109/TC.2016.2643660