Understanding Aprun Use Patterns

Authors :
Lin, Hwa-Chun Wendy
Publication Year :
2009

Abstract

On the Cray XT, aprun is the command that launches an application on a set of compute nodes reserved through the Application Level Placement Scheduler (ALPS). At the National Energy Research Scientific Computing Center (NERSC), interactive aprun is disabled; that is, invocations of aprun have to go through the batch system. Batch scripts can, and often do, contain several apruns that either use subsets of the reserved nodes in parallel or use all reserved nodes in consecutive apruns. To better understand how NERSC users run on the XT, it is necessary to associate aprun information with jobs. This is more challenging than it sounds. In this paper, we describe those challenges and how we solved them to produce daily per-job reports for completed apruns. We also describe additional uses of the data, e.g., adjusting charging policy or associating node failures with jobs/users, and plans for enhancements.

Details

Database :
OAIster
Notes :
application/pdf
Publication Type :
Electronic Resource
Accession number :
edsoai.on1287592741
Document Type :
Electronic Resource