
A Lightweight Submission Frontend Toolkit HepJob.

Authors :
Doglioni, C.
Kim, D.
Stewart, G.A.
Silvestris, L.
Jackson, P.
Kamleh, W.
Jiang, Xiaowei
Du, Ran
Shi, Jingyan
Zou, Jiaheng
Hu, Qingbao
Source :
EPJ Web of Conferences; 11/16/2020, Vol. 245, p1-8, 8p
Publication Year :
2020

Abstract

A typical HEP computing center runs at least one batch system. At IHEP (Institute of High Energy Physics, Chinese Academy of Sciences), for example, we have used three: PBS, HTCondor and SLURM. After running PBS as the local batch system for 10 years, we replaced it with HTCondor (for HTC) and SLURM (for HPC). During that transition, problems arose on both the user and the administrator side. Each new batch system requires users to learn system-specific knowledge, in particular its batch commands, and in some cases users have to use HTCondor and SLURM in parallel. Furthermore, HTCondor and SLURM provide more functionality than PBS, which makes them more complicated to use than the simple PBS commands. On the administrator side, HTCondor gives users more freedom, which poses an additional challenge. Site administrators have to solve many problems: preventing users from requesting resources they are not allowed to use, checking that the required attributes are correct, and deciding where the requested resources are located (the SLURM cluster, the virtual machine cluster, remote sites, etc.). HepJob was designed and developed to meet these requirements. It provides a set of simple user commands, for example hep_sub, hep_q and hep_rm. During submission, HepJob checks that all job attributes are correct, assigns the proper resources to the user (user and group information is obtained from the management database), routes the job to the target site, and performs other steps as required. Users can get started with HepJob very easily, and administrators can carry out the necessary management actions within HepJob. [ABSTRACT FROM AUTHOR]
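
The submission steps named in the abstract (check the attributes, apply user and group limits, route the job) can be illustrated with a minimal Python sketch. This is not HepJob's code: the record gives no implementation details, so every name here (JobRequest, USER_GROUPS, GROUP_POLICY, validate, route_job), the limits, and the rule that parallel jobs go to SLURM while all others go to HTCondor are assumptions made purely for illustration. Two in-memory dictionaries stand in for the management database.

```python
from dataclasses import dataclass

# Hypothetical illustration only: none of these names, limits, or rules
# come from HepJob itself.


@dataclass
class JobRequest:
    user: str
    group: str
    cpus: int = 1
    memory_mb: int = 2000
    parallel: bool = False  # assumed flag marking HPC-style work


# Stand-in for the management database: groups each user belongs to.
USER_GROUPS = {
    "alice": {"physics"},
    "bob": {"physics", "hpc"},
}

# Stand-in for per-group limits and the backends each group may use.
GROUP_POLICY = {
    "physics": {"max_cpus": 8, "max_memory_mb": 16000, "backends": {"htcondor"}},
    "hpc": {"max_cpus": 256, "max_memory_mb": 512000, "backends": {"slurm"}},
}


class SubmissionError(ValueError):
    """Raised when a request fails an attribute or permission check."""


def validate(req: JobRequest) -> dict:
    """Check membership and resource limits; return the group's policy."""
    if req.group not in USER_GROUPS.get(req.user, set()):
        raise SubmissionError(f"{req.user} is not a member of {req.group}")
    policy = GROUP_POLICY[req.group]
    if not 1 <= req.cpus <= policy["max_cpus"]:
        raise SubmissionError(f"cpus={req.cpus} is outside the allowed range")
    if not 1 <= req.memory_mb <= policy["max_memory_mb"]:
        raise SubmissionError(f"memory_mb={req.memory_mb} is outside the allowed range")
    return policy


def route_job(req: JobRequest) -> str:
    """Validate the request, then pick a backend (assumed routing rule)."""
    policy = validate(req)
    backend = "slurm" if req.parallel else "htcondor"
    if backend not in policy["backends"]:
        raise SubmissionError(f"group {req.group} may not use {backend}")
    return backend
```

Under these assumptions, `route_job(JobRequest("alice", "physics"))` selects the HTCondor backend, while a request from a user outside the named group, or one exceeding the group's CPU limit, is rejected before anything reaches a batch system. Performing the checks in the frontend rather than in each batch system is what lets one set of rules cover several backends.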

Details

Language :
English
ISSN :
2101-6275
Volume :
245
Database :
Complementary Index
Journal :
EPJ Web of Conferences
Publication Type :
Conference
Accession number :
148681590
Full Text :
https://doi.org/10.1051/epjconf/202024503026