1. Extension of the INFN Tier-1 on a HPC system
- Author
Diego Ciangottini, L. dell'Agnello, Tommaso Boccali, S Zani, Stefano Dal Pra, Andrea Valassi, Daniele Spiga, Daniele Bonacorsi, Alessandra Doria, Alessandro De Salvo, Francesco Noferini, Federico Stagni, and C. Bozzi
- Subjects
Cloud resources, QC1-999, Other Fields of Physics, FOS: Physical sciences, computer.software_genre, 01 natural sciences, 7. Clean energy, High Energy Physics - Experiment, High Energy Physics - Experiment (hep-ex), 0103 physical sciences, 010306 general physics, computer.programming_language, Large Hadron Collider, hep-ex, 010308 nuclear & particles physics, Physics, Extension (predicate logic), Computational Physics (physics.comp-ph), Software distribution, Grid, Partition (database), Tier 1 network, physics.comp-ph, Operating system, Alice (programming language), computer, Physics - Computational Physics, Particle Physics - Experiment
- Abstract
The INFN Tier-1, located at CNAF in Bologna (Italy), is a center of the WLCG e-Infrastructure, supporting the four major LHC collaborations and more than 30 other INFN-related experiments. After multiple tests of elastic expansion of CNAF compute power via Cloud resources (provided by Azure, Aruba, and in the framework of the HNSciCloud project), and building on the experience gained with the production-quality extension of the Tier-1 farm on remote owned sites, the CNAF team, in collaboration with experts from the ALICE, ATLAS, CMS, and LHCb experiments, has been working to put into production an integrated HTC+HPC system with the PRACE CINECA center, located near Bologna. This extension will be implemented on the Marconi A2 partition, equipped with Intel Knights Landing (KNL) processors. A number of technical challenges were faced and solved in order to run successfully on low-RAM nodes and to overcome the closed environment (network, access, software distribution, ...) that HPC systems impose compared with standard GRID sites. We show preliminary results from a large-scale integration effort, using resources secured via the successful PRACE grant N. 2018194658, for 30 million KNL core hours.
13 pages
- Published
2020