Chronicles of Astra: Challenges and Lessons from the First Petascale Arm Supercomputer
- Source :
- SC
- Publication Year :
- 2020
- Publisher :
- IEEE, 2020.
Abstract
- Arm processors have been explored in HPC for several years; however, there has not yet been a demonstration of viability for supporting large-scale production workloads. In this paper, we offer a retrospective on the process of bringing up Astra, the first Petascale supercomputer based on 64-bit Arm processors, and validating its ability to run production HPC applications. Through this process, several gaps in immature technologies were addressed, including software stack enablement, Linux bugs at scale, thermal management issues, power management capabilities, and advanced container support. From this experience, we formulate several lessons learned that contributed to the successful deployment of Astra. These insights can help accelerate the deployment and maturation of other first-seen HPC technologies. With Astra now supporting many users running a diverse set of production applications at multi-thousand-node scales, we believe this constitutes strong supporting evidence that Arm is a viable technology for even the largest-scale supercomputer deployments.
- Subjects :
- Power management
Process (engineering)
Computer science
business.industry
Node (networking)
05 social sciences
050301 education
02 engineering and technology
Supercomputer
ASTRA
ARM architecture
Petascale computing
Software deployment
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
Software engineering
business
0503 education
Details
- Database :
- OpenAIRE
- Journal :
- SC20: International Conference for High Performance Computing, Networking, Storage and Analysis
- Accession number :
- edsair.doi...........42ac6d3f1d74b7ce6e28537aaccb0138