Multi-level anomaly prediction in Tier-0 datacenter

Authors :
Ardebili, Mohsen Seyedkazemi
Bartolini, Andrea
Benini, Luca
Source :
CF '22: Proceedings of the 19th ACM International Conference on Computing Frontiers
Publication Year :
2022
Publisher :
ACM, 2022.

Abstract

Modern scientific discoveries are driven by an insatiable demand for computational resources. To solve large problems in science, engineering, and business, data centers provide High-Performance Computing (HPC) systems that aggregate the computing capacity of thousands of computing nodes. Anomaly prediction is critical to preserving the continuity of service of HPC systems and preventing hardware deterioration. In the datacenter, a thermal anomaly occurs when the balance between cooling capacity and computational demand is disturbed. Such an anomaly is identifiable from a suspicious or abnormal pattern in the monitoring signals. This poster investigates the anomaly prediction task in HPC systems by defining complex statistical rule-based and Deep Learning (DL)-based anomaly detection methods, then using these anomaly detection methods in an anomaly prediction framework.

Details

Database :
OpenAIRE
Journal :
Proceedings of the 19th ACM International Conference on Computing Frontiers
Accession number :
edsair.doi.dedup.....ac95ad8bea2ca0b95b04395c9d659dd6
Full Text :
https://doi.org/10.1145/3528416.3530864