HPCGCN: A Predictive Framework on High Performance Computing Cluster Log Data Using Graph Convolutional Networks.
- Source :
- Proceedings : ... IEEE International Conference on Big Data. IEEE International Conference on Big Data [Proc IEEE Int Conf Big Data] 2021 Dec; Vol. 2021, pp. 4113-4118. Date of Electronic Publication: 2022 Jan 13.
- Publication Year :
- 2021
Abstract
- This paper presents a novel use case of Graph Convolutional Network (GCN) representation learning for predictive data mining, specifically from user/task data in the domain of high-performance computing (HPC). It outlines an approach based on a coalesced data set: logs from the Slurm workload manager, joined with user experience survey data from computational cluster users. We introduce a new method of constructing a heterogeneous unweighted HPC graph consisting of multiple typed nodes after revealing the manifold relations between the nodes. The GCN structure used here supports two tasks: i) determining whether a job will complete or fail and ii) predicting memory and CPU requirements, by training a GCN semi-supervised classification model and regression models on the generated graph. The graph is partitioned using graph clustering. We conducted classification and regression experiments using the proposed framework on our HPC log dataset and evaluated predictions by our trained models against baselines using test_score, F1-score, precision, and recall for classification, and R² score for regression, showing that our framework achieves significant improvements.
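For readers unfamiliar with GCNs, the semi-supervised node classification mentioned in the abstract can be illustrated with a minimal sketch of the standard two-layer GCN forward pass (the renormalized propagation rule of Kipf and Welling). This is an assumption about the general technique, not the authors' implementation: the toy graph, the feature sizes, the random weights, and the function names `normalize_adjacency` and `gcn_forward` are all hypothetical, and no training loop or graph partitioning is shown.

```python
import numpy as np

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for an unweighted adjacency matrix A."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops
    deg = a_hat.sum(axis=1)              # degree of each node (>= 1 thanks to self-loops)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_forward(adj, features, w0, w1):
    """Two-layer GCN: softmax(A_norm @ relu(A_norm @ X @ W0) @ W1)."""
    a_norm = normalize_adjacency(adj)
    hidden = np.maximum(a_norm @ features @ w0, 0.0)      # first layer + ReLU
    logits = a_norm @ hidden @ w1                         # second layer
    logits = logits - logits.max(axis=1, keepdims=True)   # numerically stable softmax
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

# Hypothetical toy graph: 4 nodes (e.g. two users and two jobs), unweighted edges.
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 0],
                [0, 1, 0, 0]], dtype=float)

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3))   # 3 made-up input features per node
w0 = rng.normal(size=(3, 5))         # untrained weights, for shape illustration only
w1 = rng.normal(size=(5, 2))         # 2 output classes, e.g. job completes / job fails

probs = gcn_forward(adj, features, w0, w1)
print(probs.shape)  # one class distribution per node
```

In a semi-supervised setting, the cross-entropy loss would be computed only on the labeled nodes while the propagation step still uses the whole graph; a regression variant would replace the softmax with a linear output.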
Details
- Language :
- English
- Volume :
- 2021
- Database :
- MEDLINE
- Journal :
- Proceedings : ... IEEE International Conference on Big Data. IEEE International Conference on Big Data
- Publication Type :
- Academic Journal
- Accession number :
- 36745144
- Full Text :
- https://doi.org/10.1109/bigdata52589.2021.9671370