Greening AI: A framework for energy-aware resource allocation of ML training jobs with performance guarantees

Authors :
Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
Barcelona Supercomputing Center
Sala, Roberto
Filippini, Federica
Ardagna, Danilo
Lezzi, Daniele
Lordan Gomis, Francesc-Josep
Thiem, Patrick
Publication Year :
2024

Abstract

The rapid expansion of Machine Learning (ML) and Artificial Intelligence (AI) has profoundly influenced the technological landscape, reshaping various industries and applications. This surge in computational demands has led to the widespread adoption of Cloud data centers, crucial for supporting the storage and processing requirements of these advanced technologies. However, this expansion poses significant challenges, particularly in terms of energy consumption and associated carbon emissions. As the reliance on cloud data centers intensifies, there is a growing concern about the environmental impact, necessitating innovative solutions to enhance energy efficiency and reduce the ecological footprint of these computational infrastructures. This paper focuses on addressing the challenges linked to training ML and AI applications, emphasizing the importance of energy-efficient solutions. The proposed framework integrates components from the AI-SPRINT project toolchain, such as Krake, Space4AI-R, and PyCOMPSs. Our reference application involves training a Random Forest model for electrocardiogram classification, profiling available resources to obtain a performance model able to predict the training time, and dynamically migrating the workload to sites with cleaner energy sources providing guarantees on the training process due date. Results demonstrate the framework’s capacity to estimate execution time and resource requirements with low error, highlighting its potential for establishing an environmentally sustainable AI ecosystem.

This work has been funded by the European Commission under the H2020 grant N. 101016577 AI-SPRINT: AI in Secure Privacy pReserving computINg conTinuum.
Peer Reviewed
Postprint (author's final draft)

Details

Database :
OAIster
Notes :
12 p., application/pdf, English
Publication Type :
Electronic Resource
Accession number :
edsoai.on1452496597
Document Type :
Electronic Resource