Back to Search
Start Over
Quality Time: Carbon-Aware Quality Adaptation for Energy-Intensive Services
- Publication Year :
- 2024
-
Abstract
- The energy demand of modern cloud services, particularly those related to generative AI, is increasing at an unprecedented pace. While hyperscalers collectively fail to meet their self-imposed emission reduction targets, they face increasing pressure from environmental sustainability reporting across many jurisdictions. To date, carbon-aware computing strategies have primarily focused on batch process scheduling or geo-distributed load balancing. However, such approaches are not applicable to services that require constant availability at specific locations due to latency, privacy, data, or infrastructure constraints. In this paper, we explore how the carbon footprint of energy-intensive services can be reduced by adjusting the fraction of requests served by different service quality tiers. We show that adapting this quality of responses with respect to grid carbon intensity can lead to additional carbon savings beyond resource and energy efficiency. Building on this, we introduce a forecast-based multi-horizon optimization that reaches close-to-optimal carbon savings and is able to automatically adapt service quality for best-effort users to stay within an annual carbon budget. Our approach can reduce the emissions of large-scale LLM services, which we estimate at multiple 10,000 tons of CO2 annually, by up to 10%.
Details
- Database :
- arXiv
- Publication Type :
- Report
- Accession number :
- edsarx.2411.19058
- Document Type :
- Working Paper