A Framework for Memory Oversubscription Management in Graphics Processing Units

Authors :
Onur Mutlu
Youtao Zhang
Christopher J. Rossbach
Rachata Ausavarungnirun
Yang Guo
Jun Yang
Chen Li
Source :
ASPLOS
Publication Year :
2019
Publisher :
ACM, 2019.

Abstract

Modern discrete GPUs support unified memory and demand paging. Automatic management of data movement between CPU memory and GPU memory dramatically reduces developer effort. However, when application working sets exceed physical memory capacity, the resulting data movement can cause great performance loss. This paper proposes a memory management framework, called ETC, that transparently improves GPU performance under memory oversubscription using new techniques to overlap eviction latency of GPU pages, reduce thrashing cost, and increase effective memory capacity. Eviction latency can be hidden by eagerly creating space for demand-paged data with proactive eviction (E). Thrashing costs can be ameliorated with memory-aware throttling (T), which dynamically reduces GPU parallelism when page fault frequencies become high. Capacity compression (C) can enable larger working sets without increasing physical memory capacity. No single technique fits all workloads, and, thus, ETC integrates proactive eviction, memory-aware throttling and capacity compression into a principled framework that dynamically selects the most effective combination of techniques, transparently to the running software. To this end, ETC categorizes applications into three categories: regular applications without data sharing across kernels, regular applications with data sharing across kernels, and irregular applications. Our evaluation shows that ETC fully mitigates the oversubscription overhead for regular applications without data sharing and delivers performance similar to the ideal unlimited GPU memory baseline. We also show that ETC outperforms the state-of-the-art baseline by 60.4% and 270% for regular applications with data sharing and irregular applications, respectively.
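
The idea behind memory-aware throttling (T), reducing GPU parallelism when page fault frequency becomes high, can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the mechanism described in the paper: the class name `ThrottleController`, the per-epoch fault counter, both thresholds, and the halve-then-step-up policy are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ThrottleController:
    """Toy model of memory-aware throttling (illustrative only).

    All fields are hypothetical; the abstract does not specify how
    parallelism is measured or which thresholds are used.
    """
    max_active_units: int      # assumed: number of compute units available
    high_fault_rate: int       # assumed: faults per epoch that trigger throttling
    low_fault_rate: int        # assumed: faults per epoch below which we recover
    active_units: int = field(init=False)

    def __post_init__(self) -> None:
        # Start with full parallelism.
        self.active_units = self.max_active_units

    def update(self, faults_this_epoch: int) -> int:
        """Adjust the number of active units from one epoch's fault count."""
        if faults_this_epoch > self.high_fault_rate and self.active_units > 1:
            # Thrashing suspected: halve parallelism, but keep one unit running.
            self.active_units = max(1, self.active_units // 2)
        elif faults_this_epoch < self.low_fault_rate:
            # Fault pressure is low: restore parallelism one unit at a time.
            self.active_units = min(self.max_active_units, self.active_units + 1)
        return self.active_units


controller = ThrottleController(max_active_units=8, high_fault_rate=100, low_fault_rate=10)
history = [controller.update(f) for f in (500, 500, 50, 5)]
```

In this toy model, fault counts between the two thresholds leave parallelism unchanged, which avoids oscillating between throttled and unthrottled states on every epoch.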

Details

Database :
OpenAIRE
Journal :
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems
Accession number :
edsair.doi...........85a83b2295f7c9b56b26aa791f925a7b