A Framework for Memory Oversubscription Management in Graphics Processing Units
- Source :
- ASPLOS
- Publication Year :
- 2019
- Publisher :
- ACM, 2019.
- Abstract :
- Modern discrete GPUs support unified memory and demand paging. Automatic management of data movement between CPU memory and GPU memory dramatically reduces developer effort. However, when application working sets exceed physical memory capacity, the resulting data movement can cause great performance loss. This paper proposes a memory management framework, called ETC, that transparently improves GPU performance under memory oversubscription using new techniques to overlap eviction latency of GPU pages, reduce thrashing cost, and increase effective memory capacity. Eviction latency can be hidden by eagerly creating space for demand-paged data with proactive eviction (E). Thrashing costs can be ameliorated with memory-aware throttling (T), which dynamically reduces the GPU parallelism when page fault frequencies become high. Capacity compression (C) can enable larger working sets without increasing physical memory capacity. No single technique fits all workloads, and, thus, ETC integrates proactive eviction, memory-aware throttling and capacity compression into a principled framework that dynamically selects the most effective combination of techniques, transparently to the running software. To this end, ETC categorizes applications into three categories: regular applications without data sharing across kernels, regular applications with data sharing across kernels, and irregular applications. Our evaluation shows that ETC fully mitigates the oversubscription overhead for regular applications without data sharing and delivers performance similar to the ideal unlimited GPU memory baseline. We also show that ETC outperforms the state-of-the-art baseline by 60.4% and 270% for regular applications with data sharing and irregular applications, respectively.
- Subjects :
- 010302 applied physics
Hardware_MEMORYSTRUCTURES
Page fault
Computer science
business.industry
Distributed computing
Thrashing
02 engineering and technology
Bandwidth throttling
01 natural sciences
020202 computer hardware & architecture
Data sharing
Software
Demand paging
0103 physical sciences
0202 electrical engineering, electronic engineering, information engineering
Latency (engineering)
Graphics
business
Details
- Database :
- OpenAIRE
- Journal :
- Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems
- Accession number :
- edsair.doi...........85a83b2295f7c9b56b26aa791f925a7b