Memory latency optimizations for the elementary functions on the Sunway architecture
- Source :
- The Journal of Supercomputing. 75:3917-3944
- Publication Year :
- 2019
- Publisher :
- Springer Science and Business Media LLC, 2019.
Abstract
- As fundamental software of high-performance computers, elementary functions have a significant impact on the performance of high-level applications. Benefiting from a Chinese-designed manycore system consisting of processing cores and auxiliary cores, the Sunway TaihuLight supercomputer is considered one of the fastest supercomputers in the world, having ranked at the top of the TOP500 supercomputer list several times. The processing cores of the Sunway architecture are coupled through shared memory, which leads to high memory access latency and degrades the performance of the elementary functions, where a variety of memory accesses exist. To address this issue, we propose a set of memory latency optimizations for the Sunway processing cores. First, we obtain a reduced data table while guaranteeing accuracy by optimizing the underlying algorithms, grouping and mapping, removing error compensations, etc. Second, we move data from the global memory shared by all processing cores to the scratchpad memory of individual processing cores, significantly reducing the memory latency. Finally, we convert the memory accesses that cannot be localized due to the limited space of the scratchpad memory into equivalent immediate loads and/or shift operations, further improving the performance. In addition, we automate the algorithm by carefully selecting the most suitable data conversion approach and table-lookup algorithm, effectively mitigating the code explosion issue. We implement our method and evaluate the effectiveness of the optimizations through experiments on the Sunway architecture. The experimental results show that exponential functions achieve performance improvements of 91% and 86.2% from the data movement and data conversion strategies, respectively.
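The table-lookup idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the table size `N`, the names `TABLE` and `exp_table`, and the degree-5 polynomial are all assumptions chosen for illustration. It shows only the general trade-off that a small table (cheap to keep in fast local memory) can be compensated by a slightly longer polynomial. Scratchpad placement and the conversion of table loads into immediates are hardware-specific and are not shown.

```python
import math

N = 8  # hypothetical table size; a smaller table needs less fast local memory
TABLE = [2.0 ** (j / N) for j in range(N)]  # 2^(j/N) for j = 0..N-1
LN2_OVER_N = math.log(2.0) / N


def exp_table(x: float) -> float:
    """Approximate e**x via range reduction plus a small lookup table."""
    # Write x = m * (ln2 / N) + r with |r| <= ln2 / (2N).
    m = round(x / LN2_OVER_N)
    r = x - m * LN2_OVER_N
    # Split m = k * N + j with 0 <= j < N (floor division handles negative m).
    k, j = divmod(m, N)
    # Degree-5 Taylor polynomial for e**r on the reduced interval (Horner form).
    p = 1.0 + r * (1.0 + r * (0.5 + r * (1 / 6 + r * (1 / 24 + r / 120))))
    # e**x = 2**k * 2**(j/N) * e**r
    return math.ldexp(TABLE[j] * p, k)
```

With `N = 8` the reduced argument stays below about 0.044 in magnitude, so the truncated polynomial contributes a relative error on the order of 1e-11; a production routine would use a minimax polynomial and more careful range reduction.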
- Subjects :
- 020203 distributed computing
TOP500
Computer science
02 engineering and technology
computer.file_format
Parallel computing
Supercomputer
CAS latency
Theoretical Computer Science
Data conversion
Shared memory
Hardware and Architecture
0202 electrical engineering, electronic engineering, information engineering
Latency (engineering)
computer
Software
Information Systems
Scratchpad memory
Sunway TaihuLight
Details
- ISSN :
- 1573-0484 and 0920-8542
- Volume :
- 75
- Database :
- OpenAIRE
- Journal :
- The Journal of Supercomputing
- Accession number :
- edsair.doi...........ebc0b39cd7449656558ce2a52cb60177