1. CoreVA-MPSoC: A many-core architecture with tightly coupled shared and local data memories
- Author
Julian Daberkow, Wayne Kelly, Thorsten Jungeblut, Ulrich Rückert, Gregor Sievers, Marten Vohrmann, Martin Flasskamp, Johannes Ax, and Mario Porrmann
- Subjects
Computer science; Memory hierarchy; 02 engineering and technology; MPSoC; Parallel Architectures; 080302 Computer System Architecture; Memory architecture; 0202 electrical engineering, electronic engineering, information engineering; Latency (engineering); Hardware_MEMORYSTRUCTURES; 090604 Microelectronics and Integrated Circuits; 020208 electrical & electronic engineering; On-chip interconnection networks; 020202 computer hardware & architecture; Multicore/single-chip multiprocessors; 080503 Networking and Communications; Memory management; Computational Theory and Mathematics; Shared memory; Computer architecture; Hardware and Architecture; Very long instruction word; Signal Processing; Benchmark (computing)
- Abstract
MPSoCs with hierarchical communication infrastructures are promising architectures for low-power embedded systems. Multiple CPU clusters are coupled using a Network-on-Chip (NoC). Our CoreVA-MPSoC targets streaming applications in embedded systems, such as signal and video processing. In this work, we introduce a tightly coupled shared data memory to each CPU cluster, which can be accessed by all CPUs of a cluster and by the NoC with low latency. The main focus is the comparison of different memory architectures and their connection to the NoC. We analyze memory architectures with local data memory only, shared data memory only, and a hybrid architecture integrating both. Implementation results are presented for a 28 nm FD-SOI standard cell technology. A CPU cluster with shared memory has area requirements similar to those of the local memory architecture. We use post-place-and-route simulations for precise analysis of energy consumption at both the cluster and the NoC level for the different memory architectures. An architecture with shared data memory shows the best performance results in combination with high resource efficiency. On average, across a benchmark suite of 10 applications, the use of shared memory yields 17.2 percent higher throughput than the use of local memory only.
- Published
- 2018