1. A Data-Centric Approach to Extreme-Scale Ab initio Dissipative Quantum Transport Simulations
- Author
Torsten Hoefler, Timo Schneider, Guillermo Indalecio Fernández, Mathieu Luisier, Alexandros Nikolaos Ziogas, and Tal Ben-Nun
- Subjects
FOS: Computer and information sciences, 020203 distributed computing, Dataflow, Computer science, Ab initio, Double-precision floating-point format, 02 engineering and technology, Solver, 01 natural sciences, 7. Clean energy, Database-centric architecture, Computational science, Computational Engineering, Finance, and Science (cs.CE), Computer Science - Distributed, Parallel, and Cluster Computing, 0103 physical sciences, 0202 electrical engineering, electronic engineering, information engineering, Code (cryptography), Dissipative system, Distributed, Parallel, and Cluster Computing (cs.DC), Computer Science - Computational Engineering, Finance, and Science, 010306 general physics, Order of magnitude, Massively parallel and high-performance simulations, Parallel computing methodologies, Quantum mechanic simulation
- Abstract
The computational efficiency of a state-of-the-art ab initio quantum transport (QT) solver, capable of revealing the coupled electro-thermal properties of atomically resolved nano-transistors, has been improved by up to two orders of magnitude through a data-centric reorganization of the application. The approach yields coarse- and fine-grained data-movement characteristics that can be used for performance and communication modeling, communication avoidance, and dataflow transformations. The resulting code has been tuned for two top-6 hybrid supercomputers, reaching a sustained performance of 85.45 Pflop/s on 4,560 nodes of Summit (42.55% of the peak) in double precision, and 90.89 Pflop/s in mixed precision. These computational achievements enable the restructured QT simulator to treat realistic nanoelectronic devices made of more than 10,000 atoms in a 14$\times$ shorter time than the original code needs to handle a system with 1,000 atoms, on the same number of CPUs/GPUs and with the same physical accuracy.
Comment: 13 pages, 13 figures, SC19
- Published
- 2019