1. CosmoFlow: Using Deep Learning to Learn the Universe at Scale
- Author
- Mathuriya, Amrita; Bard, Deborah; Mendygral, Peter; Meadows, Lawrence; Arnemann, James; Shao, Lei; He, Siyu; Kärnä, Tuomas; Moise, Diana; Pennycook, Simon J; Maschhoff, Kristyn; Sewall, Jason; Kumar, Nalini; Ho, Shirley; Ringenburg, Michael F; Prabhat; Lee, Victor
- Subjects
Information and Computing Sciences, Physical Sciences, Machine Learning, Networking and Information Technology R&D (NITRD), Data Science, Machine Learning and Artificial Intelligence, Cosmology, Deep Learning, TensorFlow, High Performance Computing, astro-ph.CO, astro-ph.IM, cs.LG, physics.comp-ph, Deep learning, machine learning, tensorflow, high performance computing
- Abstract
Deep learning is a promising tool to determine the physical model that describes our universe. To handle the considerable computational cost of this problem, we present CosmoFlow: a highly scalable deep learning application built on top of the TensorFlow framework. CosmoFlow uses efficient implementations of 3D convolution and pooling primitives, together with improvements in threading for many element-wise operations, to improve training performance on Intel® Xeon Phi™ processors. We also utilize the Cray PE Machine Learning Plugin for efficient scaling to multiple nodes. We demonstrate fully synchronous data-parallel training on 8192 nodes of Cori with 77% parallel efficiency, achieving 3.5 Pflop/s sustained performance. To our knowledge, this is the first large-scale science application of the TensorFlow framework at supercomputer scale with fully-synchronous training. These enhancements enable us to process large 3D dark matter distribution and predict the cosmological parameters $\Omega_M$, $\sigma_8$ and $n_s$ with unprecedented accuracy.
- Published
- 2018