1. GPU Optimization for High-Quality Kinetic Fluid Simulation
- Author
Rui Fan, Yixin Chen, Xiaopei Liu, and Wei Li
- Subjects
FOS: Computer and information sciences; Matching (graph theory); Computer science; FOS: Physical sciences; Kinetic energy; Computational science; Domain (software engineering); Physics::Fluid Dynamics; Computer Science - Graphics; Parametric statistics; Turbulence; Linear system; Fluid Dynamics (physics.flu-dyn); Laminar flow; Physics - Fluid Dynamics; Solver; Computational Physics (physics.comp-ph); Computer Graphics and Computer-Aided Design; Graphics (cs.GR); Computer Science - Distributed, Parallel, and Cluster Computing; Signal Processing; Scalability; Compressibility; Computer Vision and Pattern Recognition; Distributed, Parallel, and Cluster Computing (cs.DC); Physics - Computational Physics; Software
- Abstract
Fluid simulations are often performed using the incompressible Navier-Stokes equations (INSE), leading to sparse linear systems which are difficult to solve efficiently in parallel. Recently, kinetic methods based on the adaptive-central-moment multiple-relaxation-time (ACM-MRT) model have demonstrated impressive capabilities to simulate both laminar and turbulent flows, with quality matching or surpassing that of state-of-the-art INSE solvers. Furthermore, due to its local formulation, this method presents the opportunity for highly scalable implementations on parallel systems such as GPUs. However, an efficient ACM-MRT-based kinetic solver needs to overcome a number of computational challenges, especially when dealing with complex solids inside the fluid domain. In this paper, we present multiple novel GPU optimization techniques to efficiently implement high-quality ACM-MRT-based kinetic fluid simulations in domains containing complex solids. Our techniques include a new communication-efficient data layout, a load-balanced immersed-boundary method, a multi-kernel launch method using a simplified formulation of ACM-MRT calculations to enable greater parallelism, and the integration of these techniques into a parametric cost model to enable automated parameter search to achieve optimal execution performance. We also extended our method to multi-GPU systems to enable large-scale simulations. To demonstrate the state-of-the-art performance and high visual quality of our solver, we present extensive experimental results and comparisons to other solvers.
- Comment
16 pages, 25 figures, accepted by IEEE Transactions on Visualization and Computer Graphics
- Published
2021