Author: "Andrey Gorobets" / Publisher: elsevier bv - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Andrey Gorobets"' showing total 7 results

1. An Efficient Eigenvalue Bounding Method: CFL Condition Revisited

Author: F. Xavier Trias, Xavier Álvarez-Farré, Àdel Alsalti-Baldellou, Andrey Gorobets, and Assensi Oliva
Published: 2023

2. Heterogeneous CPU+GPU parallelization for high-accuracy scale-resolving simulations of compressible turbulent flows on hybrid supercomputers

Author: Pavel Alexeevisch Bakhvalov and Andrey Gorobets
Subjects: Computer science, Computation, Numerical analysis, Parallel algorithm, General Physics and Astronomy, Parallel computing, Computer Science::Performance, Stream processing, Hardware and Architecture, Computer Science::Mathematical Software, Polygon mesh, Enhanced Data Rates for GSM Evolution, General-purpose computing on graphics processing units, Computer Science::Distributed, Parallel, and Cluster Computing, Xeon Phi
Abstract: A heterogeneous parallel algorithm for simulation of compressible turbulent flows and its portable software implementation are presented. The underlying numerical method is based on a family of higher accuracy edge-based reconstruction schemes on unstructured mixed-element meshes. The proposed parallel solution can engage a large number of computing devices of most of the existing computing architectures used in modern supercomputers, including manycore CPUs and GPUs. It is capable of co-execution on both CPUs and accelerators simultaneously. The multilevel parallel algorithm combines: MPI for distributing workload among hybrid cluster nodes and between devices inside nodes; OpenMP for manycore CPUs and other supporting devices, such as Intel Xeon Phi; OpenCL for massively-parallel accelerators, such as GPUs of various vendors, including NVIDIA, AMD, Intel. The main focus is on the adaptation of the numerical method and its computational algorithm to the stream processing parallel paradigm. The very limited device memory inherent in GPU computing is also taken into account. A detailed description of the parallel algorithm is presented, as well as the techniques used for its efficient parallel implementation. Special attention is paid to implicit time integration with its linear solver and calculation of convective fluxes and viscous terms. The use of mixed floating-point precision and overlapping communications and computations is also discussed. Parallel performance is demonstrated in practical applications on different kinds of supercomputers using up to 10 thousand cores and multiple GPUs of comparable overall performance.
Published: 2022

3. A hierarchical parallel implementation for heterogeneous computing. Application to algebra-based CFD simulations on hybrid supercomputers

Author: F. Xavier Trias, Xavier Álvarez-Farré, Andrey Gorobets, Universitat Politècnica de Catalunya. Doctorat en Enginyeria Tèrmica, Universitat Politècnica de Catalunya. Departament de Màquines i Motors Tèrmics, and Universitat Politècnica de Catalunya. CTTC - Centre Tecnològic de la Transferència de Calor
Subjects: General Computer Science, MPI+OpenMP+OpenCL, Computer science, CUDA, Multiprocessing, Symmetric multiprocessor system, Parallel CFD, Computational fluid dynamics, 01 natural sciences, 010305 fluids & plasmas, Software portability, Supercomputadors, SpMV, 0103 physical sciences, Overhead (computing), 0101 mathematics, Hybrid supercomputer, General Engineering, Dot product, Dinàmica de fluids computacional, Supercomputers, Supercomputer, Data structure, 010101 applied mathematics, Algebra, CPU+GPU, Heterogeneous computing, Enginyeria mecànica::Mecànica de fluids [Àrees temàtiques de la UPC]
Abstract: The quest for new portable implementations of simulation algorithms is motivated by the increasing variety of computing architectures. Moreover, the hybridization of high-performance computing systems imposes additional constraints, since heterogeneous computations are needed to efficiently engage processors and massively-parallel accelerators. This, in turn, involves different parallel paradigms and computing frameworks and requires complex data exchanges between computing units. Typically, simulation codes rely on sophisticated data structures and computing subroutines, so-called kernels, which makes portability terribly cumbersome. Thus, a natural way to achieve portability is to dramatically reduce the complexity of both data structures and computing kernels. In our algebra-based approach, the scale-resolving simulation of incompressible turbulent flows on unstructured meshes relies on three fundamental kernels: the sparse matrix-vector product, the linear combination of vectors and the dot product. It is noteworthy that this approach is not limited to a particular kind of numerical method or a set of governing equations. In our code, an auto-balanced multilevel partitioning distributes workload among computing devices of various architectures. The overlap of computations and multistage communications efficiently hides the data exchange overhead in large-scale supercomputer simulations. In addition to computing on accelerators, special attention is paid to efficiency on manycore processors in multiprocessor nodes with significant non-uniform memory access factor. Parallel efficiency and performance are studied in detail for different execution modes on various supercomputers using up to 9,600 processor cores and up to 256 graphics processor units. The heterogeneous implementation model described in this work is a general-purpose approach that is well suited for various subroutines in numerical simulation codes. The work of A. G. has been funded by the Russian Science Foundation, project 19-11-00299. The work of X. Á. F. and F. X. T. has been financially supported by the ANUMESOL project (ENE2017-88697-R) by the Spanish Research Agency, and the FusionCAT project (001-P-001722) by the Government of Catalonia RIS3CAT FEDER. X. Á. F. is supported by a predoctoral contract (2019FI_B2-00076) by the Government of Catalonia. The work has been carried out using the MareNostrum 4 supercomputer of the Barcelona Supercomputing Center; the TSUBAME3.0 supercomputer of the Global Scientific Information and Computing Center at Tokyo Institute of Technology; the Lomonosov-2 supercomputer of the shared research facilities of HPC computing resources at Lomonosov Moscow State University; the K-60 hybrid cluster of the Collective Usage Centre of KIAM RAS. The authors thankfully acknowledge these institutions.
Published: 2021

4. Direct numerical simulation of a fully developed turbulent square duct flow up to Reτ=1200

Author: Andrey Gorobets, F. Xavier Trias, Hao Zhang, Assensi Oliva, and Yuanqiang Tan
Subjects: Fluid Flow and Transfer Processes, Physics, Computer simulation, Meteorology, Turbulence, Mechanical Engineering, Direct numerical simulation, Laminar sublayer, Reynolds number, Mechanics, Condensed Matter Physics, Mean flow, Duct (flow), Large eddy simulation
Abstract: Various fundamental studies based on turbulent duct flow have gained popularity, including heat transfer, magnetohydrodynamics and particle-laden transport. An accurate prediction of the turbulent flow field is critical for such research. However, the database of the mean flow and turbulence statistics is fairly insufficient due to the enormous cost of numerical simulation at high Reynolds number. This paper aims at providing this information by conducting several Direct Numerical Simulations (DNS) of turbulent duct flows at Reτ = 300, 600, 900 and 1200. A quantitative comparison between current and previous DNS results was performed, and good agreement was achieved at Reτ = 300. However, further comparisons of the present results with previous DNS results at Reτ = 600 obtained with much coarser meshes revealed some discrepancies, which can be explained by the insufficient mesh resolution. Finally, the mean flow and turbulence statistics at higher Reτ are presented and the effect of Reτ on the mean flow and flow dynamics is discussed.
Published: 2015

5. An OpenCL-based Parallel CFD Code for Simulations on Hybrid Systems with Massively-parallel Accelerators

Author: F. Xavier Trias, Assensi Oliva, Andrey Gorobets, Universitat Politècnica de Catalunya. Departament de Màquines i Motors Tèrmics, and Universitat Politècnica de Catalunya. CTTC - Centre Tecnològic de la Transferència de Calor
Subjects: Structured mesh, Computer science, GPU, Parallel CFD, Parallel computing, Computational fluid dynamics, structured mesh, Computational science, Algorithmic skeleton, Computer Science::Operating Systems, Massively parallel, Engineering(all), Computer Science::Distributed, Parallel, and Cluster Computing, Finite-volume, Multi-core processor, OpenCL, OpenMP, Dinàmica de fluids computacional, General Medicine, Supercomputer, Computer Science::Performance, Hybrid system, Computer Science::Mathematical Software, MPI, Node (circuits), Distributed memory, Xeon Phi, Enginyeria mecànica::Mecànica de fluids [Àrees temàtiques de la UPC], finite-volume
Abstract: A parallel finite-volume CFD algorithm for modeling of incompressible flows on hybrid supercomputers is presented. It is based on a symmetry-preserving high-order numerical scheme for structured meshes. A multilevel approach that combines different parallel models is used for large-scale simulations on computing systems with massively-parallel accelerators. MPI is used on the first level within the distributed memory model to couple computing nodes of a supercomputer. On the second level OpenMP is used to engage multiple CPU cores of a computing node. The third level exploits the computing potential of massively-parallel accelerators such as GPU (Graphics Processing Units) of AMD and NVIDIA, or Intel Xeon Phi accelerators of the MIC (Many Integrated Core) architecture. The hardware independent OpenCL standard is used to compute on accelerators of different architectures within a general model for a combination of a central processor and a math co-processor.
Published: 2013

6. OpenCL Implementation of Basic Operations for a High-order Finite-volume Polynomial Scheme on Unstructured Hybrid Meshes

Author: S. A. Soukov, Andrey Gorobets, and P. B. Bogdanov
Subjects: Scheme (programming language), Polynomial, Finite volume method, OpenCL, Computer science, GPU, Byte, OpenMP, Memory bandwidth, Parallel CFD, General Medicine, Parallel computing, FLOPS, Computational science, unstructured mesh, Computer Science::Mathematical Software, MPI, Polygon mesh, Implementation, Engineering(all), finite-volume
Abstract: A parallel finite-volume algorithm based on a cell-centered high-order polynomial scheme for unstructured hybrid meshes is under consideration. The work is focused on the adaptation and optimization of basic operations of the algorithm to different architectures of massively-parallel accelerators including GPU of AMD and NVIDIA. Such an algorithm is especially problematic for the GPU architectures since it has very low FLOP per byte ratio meaning that performance is dominated by the memory bandwidth but not the computing performance of a device. At the same time it has irregular memory access pattern since unstructured meshes are used. The calculation of polynomial coefficients and the calculation of convective fluxes through faces of cells are the most interesting and time consuming operations of the algorithm. Implementations of these operations for accelerators using OpenCL are considered here in detail. The ways to improve the computational efficiency are proposed, performance measurement results reaching up to 160 GFLOPS on a single GPU device are demonstrated.
Published: 2013

7. Direct Numerical Simulation of Incompressible Flows on Unstructured Meshes Using Hybrid CPU/GPU Supercomputers

Author: Assensi Oliva, Guillermo Oyarzun, Oriol Lehmkuhl, Andrey Gorobets, and R. Borrell
Subjects: Computer science, CPU/GPU hybrid supercomputers, Computation, Direct numerical simulation, General Medicine, Parallel computing, Software_PROGRAMMINGTECHNIQUES, Computational fluid dynamics, Computational science, CUDA, Scalability, Code (cryptography), MPI, Polygon mesh, Navier–Stokes equations, Engineering(all), ComputingMethodologies_COMPUTERGRAPHICS
Abstract: This paper describes a hybrid MPI-CUDA parallelization strategy for the direct numerical simulation of incompressible flows using unstructured meshes. Our in-house MPI-based unstructured CFD code has been extended in order to increase its performance by means of GPU co-processors. Therefore, the main goal of this work is to take advantage of the current hybrid supercomputers to increase our computing capabilities. CUDA is used to perform the calculations on the GPU devices and MPI to handle the communications between them. The main drawback for the performance is the slowdown produced by the MPI communication episodes. Consequently, overlapping strategies, to hide MPI communication costs under GPU computations, are studied in detail with the aim to achieve scalability when executing the code on multiple nodes.
Published: 2013
