Author: "Christophe Calvin" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Christophe Calvin"' showing total 43 results

1. Portable Monte Carlo Transport Performance Evaluation in the PATMOS Prototype.

Author: Tao Chang, Emeric Brun, and Christophe Calvin
Published: 2019
Full Text: View/download PDF

2. Efficient Cross Section Reconstruction on Modern Multi and Many Core Architectures.

Author: Yunsong Wang, François-Xavier Hugot, Emeric Brun, Fausto Malvagi, and Christophe Calvin
Published: 2017
Full Text: View/download PDF

3. Competing Energy Lookup Algorithms in Monte Carlo Neutron Transport Calculations and their Optimization on CPU and Intel MIC Architectures.

Author: Yunsong Wang, Emeric Brun, Fausto Malvagi, and Christophe Calvin
Published: 2016
Full Text: View/download PDF

4. Competing energy lookup algorithms in Monte Carlo neutron transport calculations and their optimization on CPU and Intel MIC architectures.

Author: Yunsong Wang, Emeric Brun, Fausto Malvagi, and Christophe Calvin
Published: 2017
Full Text: View/download PDF

5. An Efficient Task-Based Execution Model for Stochastic Linear Solver on Multi-core and Many-Core Systems.

Author: Fan Ye, Christophe Calvin, and Serge G. Petiton
Published: 2015
Full Text: View/download PDF

6. Toward Restarting Strategies Tuning for a Krylov Eigenvalue Solver.

Author: France Boillod-Cerneux, Serge G. Petiton, Christophe Calvin, and Leroy Anthony Drummond
Published: 2014
Full Text: View/download PDF

7. A Study of SpMV Implementation Using MPI and OpenMP on Intel Many-Core Architecture.

Author: Fan Ye, Christophe Calvin, and Serge G. Petiton
Published: 2014
Full Text: View/download PDF

8. The Exploration of Pervasive and Fine-Grained Parallel Model Applied on Intel Xeon Phi Coprocessor.

Author: Christophe Calvin, Fan Ye, and Serge G. Petiton
Published: 2013
Full Text: View/download PDF

9. Performance and Numerical Accuracy Evaluation of Heterogeneous Multicore Systems for Krylov Orthogonal Basis Computation.

Author: Jérôme Dubois, Christophe Calvin, and Serge G. Petiton
Published: 2010
Full Text: View/download PDF

10. Improving Scalability Using Hybrid Asynchronous Methods For Non-Hermitian Eigenproblems.

Author: Jérôme Dubois, Christophe Calvin, and Serge G. Petiton
Published: 2011
Full Text: View/download PDF

11. Accelerating the Explicitly Restarted Arnoldi Method with GPUs Using an Autotuned Matrix Vector Product.

Author: Jérôme Dubois, Christophe Calvin, and Serge G. Petiton
Published: 2011
Full Text: View/download PDF

12. The TRIO-Unitaire Project: A Parallel CFD 3-Dimensional Code.

Author: Christophe Calvin and Ph. Emonot
Published: 1997
Full Text: View/download PDF

13. Overlapping techniques of communications.

Author: Christophe Calvin, Laurent Colombet, and Philippe Michallon
Published: 1995
Full Text: View/download PDF

14. All-to-all broadcast in torus with wormhole-like routing.

Author: Christophe Calvin, Stéphane Pérennes, and Denis Trystram
Published: 1995
Full Text: View/download PDF

15. Towards Mixed Computation/Communication in Parallel Scientific Libraries.

Author: Christophe Calvin, Laurent Colombet, Frederic Desprez, B. Jargot, Philippe Michallon, Bernard Tourancheau, and Denis Trystram
Published: 1994
Full Text: View/download PDF

16. Minimizing Communication Overhead Using Pipelining for Multi-Dimensional FFT on Distributed Memory Machines.

Author: Christophe Calvin and Frederic Desprez
Published: 1993

17. Methods to Overlap Communications in Parallel Numerical Algorithms.

Author: Christophe Calvin, Laurent Colombet, and Philippe Michallon
Published: 1997
Full Text: View/download PDF

18. Matrix Transpose for Block Allocations on Torus and de Bruijn Networks.

Author: Christophe Calvin and Denis Trystram
Published: 1996
Full Text: View/download PDF

19. Implementation of Parallel FFT Algorithms on Distributed Memory Machines with a Minimum Overhead of Communication.

Author: Christophe Calvin
Published: 1996
Full Text: View/download PDF

20. Performance Evaluation and Modeling of Collective Communications on Cray T3D.

Author: Christophe Calvin and Laurent Colombet
Published: 1996
Full Text: View/download PDF

21. HPC and Data: When Two Becomes One

Author: Christophe Calvin and France Boillod-Cerneux
Subjects: Open science, Open data, Computer simulation, business.industry, Computer science, Science and engineering, Big data, Supercomputer, business, Data science
Abstract: As claimed for many years, High Performance Computing (HPC) and high performance numerical simulation are necessary tools for fundamental science and engineering. Big data and artificial intelligence are some newcomers in the landscape, but not that new, especially in science. Finally, open data and open science are becoming now mandatory for trustable and reproducible science.
Published: 2021
Full Text: View/download PDF

22. Turbulence and Interactions : Proceedings of the TI 2018 Conference, June 25-29, 2018, Les Trois-Îlets, Martinique, France

Author: Michel Deville, Christophe Calvin, Vincent Couaillier, Marta De La Llave Plata, Jean-Luc Estivalèzes, Thiên Hiêp Lê, and Stéphane Vincent
Subjects: Fluid mechanics, Continuum mechanics, Dynamics, Nonlinear theories, Thermodynamics, Heat engineering, Heat transfer, Mass transfer
Abstract: This book presents a snapshot of the state of the art in the field of turbulence modeling, with an emphasis on numerical methods. Topics include direct numerical simulations, large eddy simulations, compressible turbulence, coherent structures, two-phase flow simulation and many more. It includes both theoretical contributions and experimental works, as well as chapters derived from keynote lectures, presented at the fifth Turbulence and Interactions Conference (TI 2018), which was held on June 25-29 in Martinique, France. This multifaceted collection reflects the conference's emphasis on the interplay of theory, experiments and computing in understanding and predicting the physics of complex flows and solving related engineering problems. It offers a timely guide for students, researchers and professionals in the field of applied computational fluid dynamics, turbulence modeling and related areas.
Published: 2021

23. Portable Monte Carlo Transport Performance Evaluation in the PATMOS Prototype

Author: Emeric Brun, Christophe Calvin, and Tao Chang
Affiliation: CEA-Direction des Energies (ex-Direction de l'Energie Nucléaire) (CEA-DES (ex-DEN)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA), Service des Réacteurs et de Mathématiques Appliquées (SERMA), Département de Modélisation des Systèmes et Structures (DM2S), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay-CEA-Direction des Energies (ex-Direction de l'Energie Nucléaire) (CEA-DES (ex-DEN)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay, Direction de Recherche Fondamentale (CEA) (DRF (CEA)), Maison de la Simulation (MDLS), Université de Versailles Saint-Quentin-en-Yvelines (UVSQ)-Institut National de Recherche en Informatique et en Automatique (Inria)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS), and Service d’Études des Réacteurs et de Mathématiques Appliquées (SERMA)
Subjects: Neutron transport, [PHYS.NUCL]Physics [physics]/Nuclear Theory [nucl-th], Computer science, pseudo event-based method, 020209 energy, Monte Carlo method, history-based method, OpenMP offload, CUDA, 02 engineering and technology, Parallel computing, Thread (computing), ComputerSystemsOrganization_PROCESSORARCHITECTURES, Monte Carlo transport, [PHYS.NEXP]Physics [physics]/Nuclear Experiment [nucl-ex], 01 natural sciences, 010305 fluids & plasmas, OpenACC, 0103 physical sciences, 0202 electrical engineering, electronic engineering, information engineering, Programming paradigm, OpenMP thread
Abstract: A heterogeneous offload version of Monte Carlo neutron transport has been developed in the framework of the PATMOS prototype via several programming models (OpenMP thread, OpenMP offload, OpenACC and CUDA). Two algorithms are implemented: a history-based method and a pseudo event-based method. A performance evaluation has been carried out with a representative benchmark, slabAllNuclides. Numerical results illustrate the promising gain in performance for our heterogeneous offload MC code. These results demonstrate that the pseudo event-based approach significantly outperforms the history-based approach. Furthermore, with the pseudo event-based method, the OpenACC version is competitive, achieving at least 71% of the performance of the CUDA version, whereas the OpenMP offload version performs poorly for both approaches.
Published: 2019

24. The response matrix acceleration: A new non-linear method for the 3D discrete-ordinate transport equation

Author: François Févotte, Emiliano Masiello, Bruno Lathuilière, Wesley Ford, and Christophe Calvin
Affiliation: CEA-Saclay (CEA), Commissariat à l'énergie atomique et aux énergies alternatives (CEA), Service d’Études des Réacteurs et de Mathématiques Appliquées (SERMA), Département de Modélisation des Systèmes et Structures (DM2S), CEA-Direction des Energies (ex-Direction de l'Energie Nucléaire) (CEA-DES (ex-DEN)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay-CEA-Direction des Energies (ex-Direction de l'Energie Nucléaire) (CEA-DES (ex-DEN)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay, and EDF (EDF)
Funding: This work has been jointly funded by Commissariat à l’Energie Atomique (CEA) and Electricité de France (EDF).
Subjects: CMFD, [PHYS]Physics [physics], Computer science, Spectral radius, non-linear acceleration, 020209 energy, Finite difference method, 02 engineering and technology, Solver, stability analysis, 7. Clean energy, 01 natural sciences, 010305 fluids & plasmas, Nonlinear system, Matrix (mathematics), Operator (computer programming), Nuclear Energy and Engineering, 0103 physical sciences, Convergence (routing), 0202 electrical engineering, electronic engineering, information engineering, Applied mathematics, Spectral radius analysis, Convection–diffusion equation, Discrete-ordinates transport equation
Abstract: In this paper, we propose a new non-linear technique for accelerating the solution of the discrete ordinates transport equation. The new method, called Response Matrix Acceleration (RMA), has been designed to complement the Coarse-Mesh Finite Difference method (CMFD) by offering better stability and improved performance in cases where CMFD fails. To accomplish this, RMA uses knowledge of the transport operator along with nonlinear coefficients and solves for the interface partial currents to maintain consistency with the transport operator. Two distinct variants of RMA are derived. The convergence properties of both variants of RMA applied to the source iteration schemes are investigated for the one-group transport operator. Analysis of the results indicates that both variants of RMA have improved effectiveness and stability relative to CMFD for optically diffusive materials. To achieve optimal numerical performance, a combination of RMA and CMFD is suggested. Improvements in the performance of RMA are expected with ongoing development and optimization. Further investigation into the use of RMA for accelerating outer iterations, parallel problems, and different transport operators is proposed. The results of a spectral radius analysis are presented, along with a strong scaling benchmark using the 3D C5G7 MOX problems. Furthermore, two real-scale problems, the whole-core EOLE reactor simulation and a PWR assembly simulation, are studied to assess the performance of the new method in a parallel computing framework using the constant and linear short characteristics of the IDT solver in APOLLO3.
Published: 2020
Full Text: View/download PDF

25. A Spatially Variant Rebalancing Method for Discrete-Ordinates Transport Equation

Author: Wesley Ford, Bruno Lathuilière, Christophe Calvin, Emiliano Masiello, and François Févotte
Affiliation: CEA-Direction des Energies (ex-Direction de l'Energie Nucléaire) (CEA-DES (ex-DEN)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA), Service des Réacteurs et de Mathématiques Appliquées (SERMA), Département de Modélisation des Systèmes et Structures (DM2S), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay-CEA-Direction des Energies (ex-Direction de l'Energie Nucléaire) (CEA-DES (ex-DEN)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay, EDF (EDF), and Service d’Études des Réacteurs et de Mathématiques Appliquées (SERMA)
Funding: This work has been jointly funded by Commissariat à l’Energie Atomique (CEA) and Electricité de France (EDF).
Subjects: [PHYS.NUCL]Physics [physics]/Nuclear Theory [nucl-th], 020209 energy, Computation, Numerical analysis, Coarse-Mesh Finite Differences, 02 engineering and technology, Diffusion Synthetic Acceleration, [PHYS.NEXP]Physics [physics]/Nuclear Experiment [nucl-ex], 01 natural sciences, 010305 fluids & plasmas, symbols.namesake, Acceleration, Discrete Ordinates Transport Equation, Nuclear Energy and Engineering, Method of characteristics, Fourier analysis, 0103 physical sciences, Jacobian matrix and determinant, 0202 electrical engineering, electronic engineering, information engineering, symbols, Balance equation, Applied mathematics, Convection–diffusion equation, Mathematics, Coarse-mesh Rebalancing
Abstract: In this paper we propose a new non-linear technique for accelerating the source iterations of the discrete-ordinates transport equation. The acceleration method, called Spatially Variant Rebalancing Method (SVRM), is based on the computation of the zeroth- and first-order spatial variation of the neutron balance equation. The non-linear acceleration is applied to the method of characteristics (MOC) with a step approximation of the source. The new acceleration is meant to capture the high-order variation of the neutron flux within the spatial mesh. The paper proposes a numerical analysis of the technique based on the explicit computation of the Jacobian. The latter is analyzed with both spectral and Fourier analysis (Hong et al., 2010). Also, the new method is compared against CMFD (Smith, 1983), DSA (Larsen, 1982), and BPA (Adams et al., 1988) on a parametrized heterogeneous problem, in order to study the performance of SVRM in different transport regimes. The analysis of SVRM has been constrained to plane geometries.
Published: 2019
Full Text: View/download PDF

26. Joint International Conference on Supercomputing in Nuclear Applications and Monte Carlo 2013 : Synthèse

Author: Cheikh M. Diop and Christophe Calvin
Subjects: Computer science, Monte Carlo method, Joint (building), Supercomputer, Data science, Computational science
Published: 2014
Full Text: View/download PDF

27. HPC Challenges for Deterministic Neutronics Simulations Using APOLLO3® Code

Author: Christophe Calvin
Subjects: Neutron transport, Computer science, Frame (networking), Genetic algorithm, Code (cryptography), Domain decomposition methods, General Medicine, Parallel computing, General-purpose computing on graphics processing units, Massively parallel, Boltzmann equation
Abstract: The aim of this paper is to present some major HPC challenges for deterministic neutronics simulations and how these challenges are addressed in the APOLLO3 code. Different levels of HPC are illustrated on different kinds of applications and parallel programming techniques in the framework of the APOLLO3 code. Results are given for fuel load management using a genetic algorithm, domain decomposition for transport solvers, and GPU acceleration of the Boltzmann equation solution, on configurations ranging from a few cores to massively parallel computing on more than 10,000 cores.
Published: 2011
Full Text: View/download PDF

28. Numerical Platon: A unified linear equation solver interface by CEA for solving open foe scientific applications

Author: Bernard Sécher, Christophe Calvin, and Michel Belliard
Subjects: Nuclear and High Energy Physics, Engineering, business.industry, Interface (Java), Mechanical Engineering, Computation, Parallel computing, Division (mathematics), Linear equation solver, Software, Nuclear Energy and Engineering, Coupling (computer programming), General Materials Science, Linear solver, Safety, Risk, Reliability and Quality, business, Waste Management and Disposal, Massively parallel
Abstract: This paper describes a tool called ‘Numerical Platon’ developed by the French Atomic Energy Commission (CEA). It provides a freely available (GNU LGPL license) interface for coupling scientific computing applications to various freeware linear solver libraries (essentially PETSc, SuperLU and HyPre), together with some proprietary CEA solvers, for high-performance computers that may be used in industrial software written in various programming languages. This tool was developed as part of considerable efforts by the CEA Nuclear Energy Division in the past years to promote massively parallel software and on-shelf parallel tools to help develop new generation simulation codes. After the presentation of the package architecture and the available algorithms, we show examples of how Numerical Platon is used in sequential and parallel CEA codes. Comparing with in-house solvers, the gain in terms of increases in computation capacities or in terms of parallel performances is notable, without considerable extra development cost.
Published: 2009
Full Text: View/download PDF

29. Intel Xeon/Xeon Phi Platform Oriented Scalable Monte Carlo Linear Solver

Author: Fan Ye, Christophe Calvin, and Serge Petiton
Affiliation: Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 (CRIStAL), and Centrale Lille-Université de Lille-Centre National de la Recherche Scientifique (CNRS)
Subjects: [INFO]Computer Science [cs], ComputingMilieux_MISCELLANEOUS
Published: 2015

30. An object-oriented approach to the design of fluid mechanics software

Author: Philippe Emonot, Christophe Calvin, and Olga Cueto
Subjects: Numerical Analysis, Object-oriented programming, Guiding Principles, business.industry, Applied Mathematics, Fluid mechanics, Object-oriented design, Computational Mathematics, Software, Development (topology), Modeling and Simulation, Architecture, Software engineering, business, Software architecture, Analysis, Simulation, Mathematics
Abstract: This article presents the guiding principles of the architecture of Trio U, a new generation of software for thermohydraulic calculations. Trio U is designed to serve as a thermohydraulic development platform. Its basic conception is object-oriented and it is written in C++. The article demonstrates how this type of design enables an open, modular software architecture.
Published: 2002
Full Text: View/download PDF

31. Efficient and Portable Krylov Eigensolver on Many Core Architectures

Author: Serge G. Petiton, F. Ye, F. Boillod-Cerneux, and Christophe Calvin
Subjects: Petascale computing, Class (computer programming), Many core, Computer science, Product (mathematics), Computer Science::Mathematical Software, Scalar (physics), Parallel computing, Solver, Xeon Phi, Eigenvalues and eigenvectors
Abstract: We present in this article a highly parallel Krylov solver for large eigenvalue problems, the Explicitly Restarted Arnoldi Method (ERAM). Our ERAM implementation may be executed on many-core configurations, both homogeneous and heterogeneous, in order to take advantage of most present and future supercomputers. From these experiments, we propose our approach for designing efficient and portable algorithms on multi-core architectures. It is based on the design of generic algorithms using the Trilinos approach and specialized implementations of elementary operations (matrix-matrix, matrix-vector, scalar product ...) on the accelerators mentioned above. Some results on large sparse and dense matrices on petascale-class machines using CPUs and GPUs, and some first results obtained on the Intel MIC processor, are presented and analysed.
Published: 2014
Full Text: View/download PDF

32. Multi level programming Paradigm for Extreme Computing

Author: Serge G. Petiton, Mitsuhisa Sato, Christophe Calvin, Nahid Emad, Miwako Tsuji, and Makarem Dandouna
Subjects: Concurrent object-oriented programming, Iterative method, Computer science, Block (programming), Reactive programming, Programming paradigm, Service-oriented programming, Parallel computing, Functional reactive programming, Exascale computing
Abstract: In order to propose a framework and programming paradigms for post-petascale computing, on the road to exascale computing and beyond, we introduced new languages, associated with a hierarchical multi-level programming paradigm, allowing scientific end-users and developers to program highly hierarchical architectures designed for extreme computing. In this paper, we explain the interest of such a hierarchical multi-level programming paradigm for extreme computing and how well it adapts to several large computational science applications, such as linear algebra solvers used for reactor core physics. We describe the YML language and framework, which allows describing graphs of parallel components that may be developed using a PGAS-like language such as XMP, then scheduled and computed on supercomputers. Then, we present experiments on supercomputers (such as “K” and “Hopper”) with the hybrid method MERAM (Multiple Explicitly Restarted Arnoldi Method) as a case study for iterative methods manipulating sparse matrices, and the block Gauss-Jordan method as a case study for direct methods manipulating dense matrices. We conclude by proposing evolutions for this programming paradigm.
Published: 2014
Full Text: View/download PDF

33. Methods to Overlap Communications in Parallel Numerical Algorithms

Author: Philippe Michallon, Christophe Calvin, and Laurent Colombet
Subjects: Computer science, Fast Fourier transform, Computer Science (miscellaneous), Parallel algorithm, Distributed memory, Granularity, Parallel computing, Algorithm, Execution time, Intel Paragon
Abstract: We present in this paper general techniques for overlapping communications in parallel numerical kernels. We first describe some dependency schemes which can be found in most parallel numerical algorithms, and we apply to these schemes methods based on changing the granularity of the computational tasks. The choice of the granularity needed to obtain a good overlap depends on the main parameters of the target machines. So we present results of benchmarks executed on two parallel distributed memory machines: a Cray T3D and an Intel Paragon. Then we apply the preceding overlapping techniques to classical numerical kernels, namely the matrix-vector and matrix-matrix products and the mono- and bi-dimensional FFT. We have implemented the overlapped versions of these algorithms on a T3D and a Paragon and tuned the overlapping parameters in order to minimize the total execution time. The results of these experiments demonstrate the accuracy of this approach.
Published: 1997
Full Text: View/download PDF

34. Towards exascale with the ANR-JST Japanese-French Project FP3C

Author: Alfredo Buttari, Mitsuhisa Sato, Serge G. Petiton, Nahid Emad, Satoshi Matsuoka, M. Dayde, P. Codognet, Tetsuya Sakurai, Yutaka Ishikawa, Gabriel Antoniu, Christophe Calvin, Raymond Namyst, Taisuke Boku, Hiroshi Nakashima, Kengo Nakajima, and G. Joslin
Subjects: Runtime system, Software, Computer architecture, Parallel processing (DSP implementation), Exploit, business.industry, Computer science, Programming paradigm, Parallel computing, Architecture, business, Exascale computing
Abstract: The Japanese-French FP3C (Framework and Programming for Post-Petascale Computing) Project ANR/JST-2010-JTIC-003 aims at studying the software technologies, languages and programming models on the road to exascale computing. The ability to efficiently exploit these future systems is challenging because of their ultra large-scale and highly hierarchical architecture with computational nodes including many-core processors and accelerators. We give an overview of some of the main issues explored within the project.
Published: 2013
Full Text: View/download PDF

35. Accelerating the Explicitly Restarted Arnoldi Method with GPUs Using an Autotuned Matrix Vector Product

Author: Serge G. Petiton, Jérôme Dubois, and Christophe Calvin
Affiliation: Commissariat à l’Energie Atomique, Gif-sur-Yvette, France, Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 (CRIStAL), and Centrale Lille-Université de Lille-Centre National de la Recherche Scientifique (CNRS)
Subjects: Multi-core processor, Speedup, Computer science, Applied Mathematics, Process (computing), 010103 numerical & computational mathematics, 02 engineering and technology, Parallel computing, ComputerSystemsOrganization_PROCESSORARCHITECTURES, Solver, 01 natural sciences, 020202 computer hardware & architecture, Computational science, Arnoldi iteration, Computational Mathematics, Matrix (mathematics), Scalability, 0202 electrical engineering, electronic engineering, information engineering, [INFO]Computer Science [cs], 0101 mathematics, Graphics, ComputingMilieux_MISCELLANEOUS
Abstract: This paper presents a parallelized hybrid single-vector Arnoldi algorithm for computing approximations to eigenpairs of a nonsymmetric matrix. We are interested in the use of accelerators and multicore units to speed up the Arnoldi process. The main goal is to propose a parallel version of the Arnoldi solver, which can efficiently use multiple multicore processors or multiple graphics processing units (GPUs) in a mixed coarse and fine grain fashion. In the proposed algorithms, this is achieved by an autotuning of the matrix vector product before starting the Arnoldi eigensolver as well as the reorganization of the data and global communications so that communication time is reduced. The execution time, performance, and scalability are assessed with well-known dense and sparse test matrices on multiple Nehalems, GT200 NVidia Tesla, and next generation Fermi Tesla. With one processor, we see a performance speedup of 2 to 3x when using all the physical cores, and a total speedup of 2 to 8x when adding a GPU to this multicore unit, and hence a speedup of 4 to 24x compared to the sequential solver.
Published: 2011
Full Text: View/download PDF

36. High Performance Computing in Nuclear Engineering

Author: Christophe Calvin and David Nowak
Published: 2010
Full Text: View/download PDF

37. All-to-all broadcast in torus with wormhole-like routing

Author: S. Perennes, Christophe Calvin, and Denis Trystram
Subjects: Transformation (function), Dimension (vector space), Computer science, Value (computer science), Torus, Parallel computing, Routing (electronic design automation), Wormhole, Topology, Square (algebra), Power (physics)
Abstract: This paper deals with collective communications on distributed-memory parallel machines. We are interested in the design of efficient all-to-all broadcast algorithms on square tori of processing nodes using a wormhole-like routing mechanism. The execution time is influenced by three factors, namely, the number of steps, the transmission rate and the maximum distance to cross. We first compute the lower bounds of the all-to-all broadcast problem under these assumptions. Then, we propose a new algorithm which minimizes the number of steps. Its distance factor is close to the optimal, but the transmission rate is too large. We derive a transformation which significantly reduces this last factor. This value is close to the optimum. This last algorithm is a good trade-off when the message length is not too large. This analysis is detailed for square sizes of tori when the dimension is a power of 5. We show how to extend the construction for other sizes of square tori.
Published: 2002
Full Text: View/download PDF

38. The trio-unitaire project: A parallel CFD 3-dimensional code

Author: Ph. Emonot and Christophe Calvin
Subjects: Structure (mathematical logic), Object-oriented programming, ComputerSystemsOrganization_COMPUTERSYSTEMIMPLEMENTATION, business.industry, Computer science, Code (cryptography), Parallelism (grammar), Domain decomposition methods, Parallel computing, Computational fluid dynamics, business, Algorithm, Cray t3e
Abstract: The structure of a new generation of thermalhydraulic code, Trio-Unitaire, is presented in this paper. This code has been designed to solve large 3D structured or unstructured CFD problems. The solutions adopted to achieve this goal (object-oriented design and parallelism) are described and the paper focuses on the technical solutions used. Some preliminary experimental results on a Cray T3E are presented.
Published: 1997
Full Text: View/download PDF

39. Overlapping techniques of communications

Author: Laurent Colombet, Philippe Michallon, and Christophe Calvin
Subjects: Computer science, Order (business), Product (mathematics), Fast Fourier transform, Parallel algorithm, Algorithm
Abstract: We present in this paper general techniques for overlapping communications in parallel numerical kernels. We first describe some dependency schemes which can be found in most parallel numerical algorithms, and we apply to these schemes methods based on changing the granularity of the computational tasks. The choice of the granularity needed to obtain a good overlap depends on the main parameters of the target machines. We apply the preceding overlapping techniques to classical numerical kernels, namely the matrix-vector product and the bi-dimensional FFT, and implement them on a T3D and a Paragon. The results of these experiments demonstrate the accuracy of this approach.
Published: 1995
Full Text: View/download PDF

40. Evaluation of programming models for manycore and / or heterogeneous architectures for Monte Carlo neutron transport codes

Author: Tao Chang and Christophe Calvin
Affiliation: Département de Modélisation des Systèmes et Structures (DM2S), CEA-Direction des Energies (ex-Direction de l'Energie Nucléaire) (CEA-DES (ex-DEN)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay, and Institut Polytechnique de Paris
Subjects: Manycore architectures, Architectures manycore, Transport des particules, Heterogeneous architectures, Particules transport, [SCCO.COMP]Cognitive science/Computer science, Architectures hétérogènes
Abstract: In this thesis we propose to evaluate the different programming models available for addressing manycore and/or heterogeneous architectures within the framework of Monte Carlo transport codes. A simple but representative application test case will first be considered in order to cover a fairly wide range of solutions and compare them in terms of performance, portability of performance, ease of implementation and maintainability. The target architectures are 'classic' CPUs, Intel Xeon Phi and GPUs. The most relevant programming models will then be set up in a Monte Carlo transport code.
Published: 2020

41. Evaluation de modèles de programmation pour les architectures manycore et/ou hétérogènes pour les codes de transport neutronique Monte Carlo

Author: Chang, Tao, Département de Modélisation des Systèmes et Structures (DM2S), CEA-Direction des Energies (ex-Direction de l'Energie Nucléaire) (CEA-DES (ex-DEN)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay, Institut Polytechnique de Paris, and Christophe Calvin
Subjects: Manycore architectures, Heterogeneous architectures, Particle transport, [SCCO.COMP]Cognitive science/Computer science
Abstract: In this thesis we propose to evaluate the programming models available for manycore and/or heterogeneous architectures in the context of Monte Carlo transport codes. A simple but representative application test case will first be considered in order to cover a fairly wide range of solutions and compare them in terms of performance, performance portability, ease of implementation, and maintainability. The target architectures are "classic" CPUs, Intel Xeon Phi, and GPUs. The most relevant programming models will then be implemented in a Monte Carlo transport code.
Published: 2020

42. The Advancement of Stable, Efficient and Parallel Acceleration Methods for the Neutron Transport Equation

Author: Ford, Wesley, STAR, ABES, Département de Modélisation des Systèmes et Structures (DM2S), CEA-Direction des Energies (ex-Direction de l'Energie Nucléaire) (CEA-DES (ex-DEN)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay, Université Paris Saclay (COmUE), and Christophe Calvin
Subjects: Discrete-ordinates transport equation, Stability analysis, CMFD, Spectral radius analysis, Non-linear acceleration, [SPI.NRJ]Engineering Sciences [physics]/Electric power, [PHYS.PHYS.PHYS-COMP-PH]Physics [physics]/Physics [physics]/Computational Physics [physics.comp-ph], [MATH.MATH-MP]Mathematics [math]/Mathematical Physics [math-ph]
Abstract: In this paper we propose a new library of non-linear techniques for accelerating the discrete-ordinates transport equation. Two new types of non-linear acceleration methods, the Spatially Variant Rebalancing Method (SVRM) and Response Matrix Acceleration (RMA), are proposed and investigated. The first method, SVRM, is based on the computation of the zeroth- and first-order spatial variation of the neutron balance equation. RMA is a DP0 method that uses knowledge of the transport operator to form a consistent relationship. Two distinct variants of RMA, called Explicit-RMA (E-RMA) and Balance (B-RMA), are derived. The convergence properties of both acceleration methods are investigated for two different iteration schemes of the method of characteristics (MOC) transport operator for a 1D slab, using spectral and Fourier analysis. Based on the results of the 1D comparison, only RMA and CMFD were implemented in the library. The performance of RMA is compared to that of CMFD using the C5G7, ZPPR, and UH12 3D benchmarks. Both parallel and sequential solving schemes are considered. Analysis of the results indicates that both variants of RMA are more effective and more stable than CMFD for optically diffusive materials. Moreover, RMA shows a marked improvement in stability and effectiveness when the geometry is spatially decomposed. To achieve optimal numerical performance, a combination of RMA and CMFD is suggested. Further investigation into the use and improvement of RMA is proposed. In addition, many ideas for extending the features of the library are presented.
Published: 2019

43. Optimization of Monte Carlo Neutron Transport Simulations with Emerging Architectures

Author: Wang, Yunsong, Maison de la Simulation (MDLS), Université de Versailles Saint-Quentin-en-Yvelines (UVSQ)-Institut National de Recherche en Informatique et en Automatique (Inria)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS), Université Paris Saclay (COmUE), Christophe Calvin, and Centre National de la Recherche Scientifique (CNRS)-Université Paris-Saclay-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Versailles Saint-Quentin-en-Yvelines (UVSQ)
Subjects: Parallel computing, Parallel programming, Neutron transport, Cross section, Vectorization, MIC, Monte Carlo, [INFO.INFO-DC]Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]
Abstract: Monte Carlo (MC) neutron transport simulations are widely used in the nuclear community to perform reference calculations with minimal approximations. The conventional MC method converges slowly, in accordance with the law of large numbers, which makes simulations computationally expensive. Cross section computation has been identified as the major performance bottleneck for MC neutron codes. Typically, cross section data are precalculated for each nuclide and stored in memory before the simulation, so that during the simulation only table lookups are required to retrieve data from memory and the compute cost is trivial. We implemented and optimized a large collection of lookup algorithms in order to accelerate this data retrieval process. Results show that significant speedup over the conventional binary search can be achieved on both CPU and MIC in unit tests, rather than in real-case simulations. Using vectorization instructions has proved effective on the many-core architecture thanks to its 512-bit vector units; on CPU this improvement is limited by a smaller register size. Further optimizations such as memory reduction turn out to be very important, since they largely improve computing performance. As can be expected, all energy lookup proposals are entirely memory-bound: the compute units do little but wait for data. In other words, the computing capability of modern architectures is largely wasted. Another major issue of energy lookup is its huge memory requirement: cross section data at one temperature for up to 400 nuclides involved in a real-case simulation require nearly 1 GB of memory, which makes simulations with several thousand temperatures infeasible on current computer systems. To solve the problems associated with energy lookup, we began to investigate another, on-the-fly cross section proposal called reconstruction.
The basic idea behind the reconstruction is to perform the Doppler broadening computation of cross sections (a convolution integral) on the fly, each time a cross section is needed, with a formulation close to standard neutron cross section libraries and based on the same amount of data. The reconstruction converts the problem from memory-bound to compute-bound: only a few variables per resonance are required, instead of the conventional pointwise table covering the entire resolved resonance region. Although the memory footprint is largely reduced, this method is very time-consuming. After a series of optimizations, results show that the reconstruction kernel benefits well from vectorization and can achieve 1806 GFLOPS (single precision) on a Knights Landing 7250, which represents 67% of its effective peak performance. Even though the optimization efforts on reconstruction significantly improve FLOP usage, this on-the-fly calculation is still slower than the conventional lookup method. Given this situation, we began to port the code to GPGPU to exploit potentially higher performance as well as higher FLOP usage. In addition, another evaluation has been planned to compare lookup and reconstruction in terms of power consumption: with the help of hardware and software energy measurement support, we expect to find a compromise between performance and energy consumption in order to face the "power wall" challenge as hardware evolves.
Access to the basic data, namely cross sections, is the main performance bottleneck in solving the neutron transport equations with the Monte Carlo (MC) method. These cross sections characterize the probabilities of collisions between neutrons and the nuclides that make up the material being traversed. They are specific to each nuclide and depend on the energy of the incident neutron and on the temperature of the material. Reference MC codes load these data into memory at all the temperatures occurring in the system and use a binary search algorithm over the tables that store the cross sections. On many-core architectures (typically Intel MIC), these methods are dramatically inefficient, both because random memory accesses prevent the different cache levels from being exploited and because these algorithms lack vectorization. The first part of the thesis work consisted in finding alternatives to this baseline algorithm, proposing the best trade-off between performance and memory footprint that takes advantage of the specific features of the MIC (multithreading and vectorization). In a second step, we took a radically opposite approach, in which the data are not stored in memory but computed on the fly. A whole series of optimizations (algorithm, data structures, vectorization, loop unrolling, and the influence of data representation precision) yielded considerable gains over the initial implementation. Finally, the two approaches (data in memory and data computed on the fly) were compared in order to propose the best trade-off between performance and memory footprint. Beyond the targeted application (MC transport), this work is also a study that can be generalized on how to transform a problem initially limited by memory latency ("memory latency bound") into one that saturates the processor ("CPU-bound") and takes advantage of many-core architectures.
Published: 2017
