87 results for "James C. Sexton"
Search Results
2. Heterogeneous Computing Systems for Complex Scientific Discovery Workflows.
- Authors:
Christoph Hagleitner, Dionysios Diamantopoulos, Burkhard Ringlein, Constantinos Evangelinos, Charles R. Johns, Rong N. Chang, Bruce D'Amora, James A. Kahle, James C. Sexton, Michael Johnston, Edward Pyzer-Knapp, and Chris Ward
- Published: 2021
- Full Text: View/download PDF
3. An Architecture for Heterogeneous High-Performance Computing Systems: Motivation and Requirements.
- Authors:
Christoph Hagleitner, Florian Auernhammer, James C. Sexton, Constantinos Evangelinos, Guerney Hunt, Christian Pinto, Michael Johnston, Charles R. Johns, and Jim Kahle
- Published: 2023
- Full Text: View/download PDF
4. Preparation and optimization of a diverse workload for a large-scale heterogeneous system.
- Authors:
Ian Karlin, Yoonho Park, Bronis R. de Supinski, Peng Wang, Bert Still, David Beckingsale, Robert Blake, Tong Chen 0001, Guojing Cong, Carlos H. A. Costa, Johann Dahm, Giacomo Domeniconi, Thomas Epperly, Aaron Fisher, Sara Kokkila Schumacher, Steven H. Langer, Hai Le, Eun Kyung Lee, Naoya Maruyama, Xinyu Que, David F. Richards, Björn Sjögreen, Jonathan Wong, Carol S. Woodward, Ulrike Meier Yang, Xiaohua Zhang, Bob Anderson, David Appelhans, Levi Barnes, Peter D. Barnes Jr., Sorin Bastea, David Böhme, Jamie A. Bramwell, James M. Brase, José R. Brunheroto, Barry Chen, Charway R. Cooper, Tony Degroot, Robert D. Falgout, Todd Gamblin, David J. Gardner, James N. Glosli, John A. Gunnels, Max P. Katz, Tzanio V. Kolev, I-Feng W. Kuo, Matthew P. LeGendre, Ruipeng Li, Pei-Hung Lin, Shelby Lockhart, Kathleen McCandless, Claudia Misale, Jaime H. Moreno, Rob Neely, Jarom Nelson, Rao Nimmakayala, Kathryn M. O'Brien, Kevin O'Brien, Ramesh Pankajakshan, Roger Pearce, Slaven Peles, Phil Regier, Steven C. Rennich, Martin Schulz 0001, Howard Scott, James C. Sexton, Kathleen Shoga, Shiv Sundram, Guillaume Thomas-Collignon, Brian Van Essen, Alexey Voronin, Bob Walkup, Lu Wang, Chris Ward, Hui-Fang Wen, Daniel A. White, Christopher Young, Cyril Zeller, and Edward Zywicz
- Published: 2019
- Full Text: View/download PDF
5. The design, deployment, and evaluation of the CORAL pre-exascale systems.
- Authors:
Sudharshan S. Vazhkudai, Bronis R. de Supinski, Arthur S. Bland, Al Geist, James C. Sexton, Jim Kahle, Christopher Zimmer 0001, Scott Atchley, Sarp Oral, Don E. Maxwell, Verónica G. Vergara Larrea, Adam Bertsch, Robin Goldstone, Wayne Joubert, Chris Chambreau, David Appelhans, Robert Blackmore, Ben Casses, George Chochia, Gene Davison, Matthew A. Ezell, Tom Gooding, Elsa Gonsiorowski, Leopold Grinberg, Bill Hanson, Bill Hartner, Ian Karlin, Matthew L. Leininger, Dustin Leverman, Chris Marroquin, Adam Moody, Martin Ohmacht, Ramesh Pankajakshan, Fernando Pizzano, James H. Rogers, Bryan S. Rosenburg, Drew Schmidt, Mallikarjun Shankar, Feiyi Wang, Py Watson, Bob Walkup, Lance D. Weems, and Junqi Yin
- Published: 2018
6. A Performance Counter Based Workload Characterization on Blue Gene/P.
- Authors:
Karthik Ganesan 0006, Lizy Kurian John, Valentina Salapura, and James C. Sexton
- Published: 2008
- Full Text: View/download PDF
7. Next-Generation Performance Counters: Towards Monitoring Over Thousand Concurrent Events.
- Authors:
Valentina Salapura, Karthik Ganesan 0006, Alan Gara, Michael Gschwind, James C. Sexton, and Robert Walkup
- Published: 2008
- Full Text: View/download PDF
8. Minimal Data Copy for Dense Linear Algebra Factorization.
- Authors:
Fred G. Gustavson, John A. Gunnels, and James C. Sexton
- Published: 2006
- Full Text: View/download PDF
9. Early Experience with Scientific Applications on the Blue Gene/L Supercomputer.
- Authors:
George Almási 0001, Gyan Bhanot, Dong Chen 0005, Maria Eleftheriou, Blake G. Fitch, Alan Gara, Robert S. Germain, John A. Gunnels, Manish Gupta 0002, Philip Heidelberger, Michael Pitman, Aleksandr Rayshubskiy, James C. Sexton, Frank Suits, Pavlos Vranas, Robert Walkup, T. J. Christopher Ward, Yuriy Zhestkov, Alessandro Curioni, Wanda Andreoni, Charles Archer, José E. Moreira, Richard Loft, Henry M. Tufo, Theron Voran, and Katherine Riley
- Published: 2005
- Full Text: View/download PDF
10. Scaling physics and material science applications on a massively parallel Blue Gene/L system.
- Authors:
George Almási 0001, Gyan Bhanot, Alan Gara, Manish Gupta 0002, James C. Sexton, Robert Walkup, Vasily V. Bulatov, Andrew W. Cook, Bronis R. de Supinski, James N. Glosli, Jeffrey A. Greenough, François Gygi, Alison Kubota, Steve Louis, Thomas E. Spelce, Frederick H. Streitz, Peter L. Williams, Robert K. Yates, Charles Archer, José E. Moreira, and Charles A. Rendleman
- Published: 2005
- Full Text: View/download PDF
11. Enabling High-Performance Computing as a Service.
- Authors:
Moustafa AbdelBaky, Manish Parashar, Hyunjoo Kim, Kirk E. Jordan, Vipin Sachdeva, James C. Sexton, Hani Jamjoom, Zon-Yin Shae, Gergina Pencheva, Reza Tavakoli, and Mary F. Wheeler
- Published: 2012
- Full Text: View/download PDF
12. BlueGene/L applications: Parallelism On a Massive Scale.
- Authors:
Bronis R. de Supinski, Martin Schulz 0001, Vasily V. Bulatov, William H. Cabot, Bor Chan, Andrew W. Cook, Erik W. Draeger, James N. Glosli, Jeffrey A. Greenough, Keith W. Henderson, Alison Kubota, Steve Louis, Brian J. Miller, Mehul V. Patel, Thomas E. Spelce, Frederick H. Streitz, Peter L. Williams, Robert K. Yates, Andy Yoo, George Almási 0001, Gyan Bhanot, Alan Gara, John A. Gunnels, Manish Gupta 0002, José E. Moreira, James C. Sexton, Robert Walkup, Charles Archer, François Gygi, Timothy C. Germann, Kai Kadau, Peter S. Lomdahl, Charles A. Rendleman, Michael L. Welcome, William McLendon, Bruce Hendrickson, Franz Franchetti, Stefan Kral, Juergen Lorenz, Christoph W. Überhuber, Edmond Chow, and Ümit V. Çatalyürek
- Published: 2008
- Full Text: View/download PDF
13. Massively parallel quantum chromodynamics.
- Authors:
Pavlos Vranas, Matthias A. Blumrich, Dong Chen 0005, Alan Gara, Mark Giampapa, Philip Heidelberger, Valentina Salapura, James C. Sexton, Ron Soltz, and Gyan Bhanot
- Published: 2008
- Full Text: View/download PDF
14. Optimizing task layout on the Blue Gene/L supercomputer.
- Authors:
Gyan Bhanot, Alan Gara, Philip Heidelberger, Eoin Lawless, James C. Sexton, and Robert Walkup
- Published: 2005
- Full Text: View/download PDF
15. HPCS 2012 keynotes: Tuesday keynote: Europe back in the HPC race: Building a European ecosystem to recover and maintain the capacity of designing and building large computers.
- Authors:
Jean Gonnord, Felix Schürmann, James C. Sexton, and Jesús Labarta
- Published: 2012
- Full Text: View/download PDF
16. HPCS 2012 panels: Panel I: Energy efficient systems in next generation high performance data and compute centers.
- Authors:
Laurent Lefèvre, Vicente Martin, Miguel A. Ordonez, Johnatan E. Pecero, Jean-Marc Pierson, Jesús Carretero 0001, Pascal Bouvry, David R. C. Hill, Jesús Labarta, Reinhard Schneider 0002, James C. Sexton, Mads Nygård, Gorka Esnal Lopez, Maria Mirto, Marco Passante, Giovanni Aloisio, Carsten Trinitis, Alexander Heinecke, and Lamia Djoudi
- Published: 2012
- Full Text: View/download PDF
17. Heterogeneous Computing Systems for Complex Scientific Discovery Workflows
- Authors:
Michael A. Johnston, Edward O. Pyzer-Knapp, Constantinos Evangelinos, James C. Sexton, Burkhard Ringlein, Charles Ray Johns, Christopher Ward, James Allan Kahle, Dionysios Diamantopoulos, Rong N. Chang, Christoph Hagleitner, and Bruce D'Amora
- Subjects: Workflow, Green computing, Computer science, Distributed computing, Systems architecture, Systems design, Symmetric multiprocessor system, Supercomputer, Heterogeneous network, Efficient energy use
- Abstract:
With Moore's law progressively running out of steam, heterogeneous computing architectures have been powering the top supercomputers in the world for many years and are now finding broader adoption across the industry. The trend towards sustainable computing also requires domain-specific heterogeneous hardware architectures, which promise further gains in energy efficiency. At the same time, today's high performance computing applications have evolved from monolithic simulations in a single domain to multidisciplinary complex workflows. In this paper, we explore how these trends affect system design decisions and what this means for future computing system architectures.
- Published: 2021
- Full Text: View/download PDF
18. Gordon Bell finalists II - The BlueGene/L supercomputer and quantum ChromoDynamics.
- Authors:
Pavlos Vranas, Gyan Bhanot, Matthias A. Blumrich, Dong Chen 0005, Alan Gara, Philip Heidelberger, Valentina Salapura, and James C. Sexton
- Published: 2006
- Full Text: View/download PDF
19. Gordon Bell finalists I - Large-scale electronic structure calculations of high-Z metals on the BlueGene/L platform.
- Authors:
François Gygi, Erik W. Draeger, Martin Schulz 0001, Bronis R. de Supinski, John A. Gunnels, Vernon Austel, James C. Sexton, Franz Franchetti, Stefan Kral, Christoph W. Ueberhuber, and Juergen Lorenz
- Published: 2006
- Full Text: View/download PDF
20. Preparation and optimization of a diverse workload for a large-scale heterogeneous system
- Authors:
Martin Schulz, Ulrike Meier Yang, David F. Richards, Tong Chen, Shiv Sundram, Todd Gamblin, Shelby Lockhart, Phil Regier, David Beckingsale, Ed Zywicz, Ruipeng Li, Giacomo Domeniconi, James C. Sexton, Bob Walkup, Jarom Nelson, Carlos Costa, Hui-Fang Wen, Ramesh Pankajakshan, John A. Gunnels, Xiaohua Zhang, Brian Van Essen, Kathryn M. O'Brien, I-Feng W. Kuo, Johann Dahm, Guillaume Thomas-Collignon, Bert Still, Naoya Maruyama, Jamie A. Bramwell, David Boehme, Kathleen Shoga, Carol S. Woodward, Howard A. Scott, M. P. Katz, Ian Karlin, T Epperly, Tzanio V. Kolev, Eun Kyung Lee, Steven H. Langer, Christopher Ward, David J. Gardner, Sara I. L. Kokkila-Schumacher, Christopher Young, Kevin O'Brien, Barry Chen, Björn Sjögreen, Jose R. Brunheroto, Claudia Misale, Roger Pearce, Guojing Cong, Matthew Legendre, Lu Wang, Jaime H. Moreno, Kathleen McCandless, Cyril Zeller, Rao Nimmakayala, Bronis R. de Supinski, Xinyu Que, Sorin Bastea, Robert D. Falgout, Peng Wang, Charway R. Cooper, Aaron Fisher, Jim Brase, R. Neely, David Appelhans, Alexey Voronin, James N. Glosli, Slaven Peles, Pei-Hung Lin, Tony Degroot, Hai Le, Daniel A. White, Levi Barnes, Steve Rennich, Yoonho Park, Peter D. Barnes, Bob Anderson, Jonathan J. Wong, and Robert C. Blake
- Subjects: Distributed computing, Summit, Computer science, Emerging technologies, Center of excellence, Workload, Engineering management, Systems architecture, Programming paradigm, Project management
- Abstract:
Productivity from day one on supercomputers that leverage new technologies requires significant preparation. An institution that procures a novel system architecture often lacks sufficient institutional knowledge and skills to prepare for it. Thus, the "Center of Excellence" (CoE) concept has emerged to prepare for systems such as Summit and Sierra, currently the top two systems in the Top 500. This paper documents CoE experiences that prepared a workload of diverse applications and math libraries for a heterogeneous system. We describe our approach to this preparation, including our management and execution strategies, and detail our experiences with and reasons for using different programming approaches. Our early science and performance results show that the project enabled significant early seismic science with up to a 14X throughput increase over Cori. In addition to our successes, we discuss our challenges and failures so others may benefit from our experience.
- Published: 2019
- Full Text: View/download PDF
21. Large-Scale First-Principles Molecular Dynamics simulations on the BlueGene/L Platform using the Qbox code.
- Authors:
François Gygi, Robert K. Yates, Juergen Lorenz, Erik W. Draeger, Franz Franchetti, Christoph W. Ueberhuber, Bronis R. de Supinski, Stefan Kral, John A. Gunnels, and James C. Sexton
- Published: 2005
- Full Text: View/download PDF
22. The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems
- Authors:
Bill Hanson, Chris Marroquin, Martin Ohmacht, Sarp Oral, Tom Gooding, Feiyi Wang, Adam Moody, Mallikarjun Shankar, Junqi Yin, Ben Casses, Gene Davison, Sudharshan S. Vazhkudai, David Appelhans, Arthur S. Bland, Ian Karlin, Verónica G. Vergara Larrea, Al Geist, James H. Rogers, Py C. Watson, Chris Chambreau, Bronis R. de Supinski, Robert S. Blackmore, Fernando Pizzano, Matthew L. Leininger, Elsa Gonsiorowski, J. Kahle, Lance D. Weems, Drew Schmidt, Bryan S. Rosenburg, Leopold Grinberg, Scott Atchley, Bob Walkup, Ramesh Pankajakshan, Wayne Joubert, Don Maxwell, James C. Sexton, Dustin Leverman, Adam Bertsch, Matthew A Ezell, Bill Hartner, Christopher Zimmer, George Chochia, and Robin Goldstone
- Subjects: File system, TOP500, Speedup, Summit, Computer science, Titan (supercomputer), Software deployment, Operating system, Benchmark (computing), IBM
- Abstract:
CORAL, the Collaboration of Oak Ridge, Argonne and Livermore, is fielding two similar IBM systems, Summit and Sierra, with NVIDIA GPUs that will replace the existing Titan and Sequoia systems. Summit and Sierra are currently ranked No. 1 and No. 3, respectively, on the Top500 list. We discuss the design and key differences of the systems. Our evaluation of the systems highlights the following. Applications that fit in HBM see the most benefit and may prefer more GPUs; however, for some applications, the CPU-GPU bandwidth is more important than the number of GPUs. The node-local burst buffer scales linearly, and can achieve a 4X improvement over the parallel file system for large jobs; smaller jobs, however, may benefit from writing directly to the PFS. Finally, several CPU-, network- and memory-bound analytics and GPU-bound deep learning codes achieve up to an 11X and 79X speedup/node, respectively, over Titan.
- Published: 2018
- Full Text: View/download PDF
23. Effects of nonperturbatively improved dynamical fermions in QCD at fixed lattice spacing
- Authors:
M. Talevi, James C. Sexton, Craig McNeile, Chris Michael, A. Hart, Kieran J. Sharkey, Balint Joo, Christopher M. Maynard, Joyce Garden, Z. Sroczynski, Chris Allton, Stephen Pickles, Stephen Booth, K.C. Bowler, Richard Kenway, D. Hepburn, Michael Teper, Hartmut Wittig, and Alan C. Irving
- Subjects: Physics, Quantum chromodynamics, Quark, Nuclear and High Energy Physics, Particle physics, High Energy Physics::Lattice, High Energy Physics - Lattice (hep-lat), Hadron, High Energy Physics::Phenomenology, FOS: Physical sciences, Fermion, Pseudoscalar, High Energy Physics - Lattice, Lattice constant, Lattice gauge theory, Lattice (order), High Energy Physics::Experiment, QC
- Abstract:
We present results for the static inter-quark potential, lightest glueballs, light hadron spectrum and topological susceptibility using a non-perturbatively improved action on a $16^3\times 32$ lattice at a set of values of the bare gauge coupling and bare dynamical quark mass chosen to keep the lattice size fixed in physical units ($\sim 1.7$ fm). By comparing these measurements with a matched quenched ensemble, we study the effects due to two degenerate flavours of dynamical quarks. With the greater control over residual lattice spacing effects which these methods afford, we find some evidence of charge screening and some minor effects on the light hadron spectrum over the range of quark masses studied ($M_{PS}/M_{V}\ge0.58$). More substantial differences between quenched and unquenched simulations are observed in measurements of topological quantities., Comment: 53 pages, LaTeX/RevTeX, 16 eps figures; corrected clover action expression and various typos, no results changed
- Published: 2016
24. EARLY EXPERIENCES WITH THE 360TF IBM BLUE GENE/L PLATFORM
- Authors:
A. A. Wyszogrodski, Robert E. Walkup, T. Voran, Amik St-Cyr, John M. Dennis, J. Edwards, Stephen J. Thomas, Henry M. Tufo, Gyan Bhanot, Wojciech W. Grabowski, Manish Gupta, K. Jordan, James C. Sexton, and Richard Loft
- Subjects: Computational Mathematics, Coprocessor, Computer science, Component (UML), Primitive equations, Scalability, Computer Science (miscellaneous), Community Climate System Model, Node (circuits), Parallel computing, Atmospheric model, Parametrization
- Abstract:
The High Order Method Modeling Environment is a scalable, spectral-element-based prototype for the Community Atmospheric Model component of the Community Climate System Model. The 3D moist primitive equations are solved on the cubed sphere with a hybrid pressure η vertical coordinate using an Emanuel convective parametrization for moist processes. Semi-implicit time integration, based on a preconditioned conjugate gradient solver, circumvents the time step restrictions associated with gravity waves. Benchmarks for two standard test problems at 10 km horizontal resolution have been run on Blue Gene/L. Results obtained on a 32-rack Blue Gene/L system (65,536 processors, 183.5-teraflop peak) show sustained performance of 8.0 teraflops on 32,768 processors for the moist Held–Suarez test problem in coprocessor mode and 11.3 teraflops on 32,768 processors for the aquaplanet test problem, running in virtual node mode.
- Published: 2008
- Full Text: View/download PDF
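The semi-implicit solver mentioned in the abstract above is, at its core, a preconditioned conjugate gradient (PCG) iteration. The sketch below is a minimal, generic PCG with a Jacobi (diagonal) preconditioner in Python/NumPy; it illustrates the kind of solver involved and is not code from HOMME. The matrix, tolerance, and preconditioner choice are assumptions made only for the example.

```python
import numpy as np

def pcg(A, b, tol=1e-10, max_iter=500):
    """Generic preconditioned conjugate gradient with a Jacobi preconditioner.

    Solves A x = b for a symmetric positive-definite A. A textbook sketch,
    not the HOMME implementation."""
    M_inv = 1.0 / np.diag(A)          # Jacobi preconditioner: M = diag(A)
    x = np.zeros_like(b)
    r = b - A @ x                     # initial residual
    z = M_inv * r                     # preconditioned residual
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# Tiny usage example on a random SPD system.
rng = np.random.default_rng(0)
B = rng.standard_normal((50, 50))
A = B @ B.T + 50 * np.eye(50)         # symmetric positive definite by construction
b = rng.standard_normal(50)
x = pcg(A, b)
print(np.linalg.norm(A @ x - b))      # residual norm, expected to be tiny
```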
25. BlueGene/L applications: Parallelism On a Massive Scale
- Authors:
Ümit V. Çatalyürek, Mehul Patel, Alan Gara, Robert K. Yates, Martin Schulz, José E. Moreira, Bor Chan, Kai Kadau, William Clarence McLendon, Franz Franchetti, Peter Williams, Andy Yoo, Keith Henderson, Bob Walkup, Bruce Hendrickson, Timothy C. Germann, George Almási, Christoph Überhuber, Erik W. Draeger, James C. Sexton, John A. Gunnels, Andrew W. Cook, Edmond Chow, Stefan Kral, Frederick H. Streitz, Vasily V. Bulatov, Jeffrey Greenough, Gyan Bhanot, Steve Louis, C. A. Rendleman, Manish Gupta, Charles J. Archer, Michael Welcome, Jürgen Lorenz, Francois Gygi, William H. Cabot, Bronis R. de Supinski, Alison Kubota, Peter S. Lomdahl, Brian J. Miller, Thomas E. Spelce, and James N. Glosli
- Subjects: TOP500, Scale (ratio), Hardware and Architecture, Computer science, Parallelism (grammar), Code (cryptography), Parallel computing, IBM, Software, Theoretical Computer Science
- Abstract:
BlueGene/L (BG/L), developed through a partnership between IBM and Lawrence Livermore National Laboratory (LLNL), is currently the world's largest system both in terms of scale, with 131,072 processors, and absolute performance, with a peak rate of 367 Tflop/s. BG/L has led the last four Top500 lists with a Linpack rate of 280.6 Tflop/s for the full machine installed at LLNL and is expected to remain the fastest computer in the next few editions. However, the real value of a machine such as BG/L derives from the scientific breakthroughs that real applications can produce by successfully using its unprecedented scale and computational power. In this paper, we describe our experiences with eight large-scale applications on BG/L from several application domains, ranging from molecular dynamics to dislocation dynamics and turbulence simulations to searches in semantic graphs. We also discuss the challenges we faced when scaling these codes and present several successful optimization techniques. All applications show excellent scaling behavior, even at very large processor counts, with one code even achieving a sustained performance of more than 100 Tflop/s, clearly demonstrating the real success of the BG/L design.
- Published: 2008
- Full Text: View/download PDF
26. Massively parallel quantum chromodynamics
- Authors:
A. Gara, James C. Sexton, Dong Chen, R. Soltz, Gyan Bhanot, P. Heidelberger, Pavlos Vranas, Valentina Salapura, Mark E. Giampapa, and Matthias A. Blumrich
- Subjects: Quantum chromodynamics, Speedup, General Computer Science, Discretization, Computer science, High Energy Physics::Lattice, Lattice QCD, Parallel computing, Supercomputer, Computational science, Computer Science::Performance, Lattice gauge theory, Quantum field theory, Computer Science::Operating Systems, Massively parallel
- Abstract:
Quantum chromodynamics (QCD), the theory of the strong nuclear force, can be numerically simulated on massively parallel supercomputers using the method of lattice gauge theory. We describe the special programming requirements of lattice QCD (LQCD) as well as the optimal supercomputer hardware architectures for which LQCD suggests a need. We demonstrate these methods on the IBM Blue Gene/L™ (BG/L) massively parallel supercomputer and argue that the BG/L architecture is very well suited for LQCD studies. This suitability arises from the fact that LQCD is a regular lattice discretization of space into lattice sites, while the BG/L supercomputer is a discretization of space into compute nodes. Both LQCD and the BG/L architecture are constrained by the requirement of short-distance exchanges. This simple relation is technologically important and theoretically intriguing. We demonstrate a computational speedup of LQCD using up to 131,072 CPUs on the largest BG/L supercomputer available in 2007. As the number of CPUs is increased, the speedup increases linearly with sustained performance of about 20% of the maximum possible hardware speed. This corresponds to a maximum of 70.5 sustained teraflops. At these speeds, LQCD and the BG/L supercomputer are able to produce theoretical results for the next generation of strong-interaction physics.
- Published: 2008
- Full Text: View/download PDF
27. Simulating solidification in metals at high pressure: The drive to petascale computing
- Authors:
John A. Gunnels, James C. Sexton, Frederick H. Streitz, Bronis R. de Supinski, Mehul Patel, Robert K. Yates, James N. Glosli, and Bor Chan
- Subjects: Coalescence (physics), Physics, History, Mesoscopic physics, Nucleation, Spontaneous nucleation, Mechanics, Pressure dependence, Computer Science Applications, Education, Petascale computing, High pressure, Statistical physics, Scaling
- Abstract:
We investigate solidification in metal systems ranging in size from 64,000 to 524,288,000 atoms on the IBM BlueGene/L computer at LLNL. Using the newly developed ddcMD code, we achieve performance rates as high as 103 TFlops, with a performance of 101.7 TFlops sustained over a 7-hour run on 131,072 CPUs. We demonstrate superb strong and weak scaling. Our calculations are significant as they represent the first atomic-scale model of metal solidification to proceed, without finite-size effects, from spontaneous nucleation and growth of solid out of the liquid, through the coalescence phase, and into the onset of coarsening. Thus, our simulations represent the first step towards an atomistic model of nucleation and growth that can directly link atomistic to mesoscopic length scales.
- Published: 2006
- Full Text: View/download PDF
28. Optimizing task layout on the Blue Gene/L supercomputer
- Authors:
E. Lawless, Gyan Bhanot, Robert E. Walkup, James C. Sexton, Alan Gara, and P. Heidelberger
- Subjects: Discrete mathematics, Matrix (mathematics), General Computer Science, Markov chain, Computer science, Heuristic, Domain (ring theory), Message Passing Interface, Node (circuits), Torus, Parallel computing, Supercomputer
- Abstract:
A general method for optimizing problem layout on the Blue Gene®/L (BG/L) supercomputer is described. The method takes as input the communication matrix of an arbitrary problem as an array with entries C(i, j), which represents the data communicated from domain i to domain j. Given C(i, j), we implement a heuristic map that attempts to sequentially map a domain and its communication neighbors either to the same BG/L node or to near-neighbor nodes on the BG/L torus, while keeping the number of domains mapped to a BG/L node constant. We then generate a Markov chain of maps using Monte Carlo simulation with free energy $F = \sum_{i,j} C(i, j) H(i, j)$, where H(i, j) is the smallest number of hops on the BG/L torus between domain i and domain j. For two large parallel applications, SAGE and UMT2000, the method was tested against the default Message Passing Interface rank order layout on up to 2,048 BG/L nodes. It produced maps that improved communication efficiency by up to 45%.
- Published: 2005
- Full Text: View/download PDF
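The abstract above describes a simulated-annealing-style search over task-to-node maps driven by the cost $F = \sum_{i,j} C(i, j) H(i, j)$. The following is a minimal, self-contained Python sketch of that idea: a random communication matrix, a 1-D ring standing in for the torus hop distance, and Metropolis swaps of two domains. The problem size, the ring topology, and the cooling schedule are illustrative assumptions, not the BG/L mapping code.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 16                                   # number of domains == number of node slots
C = rng.integers(0, 10, size=(n, n))     # C[i, j]: data sent from domain i to domain j
np.fill_diagonal(C, 0)

def hops(a, b, size=n):
    """Hop distance between slots a and b on a 1-D ring (stand-in for the torus)."""
    d = abs(a - b)
    return min(d, size - d)

def free_energy(mapping):
    """F = sum_{i,j} C(i, j) * H(node(i), node(j))."""
    return sum(C[i, j] * hops(mapping[i], mapping[j])
               for i in range(n) for j in range(n))

mapping = np.arange(n)                   # start from the identity (rank-order) map
f = f_best = free_energy(mapping)
best = mapping.copy()
T = 10.0                                 # Metropolis "temperature"
for step in range(20000):
    i, j = rng.integers(0, n, size=2)
    mapping[i], mapping[j] = mapping[j], mapping[i]   # propose swapping two domains
    f_new = free_energy(mapping)
    if f_new <= f or rng.random() < np.exp((f - f_new) / T):
        f = f_new                        # accept the move
        if f < f_best:
            f_best, best = f, mapping.copy()
    else:
        mapping[i], mapping[j] = mapping[j], mapping[i]  # reject: undo the swap
    T *= 0.9995                          # cool slowly
print("initial F:", free_energy(np.arange(n)), "best F found:", f_best)
```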
29. Computational fluid dynamics modeling of the paddle dissolution apparatus: Agitation rate, mixing patterns, and fluid velocities
- Authors:
Geoff Bradley, James C. Sexton, L. G. McCarthy, Anne Marie Healy, and Owen I. Corrigan
- Subjects: Pharmaceutical Science, Thermodynamics, Aquatic Science, Computational fluid dynamics, Article, Physical Phenomena, Mixing patterns, Drug Discovery, Technology, Pharmaceutical, Paddle, Dissolution testing, Dissolution, Ecology, Evolution, Behavior and Systematics, Ecology, Chemistry, Physics, Rotational speed, General Medicine, Mechanics, Models, Chemical, Solubility, Solid body, Agronomy and Crop Science, Complete mixing
- Abstract:
The purpose of this research was to further investigate the hydrodynamics of the United States Pharmacopeia (USP) paddle dissolution apparatus using a previously generated computational fluid dynamics (CFD) model. The influence of paddle rotational speed on the hydrodynamics in the dissolution vessel was simulated. The maximum velocity magnitude for axial and tangential velocities at different locations in the vessel was found to increase linearly with the paddle rotational speed. Path-lines of fluid mixing, which were examined from a central region at the base of the vessel, did not reveal a region of poor mixing between the upper cylindrical and lower hemispherical volumes, as previously speculated. Considerable differences in the resulting flow patterns were observed for paddle rotational speeds between 25 and 150 rpm. The approximate time required to achieve complete mixing varied between 2 to 5 seconds at 150 rpm and 40 to 60 seconds at 25 rpm, although complete mixing was achievable for each speed examined. An analysis of CFD-generated velocities above the top surface of a cylindrical compact positioned at the base of the vessel, below the center of the rotating paddle, revealed that the fluid in this region was undergoing solid body rotation. An examination of the velocity boundary layers adjacent to the curved surface of the compact revealed large peaks in the shear rates for a region within ~3 mm from the base of the compact, consistent with a "grooving" effect, which had been previously seen on the surface of compacts following dissolution, associated with a higher dissolution rate in this region.
- Published: 2004
- Full Text: View/download PDF
30. Programming Abstractions for Data Locality
- Authors: Brice Goglin, James C. Sexton, Thomas C. Schulthess, Adrian Tate, Peter Messmer, Emmanuel Jeannot, Hatem Ltaief, Maciej Besta, Amir Kamil, David Padua, Satoshi Matsuoka, Vitus J. Leung, Armin Groblinger, Anshu Dubey, Mark Abraham, Mauro Bianco, Torsten Hoefler, Romain Ciedat, Miquel Pericas, Naoya Maruyama, Leonidas Linardakis, Chris J. Newburn, Jesús Labarta, Kathryn M. O'Brien, John Shalf, Brad Chamberlain, Robert Ross, Gysi Tobias, Frank Hannig, Karl Fuerlinger, Paul H. J. Kelly, Didem Unat, Marie-Christine Sawley, and Harold C. Edwards
- Affiliations and venue: Cray Inc.; Lawrence Berkeley National Laboratory (LBNL); University of Passau; LaBRI and Inria Bordeaux - Sud-Ouest (RUNTIME), Université de Bordeaux / CNRS / ENSEIRB; Sandia National Laboratories; Intel; University of Illinois at Urbana-Champaign; Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU); ETH Zürich; King Abdullah University of Science and Technology (KAUST); IBM Almaden Research Center; Barcelona Supercomputing Center (BSC-CNS); Ludwig-Maximilians-Universität München (LMU); Max Planck Institute for Meteorology (MPI-M); KTH Royal Institute of Technology; Swiss National Supercomputing Centre (CSCS); Chalmers University of Technology; RIKEN; Imperial College London; NVIDIA; Argonne National Laboratory (ANL); Tokyo Institute of Technology (TITECH). Venue: PADAL Workshop 2014, April 28-29, CSCS, Lugano, Switzerland.
- Subjects: Computer Science [cs]/Hardware Architecture [cs.AR], Theoretical computer science, Computer science, Database-centric architecture, Runtime system, Locality, Software, Massively parallel, Data layout, Abstractions, Computer Science [cs]/Modeling and Simulation, Code refactoring, Programming paradigm, Affinity, Compiler, Computer Science [cs]/Operating Systems [cs.OS], Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC], Software engineering
- Abstract:
The goal of the workshop and this report is to identify common themes and standardize concepts for locality-preserving abstractions for exascale programming models. Current software tools are built on the premise that computing is the most expensive component; we are rapidly moving to an era in which computing is cheap and massively parallel while data movement dominates energy and performance costs. In order to respond to exascale systems (the next generation of high performance computing systems), the scientific computing community needs to refactor its applications to align with the emerging data-centric paradigm. Our applications must be evolved to express information about data locality. Unfortunately, current programming environments offer few ways to do so. They ignore the incurred cost of communication and simply rely on hardware cache coherency to virtualize data movement. With the increasing importance of task-level parallelism on future systems, task models have to support constructs that express data locality and affinity. At the system level, communication libraries implicitly assume that all processing elements are equidistant from each other. In order to take advantage of emerging technologies, application developers need a set of programming abstractions to describe data locality for the new computing ecosystem. The new programming paradigm should be more data-centric and allow developers to describe how to decompose and how to lay out data in memory. Fortunately, there are many emerging concepts, such as constructs for tiling, data layout, array views, task and thread affinity, and topology-aware communication libraries for managing data locality. There is an opportunity to identify commonalities in strategy that enable us to combine the best of these concepts to develop a comprehensive approach to expressing and managing data locality on exascale programming systems. These programming model abstractions can expose crucial information about data locality to the compiler and runtime system to enable performance-portable code. The research question is to identify the right level of abstraction, which includes techniques that range from template libraries all the way to completely new languages to achieve this goal.
- Published: 2014
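One of the locality constructs called out in the report summarized above is loop tiling over an explicitly blocked traversal of data. The toy Python sketch below contrasts a strided, column-by-column sweep of a row-major matrix with a tile-by-tile sweep of the same data; it is only meant to make the "tiling and data layout" idea concrete and is not drawn from the report. The matrix size and tile size are arbitrary assumptions.

```python
import numpy as np
import time

n, tile = 4096, 64
A = np.ones((n, n), dtype=np.float64)   # stored row-major (NumPy default)

def column_sweep(A):
    """Touches memory column by column: a strided, cache-unfriendly access pattern
    for row-major storage."""
    total = 0.0
    for j in range(A.shape[1]):
        total += A[:, j].sum()
    return total

def tiled_sweep(A, b=tile):
    """Visits the matrix tile by tile, so each contiguous block is consumed while
    it is still resident in cache."""
    total = 0.0
    for i0 in range(0, A.shape[0], b):
        for j0 in range(0, A.shape[1], b):
            total += A[i0:i0 + b, j0:j0 + b].sum()
    return total

for fn in (column_sweep, tiled_sweep):
    t0 = time.perf_counter()
    s = fn(A)
    print(fn.__name__, s, f"{time.perf_counter() - t0:.3f}s")
```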
31. Geometric discretization scheme applied to the Abelian Chern-Simons theory
- Authors: James C. Sexton, David H. Adams, Siddhartha Sen, and Samik Sen
- Affiliation (as listed): School of Physical and Mathematical Sciences
- Subjects: High Energy Physics - Theory, Pure mathematics, High Energy Physics - Theory (hep-th), Discretization, Differential form, Lattice gauge theory, Mathematical analysis, Physical and Mathematical Sciences, Chern–Simons theory, Torsion (algebra), FOS: Physical sciences, Abelian group, Mathematics
- Abstract:
We give a detailed general description of a recent geometrical discretisation scheme and illustrate, by explicit numerical calculation, the scheme's ability to capture topological features. The scheme is applied to the Abelian Chern-Simons theory and leads, after a necessary field doubling, to an expression for the discrete partition function in terms of untwisted Reidemeister torsion and of various triangulation dependent factors. The discrete partition function is evaluated computationally for various triangulations of $S^3$ and of lens spaces. The results confirm that the discretisation scheme is triangulation independent and coincides with the continuum partition function, Comment: 27 pages, 5 figures, 6 tables. in latex
- Published: 2000
- Full Text: View/download PDF
32. Approximate actions for lattice QCD simulation
- Authors:
Alan C. Irving and James C. Sexton
- Subjects: Quantum chromodynamics, Physics, Nuclear and High Energy Physics, Wilson loop, High Energy Physics - Lattice (hep-lat), Lattice field theory, FOS: Physical sciences, Observable, Lattice QCD, High Energy Physics - Lattice, Lattice gauge theory, Quantum mechanics, Lattice (order), Statistical physics, Lattice model (physics)
- Abstract:
We describe a systematic approach to generating approximate actions for the lattice simulation of QCD. Three different tuning conditions are defined to match approximate with true actions, and it is shown that these three conditions become equivalent when the approximate and true actions are sufficiently close. We present a detailed study of approximate actions in the lattice Schwinger model together with an exploratory study of full QCD at unphysical parameter values. We find that the technicalities of the approximate action approach work quite well. However, very delicate tuning is necessary to find an approximate action which gives good predictions for all physical observables. Our best view of the immediate applicability of the methods we describe is to allow high statistics studies of particular physical observables after a low statistics full fermion simulation has been used to prepare the stage., Comment: 34 pages, 5 postscript figures, uses revtex and psfig
- Published: 1997
- Full Text: View/download PDF
33. QCD on the BlueGene/L Supercomputer
- Authors:
Alan Gara, James C. Sexton, Dong Chen, Pavlos M. Vranas, and Gyan Bhanot
- Subjects: Quantum chromodynamics, Physics, Nuclear and High Energy Physics, Particle physics, High Energy Physics - Lattice, High Energy Physics - Lattice (hep-lat), FOS: Physical sciences, IBM, Supercomputer, Atomic and Molecular Physics, and Optics
- Abstract:
In June 2004 QCD was simulated for the first time at sustained speed exceeding 1 TeraFlops in the BlueGene/L supercomputer at the IBM T.J. Watson Research Lab. The implementation and performance of QCD in the BlueGene/L is presented., Talk presented at Lattice2004(machines), Fermilab, 21-26 June 2004, 3 pages, 5 figures
- Published: 2005
- Full Text: View/download PDF
34. Computers for lattice simulation
- Authors:
James C. Sexton
- Subjects: Nuclear and High Energy Physics, Computer science, Computer graphics (images), Lattice (order), Atomic and Molecular Physics, and Optics
- Abstract:
High performance computers are reviewed. Specially built dedicated machines from APE, Columbia, CP-PACS, and RTNN are described. We also consider recent developments in the commercial computer world, and compare machine performances and costs per sustained Gigaflop.
- Published: 1996
- Full Text: View/download PDF
35. Unquenching the Schwinger model
- Authors:
James C. Sexton and Alan C. Irving
- Subjects: Physics, Nuclear and High Energy Physics, High Energy Physics::Lattice, Quantum mechanics, Lattice (order), Bound state, Fermion, Atomic and Molecular Physics, and Optics
- Abstract:
We study the quenched and unquenched lattice Schwinger model with Wilson fermions. The lowest non-trivial order of the systematic expansion recently proposed by Sexton and Weingarten is shown to allow good estimates of long distance physics from quenched configurations. Results for the static potential and the lowest bound state mass are presented.
- Published: 1996
- Full Text: View/download PDF
36. Numerical Evidence for the Observation of a Scalar Glueball
- Authors:
D. Weingarten, A. Vaccarino, and James C. Sexton
- Subjects: High Energy Physics - Theory, Physics, Quantum chromodynamics, Particle physics, Meson, Glueball, High Energy Physics::Lattice, High Energy Physics - Lattice (hep-lat), Nuclear Theory, High Energy Physics::Phenomenology, Scalar (mathematics), Lattice field theory, FOS: Physical sciences, General Physics and Astronomy, Lattice QCD, Quarkonium, Pseudoscalar, High Energy Physics - Phenomenology, High Energy Physics - Lattice, High Energy Physics - Phenomenology (hep-ph), High Energy Physics - Theory (hep-th), High Energy Physics::Experiment
- Abstract:
We compute from lattice QCD in the valence (quenched) approximation the partial decay widths of the lightest scalar glueball to pairs of pseudoscalar quark-antiquark states. These predictions and values obtained earlier for the scalar glueball's mass are in good agreement with the observed properties of $f_J(1710)$ and inconsistent with all other observed meson resonances., Comment: 12 pages of Latex, 3 PostsScript figures as separate uufile
- Published: 1995
- Full Text: View/download PDF
37. HPCS 2012 keynotes: Tuesday keynote: Europe back in the HPC race: Building a European ecosystem to recover and maintain the capacity of designing and building large computers
- Authors:
Jesús Labarta, Felix Schürmann, James C. Sexton, and Jean Gonnord
- Subjects: Computer science, European research, Global vision, Field (computer science), World class, Engineering management, Software, Workforce, Operating system, European commission
- Abstract:
High-performance computing is unanimously recognized today as strategic for research, industry, society and defence. After a long absence, Europe set up in 2008, as part of FP7 (PCRD7), the PRACE project to give European research world-class computing access. CEA, which has always been a major actor in this field, has a broader vision, covering everything from the many uses of HPC to the development of the necessary technology. To support its strategy, CEA established a scientific computing complex, one of the largest in the world with two petaflop/s machines, set up a research platform on machine architectures, parallelism, and software environments shared with industrial and academic partners, and has laid out an ambitious roadmap to the exaflop/s. The most recent result of this strategy is the BULL TERA100 machine, the first petaflop/s computer ever designed and built in Europe. The TERA100 machine, followed by the Curie machine of the PRACE project, the HELIOS machine of the Fusion program, and a number of industrial successes, demonstrates Europe's capability to come back to this challenging market. Against this background, the European Commission called in 2012, as part of its HORIZON 2020 program, for an ambitious project: • Provide a world-class European HPC infrastructure, benefitting a broad range of academic and industry users, and especially SMEs, including a workforce well trained in HPC; • Ensure independent access to HPC technologies, systems and services for the EU. The talk will cover what has been done in Europe in this period and what is in preparation to answer the European Commission's roadmap.
- Published: 2012
- Full Text: View/download PDF
38. HPCS 2012 panels: Panel I: Energy efficient systems in next generation high performance data and compute centers
- Authors: Gorka Esnal Lopez, Jean-Marc Pierson, Jesus Carretero, Pascal Bouvry, Mads Nygård, Reinhard Schneider, James C. Sexton, Vicente Martin, Marco Passante, David R.C. Hill, Carsten Trinitis, Miguel A. Ordonez, Jesús Labarta, Giovanni Aloisio, Johnatan E. Pecero, Maria Mirto, Alexander Heinecke, Lamia Djoudi, and Laurent Lefèvre
- Affiliation (as listed): Laboratoire d'Informatique, de Modélisation et d'Optimisation des Systèmes (LIMOS), École Nationale Supérieure des Mines de Saint-Étienne / Université Clermont Auvergne / CNRS
- Subjects: Computer science, Distributed computing, Real-time computing, Energy consumption, Clean energy, Computer Science [cs]/Modeling and Simulation, Renewable energy, Software, Climate action, Server, Energy (signal processing), Efficient energy use
- Abstract:
As large-scale distributed systems gather and share more and more computing nodes and storage resources, their energy consumption is increasing exponentially. Next-generation computing and data centers might require tens of megawatts to be feasible. Thus, designing more efficient systems is a major challenge for computer engineers. This challenge is twofold: saving money and reducing environmental impact, for example by using renewable energy. The goal of this panel is to discuss current trends in energy use and energy costs of data centers and servers, metrics to benchmark the energy consumption of computing and data centers, and opportunities for reducing those costs through improved hardware and software.
- Published: 2012
- Full Text: View/download PDF
39. Meson decay constants from the valence approximation to lattice QCD
- Authors:
F. Butler, H. Chen, James C. Sexton, D. Weingarten, and A. Vaccarino
- Subjects: Quark, Physics, Nuclear and High Energy Physics, Particle physics, Valence (chemistry), Meson, High Energy Physics::Lattice, High Energy Physics - Lattice (hep-lat), Infinite volume, FOS: Physical sciences, Lattice QCD, High Energy Physics - Lattice, Lattice constant, High Energy Physics::Experiment
- Abstract:
We evaluate $f_{\pi}/ m_{\rho}$, $f_K/ m_{\rho}$, $1/f_{\rho}$, and $ m_{\phi}/(f_{\phi} m_{\rho})$, extrapolated to physical quark mass, zero lattice spacing and infinite volume, for lattice QCD with Wilson quarks in the valence (quenched) approximation. The predicted ratios differ from experiment by amounts ranging from 12\% to 17\% equivalent to between 0.9 and 2.8 times the corresponding statistical uncertainties., Comment: uufiles encoded copy of 40 page Latex article, including 14 figures in Postscript. The long version of hep-lat/9302012, IBM/HET 93-3
- Published: 1994
- Full Text: View/download PDF
40. Hamiltonian evolution for the hybrid Monte Carlo algorithm
- Authors:
James C. Sexton and D. Weingarten
- Subjects: Quantum chromodynamics, Physics, Nuclear and High Energy Physics, Quantum Monte Carlo, High Energy Physics::Lattice, Monte Carlo method, Lattice QCD, Atomic and Molecular Physics, and Optics, Hybrid Monte Carlo, Dynamic Monte Carlo method, Monte Carlo integration, Monte Carlo method in statistical physics, Asymptotic formula, Quasi-Monte Carlo method, Statistical physics, Vacuum polarization, Quantum field theory, Hamiltonian (quantum mechanics), Algorithm, Computer Science::Databases, Monte Carlo molecular modeling, Mathematics
- Abstract:
We discuss a class of reversible, discrete approximations to Hamilton's equations for use in the hybrid Monte Carlo algorithm and derive an asymptotic formula for the step-size-dependent errors arising from this family of approximations. For lattice QCD with Wilson fermions, we construct several different updates in which the effect of fermion vacuum polarization is given a longer time step than the gauge field's self-interaction. On a $4^4$ lattice, one of these algorithms with an optimal choice of step size is 30% to 40% faster than the standard leapfrog update with an optimal step size.
- Published: 1992
- Full Text: View/download PDF
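The abstract above concerns reversible integrators for hybrid Monte Carlo in which different pieces of the force receive different time steps (the scheme now commonly called the multiple-time-scale, or Sexton-Weingarten, integrator). Below is a minimal one-dimensional Python sketch of that idea: the potential is split into a part integrated with a coarse step and a part integrated with a finer nested step, wrapped in a Metropolis accept/reject. The toy potential, the splitting, and the step sizes are illustrative assumptions, not the lattice-QCD setting of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy split potential U(x) = U1(x) + U2(x); pretend dU1 is the "slow" force and
# dU2 the "fast" one that gets the finer inner steps.
U1  = lambda x: 0.5 * x**2
U2  = lambda x: 0.1 * x**4
dU1 = lambda x: x
dU2 = lambda x: 0.4 * x**3
U   = lambda x: U1(x) + U2(x)

def nested_leapfrog(x, p, dt, n_outer, n_inner):
    """Reversible two-scale integrator: half-kicks from dU1 wrap n_inner fine
    leapfrog steps that use only dU2."""
    for _ in range(n_outer):
        p -= 0.5 * dt * dU1(x)                 # outer half-kick (coarse force)
        h = dt / n_inner
        for _ in range(n_inner):               # inner leapfrog on the fast force
            p -= 0.5 * h * dU2(x)
            x += h * p
            p -= 0.5 * h * dU2(x)
        p -= 0.5 * dt * dU1(x)                 # closing outer half-kick
    return x, p

def hmc_step(x, dt=0.2, n_outer=10, n_inner=4):
    p = rng.standard_normal()                  # refresh momentum
    H_old = U(x) + 0.5 * p**2
    x_new, p_new = nested_leapfrog(x, p, dt, n_outer, n_inner)
    H_new = U(x_new) + 0.5 * p_new**2
    # Metropolis accept/reject on the change of the approximately conserved Hamiltonian.
    return x_new if rng.random() < np.exp(H_old - H_new) else x

x, samples = 0.0, []
for _ in range(5000):
    x = hmc_step(x)
    samples.append(x)
print("sample mean/variance:", np.mean(samples), np.var(samples))
```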
41. PHYSICS GOALS OF THE QCD TERAFLOP PROJECT
- Authors:
Claudio Rebbi, Pietro Rossi, Robert G. Edwards, Jean Potvin, Sergio Sanielevici, Sinya Aoki, Carleton DeTar, Bernd A. Berg, Anthony D. Kennedy, Michael C. Ogilvie, John W. Negele, Walter Wilcox, Greg Kilcup, Edward Shuryak, D.K. Sinclair, Herbert W. Hamber, Don Petcher, Gyan Bhanot, Urs M. Heller, Amarjit Soni, S. Gottlieb, Claude Bernard, Keh-Fei Liu, James C. Sexton, Richard C. Brower, Norman H. Christ, Robert Shrock, Robert D. Mawhinney, Junko Shigemitsu, John B. Kogut, Khalil M. Bitar, Frank R. Brown, Terrence Draper, Shigemi Ohta, I-Hsiu Lee, and Andreas S. Kronfeld
- Subjects: Quantum chromodynamics, Physics, Particle physics, Computational Theory and Mathematics, General Physics and Astronomy, Statistical and Nonlinear Physics, Mathematical Physics, Computer Science Applications
- Published: 1991
- Full Text: View/download PDF
42. Asynchronous task dispatch for high throughput computing for the eServer IBM Blue Gene® Supercomputer
- Authors:
G. G. Stewart, Alan J. King, A. Peters, T. Budnik, James C. Sexton, Michael B. Mundy, P. Michaud, and P. McCarthy
- Subjects: Task (computing), Petascale computing, Grid computing, Asynchronous communication, Computer science, Scalability, Operating system, High-throughput computing, Supercomputer, Throughput (business)
- Abstract:
High Throughput Computing (HTC) environments strive "to provide large amounts of processing capacity to customers over long periods of time by exploiting existing resources on the network" according to Basney and Livny [1]. A single Blue Gene/L rack can provide thousands of CPU resources into HTC environments. This paper discusses the implementation of an asynchronous task dispatch system that exploits a recently released feature of the Blue Gene/L control system - called HTC mode - and presents data on experimental runs consisting of the asynchronous submission of multiple batches of thousands of tasks for financial workloads. The methodology developed here demonstrates how systems with very large processor counts and light-weight kernels can be configured to deliver capacity computing at the individual processor level in future petascale computing systems.
- Published: 2008
- Full Text: View/download PDF
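The paper above is about dispatching large batches of short, independent tasks asynchronously onto many processors. As a rough, generic illustration of asynchronous batch dispatch (not the Blue Gene/L HTC-mode control system, whose interfaces are not shown here), the Python sketch below submits several batches of tasks to a pool of worker processes and harvests results in completion order. The task payloads and pool size are arbitrary assumptions.

```python
from concurrent.futures import ProcessPoolExecutor, as_completed
import math

def task(payload):
    """Stand-in for one small, independent unit of work (e.g. pricing one instrument)."""
    return sum(math.sqrt(i) for i in range(payload))

def dispatch_batches(batches, max_workers=8):
    """Submit every task in every batch up front, then collect results asynchronously
    as they complete rather than in submission order."""
    results = []
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(task, p): (b, i)
                   for b, batch in enumerate(batches)
                   for i, p in enumerate(batch)}
        for fut in as_completed(futures):
            batch_id, idx = futures[fut]
            results.append((batch_id, idx, fut.result()))
    return results

if __name__ == "__main__":
    # Three batches of ten tasks each, with varying amounts of work per task.
    batches = [[10_000 + 1_000 * i for i in range(10)] for _ in range(3)]
    out = dispatch_batches(batches)
    print(f"completed {len(out)} tasks")
```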
43. Next-Generation Performance Counters: Towards Monitoring Over Thousand Concurrent Events
- Authors:
James C. Sexton, Robert E. Walkup, Michael K. Gschwind, Valentina Salapura, A. Gara, and Karthik Ganesan
- Subjects: Event (computing), Computer science, Performance tuning, System monitoring, Supercomputer, Application software, Concurrency control, Embedded system, Static random-access memory, Protocol (object-oriented programming)
- Abstract:
We present a novel performance monitor architecture, implemented in the Blue Gene/P™ supercomputer. This performance monitor supports the tracking of a large number of concurrent events by using a hybrid counter architecture. The counters have their low-order data implemented in registers which are concurrently updated, while the high-order counter data is maintained in a dense SRAM array that is updated from the registers on a regular basis. The performance monitoring architecture includes support for per-event thresholding and fast event notification, using a two-phase interrupt-arming and triggering protocol. A first implementation provides 256 concurrent 64b counters, which offers up to a 64x increase in the number of counters compared to performance monitors typically found in microprocessors today, and thereby dramatically expands the capabilities of counter-based performance tuning.
- Published: 2008
- Full Text: View/download PDF
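The hybrid counter organization described above keeps a narrow, frequently updated low-order count per event and periodically spills it into a wide backing array, with a threshold that arms a notification. The Python sketch below models that behavior in software purely to make the scheme concrete; the register width, spill policy, and callback are illustrative assumptions, not the Blue Gene/P hardware design.

```python
class HybridCounter:
    """Software model of a hybrid event counter: a narrow fast register backed by a
    wide "SRAM" word, with an optional threshold notification."""

    LOW_BITS = 12                                  # width of the fast low-order register

    def __init__(self, threshold=None, notify=None):
        self.low = 0                               # narrow register, bumped on every event
        self.high = 0                              # wide backing storage, updated on spill
        self.threshold = threshold
        self.notify = notify
        self.armed = threshold is not None         # phase 1 of a two-phase arm/trigger scheme

    def increment(self, n=1):
        self.low += n
        if self.low >> self.LOW_BITS:              # register about to overflow: spill now
            self.spill()

    def spill(self):
        """Fold the low-order register into the wide count (models the periodic update)."""
        self.high += self.low
        self.low = 0
        if self.armed and self.high >= self.threshold:
            self.armed = False                     # phase 2: trigger once, then disarm
            if self.notify:
                self.notify(self.high)

    def value(self):
        return self.high + self.low

# Usage: count events and get a callback when the running total crosses a threshold.
c = HybridCounter(threshold=10_000, notify=lambda v: print("threshold crossed at", v))
for _ in range(15_000):
    c.increment()
c.spill()
print("final count:", c.value())
```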
44. Minimal Data Copy for Dense Linear Algebra Factorization
- Authors:
John A. Gunnels, James C. Sexton, and Fred G. Gustavson
- Subjects: Matrix (mathematics), Factorization, Fortran, Computer science, Linear algebra, Block matrix, Parallel computing, Data structure, Triangular array, Basic Linear Algebra Subprograms, Cholesky decomposition
- Abstract:
The full format data structures of Dense Linear Algebra hurt the performance of its factorization algorithms. Full format rectangular matrices are the input and output of the Level 3 BLAS. It follows that the LAPACK and Level 3 BLAS approach has a basic performance flaw. We describe a new result that shows that representing a matrix A as a collection of square blocks will reduce the amount of data reformatting required by dense linear algebra factorization algorithms from O(n^3) to O(n^2). On an IBM Power3 processor our implementation of Cholesky factorization achieves 92% of peak performance whereas conventional full format LAPACK DPOTRF achieves 77% of peak performance. All programming for our new data structures may be accomplished in standard Fortran, through the use of higher dimensional full format arrays. Thus, new compiler support may not be necessary. We also discuss the role of concatenating submatrices to facilitate hardware streaming. Finally, we discuss a new concept which we call the L1 / L0 cache interface.
- Published: 2007
- Full Text: View/download PDF
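The abstract above argues for storing a matrix as a collection of contiguous square blocks so that factorization kernels operate on cache-friendly tiles. The sketch below is a small NumPy illustration of a right-looking Cholesky over such a square-block layout; the block size, and the use of NumPy rather than Fortran with tuned BLAS, are assumptions made purely for illustration and this is not the authors' implementation.

```python
import numpy as np

def to_blocks(A, nb):
    """Store A as a dict of contiguous nb-by-nb square blocks keyed by block indices."""
    n = A.shape[0] // nb
    return {(i, j): np.ascontiguousarray(A[i*nb:(i+1)*nb, j*nb:(j+1)*nb])
            for i in range(n) for j in range(n)}, n

def blocked_cholesky(blocks, n):
    """Right-looking Cholesky acting only on the lower-triangular square blocks."""
    for k in range(n):
        blocks[k, k] = np.linalg.cholesky(blocks[k, k])           # factor diagonal block
        Lkk = blocks[k, k]
        for i in range(k + 1, n):
            # Panel update: A_ik <- A_ik * Lkk^{-T} (a triangular solve on each block)
            blocks[i, k] = np.linalg.solve(Lkk, blocks[i, k].T).T
        for i in range(k + 1, n):
            for j in range(k + 1, i + 1):
                # Trailing update: A_ij <- A_ij - L_ik * L_jk^T
                blocks[i, j] = blocks[i, j] - blocks[i, k] @ blocks[j, k].T
    return blocks

# Build a random SPD matrix, factor it block-wise, and check against NumPy's Cholesky.
rng = np.random.default_rng(0)
nb, n_blocks = 32, 4
n = nb * n_blocks
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)

blocks, nbk = to_blocks(A, nb)
blocks = blocked_cholesky(blocks, nbk)
L = np.zeros((n, n))
for i in range(nbk):
    for j in range(i + 1):
        L[i*nb:(i+1)*nb, j*nb:(j+1)*nb] = blocks[i, j]
print(np.allclose(np.tril(L), np.linalg.cholesky(A)))             # expect True
```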
45. A lattice calculation of the R-Torsion for U(1) Chern-Simons theory
- Authors:
James C. Sexton, David H. Adams, Samik Sen, and Siddhartha Sen
- Subjects: Lens (geometry), Nuclear and High Energy Physics, Lattice (module), Discretization, Mathematical analysis, Chern–Simons theory, Lens space, Torsion (algebra), Order (ring theory), U(1), Atomic and Molecular Physics, and Optics, Mathematics, Mathematical physics
- Abstract:
We describe a new geometrical discretisation method which exactly preserves the topological properties of U(1) Chern-Simons theories. A numerical calculation using this new method correctly evaluates the Reidemeister torsion for low-order lens spaces.
- Published: 1998
- Full Text: View/download PDF
46. Methods for Calibration of Prout-Tompkins Kinetics Parameters Using EZM Iteration and GLO
- Authors:
A K Burnham, B de Supinski, John A. Gunnels, James C. Sexton, and A P Wemhoff
- Subjects: Data point, Chemistry, Statistics, Calibration, Experimental data, Application procedure, Algorithm
- Abstract:
This document contains information regarding the standard procedures used to calibrate chemical kinetics parameters for the extended Prout-Tompkins model to match experimental data. Two methods for calibration are described: EZM calibration and GLO calibration. EZM calibration matches kinetics parameters to three data points, while GLO calibration slightly adjusts kinetic parameters to match multiple points. Information is provided regarding the theoretical approach and application procedure for both of these calibration algorithms. It is recommended that for the calibration process, the user begin with EZM calibration to provide a good estimate, and then fine-tune the parameters using GLO. Two examples have been provided to guide the reader through a general calibration process.
- Published: 2006
- Full Text: View/download PDF
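The report above describes a two-stage workflow: a quick estimate anchored to a few data points (EZM), followed by a local optimization that fine-tunes the parameters against the full data set (GLO). The Python sketch below reproduces only the shape of that workflow for a rate law of extended Prout-Tompkins form, dα/dt = k α^m (1-α)^n: a crude initial guess is refined by nonlinear least squares. The synthetic data, the specific rate-law form, and the use of scipy.optimize.curve_fit are assumptions for illustration; this is not the EZM or GLO code.

```python
import numpy as np
from scipy.optimize import curve_fit

def pt_rate(alpha, k, m, n):
    """Rate law of extended Prout-Tompkins form: d(alpha)/dt = k * alpha^m * (1-alpha)^n.
    Used here only as a stand-in model for the calibration workflow."""
    return k * alpha**m * (1.0 - alpha)**n

# Synthetic "experimental" rate measurements from known parameters plus noise.
rng = np.random.default_rng(0)
true_k, true_m, true_n = 2.0, 0.7, 1.3
alpha = np.linspace(0.05, 0.95, 60)
rate = pt_rate(alpha, true_k, true_m, true_n) * (1.0 + 0.02 * rng.standard_normal(alpha.size))

# Stage 1 (EZM-like): a crude estimate from a handful of points.
# Assume symmetric exponents to start and read k off the rate near alpha = 0.5,
# where alpha * (1 - alpha) peaks at 0.25.
k0 = 4.0 * rate[np.argmin(np.abs(alpha - 0.5))]
p0 = [k0, 1.0, 1.0]

# Stage 2 (GLO-like): nonlinear least squares fine-tunes all three parameters
# against every data point.
popt, _ = curve_fit(pt_rate, alpha, rate, p0=p0)
print("initial guess:", np.round(p0, 3))
print("refined fit:  ", np.round(popt, 3))
print("true values:  ", (true_k, true_m, true_n))
```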
47. Variations of a theme
- Authors: Siddhartha Sen, James C. Sexton, and Ivo Sachs
- Subjects: Statistical ensemble, Canonical ensemble, Microcanonical ensemble, Computer science, Entropy (statistical thermodynamics), Open statistical ensemble, Statistical mechanics, Statistical physics, Statistical weight, Quantum statistical mechanics
- Published: 2006
- Full Text: View/download PDF
48. Non-relativistic quantum field theory
- Authors: James C. Sexton, Siddhartha Sen, and Ivo Sachs
- Subjects: Physics, Open quantum system, Quantization (physics), Classical mechanics, Quantum gravity, Supersymmetric quantum mechanics, Quantum dissipation, Quantum statistical mechanics, Imaginary time, Relationship between string theory and quantum field theory
- Published: 2006
- Full Text: View/download PDF
49. The problem
- Authors: Siddhartha Sen, James C. Sexton, and Ivo Sachs
- Subjects: Physics, State variable, Internal energy, Entropy (statistical thermodynamics), Computational mechanics, Calculus, Second law of thermodynamics, Statistical mechanics, Statistical physics, First law of thermodynamics, Third law of thermodynamics
- Published: 2006
- Full Text: View/download PDF
50. Phase transitions and the renormalization group
- Authors: James C. Sexton, Siddhartha Sen, and Ivo Sachs
- Subjects: Renormalization, Density matrix renormalization group, Quantum mechanics, Functional renormalization group, Statistical mechanics, Renormalization group, Critical exponent, Critical dimension, Mathematical physics, Mathematics, Universality (dynamical systems)
- Published: 2006
- Full Text: View/download PDF