233 results for "Scott Klasky"
Search Results
2. zMesh: Theories and Methods to Exploring Application Characteristics to Improve Lossy Compression Ratio for Adaptive Mesh Refinement
- Author
Huizhang Luo, Junqi Wang, Qing Liu, Jieyang Chen, Scott Klasky, and Norbert Podhorszki
- Subjects
Computational Theory and Mathematics, Hardware and Architecture, Signal Processing
- Published
- 2022
- Full Text
- View/download PDF
3. MGARD+: Optimizing Multilevel Methods for Error-Bounded Scientific Data Reduction
- Author
Lipeng Wan, David Pugmire, Xin Liang, Matthew Wolf, Dingwen Tao, Jieyang Chen, James Kress, Scott Klasky, Qing Liu, Norbert Podhorszki, and Ben Whitney
- Subjects
FOS: Computer and information sciences, Computer science, Lossy compression, Theoretical Computer Science, Data modeling, Reduction (complexity), Computer Science - Distributed, Parallel, and Cluster Computing, Computational Theory and Mathematics, Computer engineering, Hardware and Architecture, Compression ratio, Decomposition (computer science), Distributed, Parallel, and Cluster Computing (cs.DC), Decomposition method (constraint satisfaction), Error detection and correction, Software, Data compression
- Abstract
Data management is becoming increasingly important in dealing with the large amounts of data produced by large-scale scientific simulations and instruments. Existing multilevel compression algorithms offer a promising way to manage scientific data at scale, but may suffer from relatively low performance and reduction quality. In this paper, we propose MGARD+, a multilevel data reduction and refactoring framework drawing on previous multilevel methods, to achieve high-performance data decomposition and high-quality error-bounded lossy compression. Our contributions are four-fold: 1) We propose a level-wise coefficient quantization method, which uses different error tolerances to quantize the multilevel coefficients. 2) We propose an adaptive decomposition method which treats the multilevel decomposition as a preconditioner and terminates the decomposition process at an appropriate level. 3) We leverage a set of algorithmic optimization strategies to significantly improve the performance of multilevel decomposition/recomposition. 4) We evaluate our proposed method using four real-world scientific datasets and compare with several state-of-the-art lossy compressors. Experiments demonstrate that our optimizations improve the decomposition/recomposition performance of the existing multilevel method by up to 70X, and the proposed compression method can improve compression ratio by up to 2X compared with other state-of-the-art error-bounded lossy compressors under the same level of data distortion.
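To make the level-wise quantization idea concrete, here is a minimal NumPy sketch, not MGARD+'s actual implementation: each level of a multilevel coefficient hierarchy gets its own uniform quantizer whose bin size is derived from a per-level error tolerance (the level sizes and tolerances below are invented for illustration).

```python
import numpy as np

def quantize_by_level(levels, tolerances):
    """Quantize each level of multilevel coefficients with its own tolerance."""
    codes, bins = [], []
    for coeffs, tol in zip(levels, tolerances):
        bin_size = 2.0 * tol  # round-to-nearest keeps |error| <= tol
        codes.append(np.round(coeffs / bin_size).astype(np.int64))
        bins.append(bin_size)
    return codes, bins

def dequantize(codes, bins):
    return [c * b for c, b in zip(codes, bins)]

# Toy hierarchy: the coarse level gets a tight tolerance, finer levels looser ones.
rng = np.random.default_rng(0)
levels = [rng.standard_normal(n) for n in (8, 64, 512)]
tolerances = [1e-4, 1e-3, 1e-2]
codes, bins = quantize_by_level(levels, tolerances)
for original, restored, tol in zip(levels, dequantize(codes, bins), tolerances):
    assert np.max(np.abs(original - restored)) <= tol
```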
- Published
- 2022
- Full Text
- View/download PDF
4. Identifying challenges and opportunities of in-memory computing on large HPC systems
- Author
Dan Huang, Zhenlu Qin, Qing Liu, Norbert Podhorszki, and Scott Klasky
- Subjects
Artificial Intelligence, Computer Networks and Communications, Hardware and Architecture, Software, Theoretical Computer Science
- 2022
- Full Text
- View/download PDF
5. An Algorithmic and Software Pipeline for Very Large Scale Scientific Data Compression with Error Guarantees
- Author
Tania Banerjee, Jong Choi, Jaemoon Lee, Qian Gong, Ruonan Wang, Scott Klasky, Anand Rangarajan, and Sanjay Ranka
- Published
- 2022
- Full Text
- View/download PDF
6. Hybrid Analysis of Fusion Data for Online Understanding of Complex Science on Extreme Scale Computers
- Author
Eric Suchyta, Jong Youl Choi, Seung-Hoe Ku, David Pugmire, Ana Gainaru, Kevin Huck, Ralph Kube, Aaron Scheinberg, Frederic Suter, Choongseock Chang, Todd Munson, Norbert Podhorszki, and Scott Klasky
- Published
- 2022
- Full Text
- View/download PDF
7. Online data analysis and reduction: An important Co-design motif for extreme-scale computers
- Author
Todd Munson, Ian Foster, Shinjae Yoo, Hubertus J. J. van Dam, Igor Yakushin, Zichao Di, Line Pouchard, Manish Parashar, Kerstin Kleese van Dam, Ali Murat Gok, Kevin Huck, Xin Liang, Ozan Tugluk, Lipeng Wan, Justin M. Wozniak, Wei Xu, Kshitij Mehta, Jong Choi, Matthew Wolf, Mark Ainsworth, Julie Bessac, Franck Cappello, Sheng Di, Tom Peterka, Hanqi Guo, Scott Klasky, Christopher Kelly, and Tong Shu
- Subjects
Co-design, Computer science, Computation, Supercomputer, Exascale computing, Theoretical Computer Science, Computational science, Reduction (complexity), Motif (narrative), Hardware and Architecture, Extreme scale, Software
- Abstract
A growing disparity between supercomputer computation speeds and I/O rates means that it is rapidly becoming infeasible to analyze supercomputer application output only after that output has been written to a file system. Instead, data-generating applications must run concurrently with data reduction and/or analysis operations, with which they exchange information via high-speed methods such as interprocess communications. The resulting parallel computing motif, online data analysis and reduction (ODAR), has important implications for both application and HPC systems design. Here we introduce the ODAR motif and its co-design concerns, describe a co-design process for identifying and addressing those concerns, present tools that assist in the co-design process, and present case studies to illustrate the use of the process and tools in practical settings.
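As a concrete, deliberately toy picture of the ODAR motif (not the paper's tooling), the sketch below runs a stand-in "simulation" concurrently with a reduction task, exchanging steps through memory rather than the file system; the process layout and sizes are invented for illustration.

```python
import numpy as np
from multiprocessing import Process, Queue

def simulation(out: Queue, steps: int = 10, n: int = 1_000_000) -> None:
    """Stand-in data producer: one array per simulation step."""
    rng = np.random.default_rng(42)
    for step in range(steps):
        out.put((step, rng.standard_normal(n)))
    out.put(None)  # end-of-stream marker

def analysis(inp: Queue) -> None:
    """Online reduction: consume each step while the producer keeps running."""
    while (item := inp.get()) is not None:
        step, field = item
        print(f"step {step}: mean={field.mean():+.4f} max={field.max():.4f}")

if __name__ == "__main__":
    q = Queue(maxsize=2)  # bounded queue applies back-pressure to the producer
    consumer = Process(target=analysis, args=(q,))
    consumer.start()
    simulation(q)
    consumer.join()
```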
- Published
- 2021
- Full Text
- View/download PDF
8. The Exascale Framework for High Fidelity coupled Simulations (EFFIS): Enabling whole device modeling in fusion science
- Author
Shuangxi Zhang, Berk Geveci, Matthew Wolf, Kevin Huck, E. Suchyta, Cameron W. Smith, Ruonan Wang, Stephane Ethier, Philip E. Davis, Manish Parashar, Pradeep Subedi, Gabriele Merlo, Abolaji Adesoji, Norbert Podhorszki, Qing Liu, Todd Munson, Shirley Moore, Mark S. Shephard, C.S. Chang, Jeremy Logan, Jong Choi, Lipeng Wan, Kai Germaschewski, David Pugmire, Ian Foster, Scott Klasky, Kshitij Mehta, Chris Harris, and Julien Dominski
- Subjects
Fusion, Computer science, Code coupling, Theoretical Computer Science, Computational science, High fidelity, Workflow, Hardware and Architecture, Software
- Abstract
We present the Exascale Framework for High Fidelity coupled Simulations (EFFIS), a workflow and code coupling framework developed as part of the Whole Device Modeling Application (WDMApp) in the Exascale Computing Project. EFFIS consists of a library, command line utilities, and a collection of run-time daemons. Together, these software products enable users to easily compose and execute workflows that include: strong or weak coupling, in situ (or offline) analysis/visualization/monitoring, command-and-control actions, remote dashboard integration, and more. We describe WDMApp physics coupling cases and computer science requirements that motivate the design of the EFFIS framework. Furthermore, we explain the essential enabling technology that EFFIS leverages: ADIOS for performant data movement, PerfStubs/TAU for performance monitoring, and an advanced COUPLER for transforming coupling data from its native format to the representation needed by another application. Finally, we demonstrate EFFIS using coupled multi-simulation WDMApp workflows and exemplify how the framework supports the project’s needs. We show that EFFIS and its associated services for data movement, visualization, and performance collection do not introduce appreciable overhead to the WDMApp workflow and that the resource-dominant application’s idle time while waiting for data is minimal.
- Published
- 2021
- Full Text
- View/download PDF
9. Exploring Large All-Flash Storage System with Scientific Simulation
- Author
Junmin Gu, Greg Eisenhauer, Scott Klasky, Norbert Podhorszki, Ruonan Wang, and Kesheng Wu
- Published
- 2022
- Full Text
- View/download PDF
10. Region-adaptive, Error-controlled Scientific Data Compression using Multilevel Decomposition
- Author
Qian Gong, Ben Whitney, Chengzhu Zhang, Xin Liang, Anand Rangarajan, Jieyang Chen, Lipeng Wan, Paul Ullrich, Qing Liu, Robert Jacob, Sanjay Ranka, and Scott Klasky
- Published
- 2022
- Full Text
- View/download PDF
11. Error-Bounded Learned Scientific Data Compression with Preservation of Derived Quantities
- Author
Jaemoon Lee, Qian Gong, Jong Choi, Tania Banerjee, Scott Klasky, Sanjay Ranka, and Anand Rangarajan
- Subjects
Fluid Flow and Transfer Processes, Process Chemistry and Technology, General Engineering, General Materials Science, data compression, autoencoders, error guarantees, moment preservation, constraint satisfaction, quantization, fusion application, Instrumentation, Computer Science Applications
- Abstract
Scientific applications continue to grow and produce extremely large amounts of data, which require efficient compression algorithms for long-term storage. Compression errors in scientific applications can have a deleterious impact on downstream processing. Thus, it is crucial to preserve all the “known” Quantities of Interest (QoI) during compression. To address this issue, most existing approaches guarantee the reconstruction error of the original data or primary data (PD), but cannot directly control the error in the QoI. In this work, we propose a physics-informed compression technique that is composed of two parts: (i) reduction of the PD with bounded errors and (ii) preservation of the QoI. In the first step, we combine tensor decompositions, autoencoders, product quantizers, and error-bounded lossy compressors to bound the reconstruction error at high levels of compression. In the second step, we use constraint-satisfaction post-processing followed by quantization to preserve the QoI. To illustrate the challenges of reducing the reconstruction errors of the PD and QoI, we focus on simulation data generated by a large-scale fusion code, XGC, which can produce tens of petabytes in a single day. The results show that our approach can achieve high compression ratios while accurately preserving the QoI within scientifically acceptable bounds.
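For the constraint-satisfaction step, one standard construction (illustrative here, not necessarily the paper's exact procedure) is the minimum-norm correction that makes a set of linear QoIs exact after decompression:

```python
import numpy as np

def preserve_linear_qoi(x_rec, A, q_true):
    """Minimum-norm correction so that the linear QoIs A @ x match q_true.

    x_rec  : lossy reconstruction, shape (n,)
    A      : QoI matrix, shape (m, n); row i defines QoI_i(x) = A[i] @ x
    q_true : QoI values computed from the original data, shape (m,)
    """
    residual = q_true - A @ x_rec
    # Closest point (in the 2-norm) to x_rec on the affine set {x : A x = q_true}.
    return x_rec + A.T @ np.linalg.solve(A @ A.T, residual)

rng = np.random.default_rng(1)
n = 1000
x = rng.standard_normal(n)                       # original data
x_rec = x + 1e-2 * rng.standard_normal(n)        # stand-in for a lossy reconstruction
A = np.vstack([np.ones(n) / n,                   # QoI 1: mean
               np.linspace(-1.0, 1.0, n) / n])   # QoI 2: a weighted first moment
x_fixed = preserve_linear_qoi(x_rec, A, A @ x)
assert np.allclose(A @ x_fixed, A @ x)           # QoIs now preserved exactly
```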
- Published
- 2022
- Full Text
- View/download PDF
12. P-ckpt: Coordinated Prioritized Checkpointing
- Author
Subhendu Behera, Lipeng Wan, Frank Mueller, Matthew Wolf, and Scott Klasky
- Published
- 2022
- Full Text
- View/download PDF
13. Improving I/O Performance for Exascale Applications Through Online Data Layout Reorganization
- Author
Ruonan Wang, Lipeng Wan, Jean-Luc Vay, Scott Klasky, Jieyang Chen, Ian Foster, Todd Munson, Dmitry Ganyushin, Axel Huebl, Ana Gainaru, Xin Liang, Kesheng Wu, Junmin Gu, Norbert Podhorszki, and Franz Poeschel
- Subjects
Large class, FOS: Computer and information sciences, Optimization, Distributed databases, Computer science, Layout, Fidelity, IO performance, data access optimization, Computer Software, Heuristic algorithms, Arrays, Auxiliary memory, File system, Data layout, Communications Technologies, WarpX, Distributed database, Dynamic data, Computational modeling, Parallel IO, Exascale computing, Computational Theory and Mathematics, Computer architecture, Computer Science - Distributed, Parallel, and Cluster Computing, Hardware and Architecture, Signal Processing, Performance evaluation, Distributed, Parallel, and Cluster Computing (cs.DC), Distributed Computing
- Abstract
The applications being developed within the U.S. Exascale Computing Project (ECP) to run on imminent Exascale computers will generate scientific results with unprecedented fidelity and record turn-around time. Many of these codes are based on particle-mesh methods and use advanced algorithms, especially dynamic load-balancing and mesh-refinement, to achieve high performance on Exascale machines. Yet, as such algorithms improve parallel application efficiency, they raise new challenges for I/O logic due to their irregular and dynamic data distributions. Thus, while the enormous data rates of Exascale simulations already challenge existing file system write strategies, the need for efficient read and processing of generated data introduces additional constraints on the data layout strategies that can be used when writing data to secondary storage. We review these I/O challenges and introduce two online data layout reorganization approaches for achieving good tradeoffs between read and write performance. We demonstrate the benefits of using these two approaches for the ECP particle-in-cell simulation WarpX, which serves as a motif for a large class of important Exascale applications. We show that by understanding application I/O patterns and carefully designing data layouts we can increase read performance by more than 80%.
- Published
- 2022
14. The Adaptable IO System (ADIOS)
- Author
David Pugmire, Norbert Podhorszki, Scott Klasky, Matthew Wolf, James Kress, Mark Kim, Nicholas Thompson, Jeremy Logan, Ruonan Wang, Kshitij Mehta, Eric Suchyta, William Godoy, Jong Choi, George Ostrouchov, Lipeng Wan, Jieyang Chen, Berk Geveci, Chuck Atkins, Caitlin Ross, Greg Eisenhauer, Junmin Gu, John Wu, Axel Huebl, and Seiji Tsutsumi
- Published
- 2022
- Full Text
- View/download PDF
15. The Need for Pervasive In Situ Analysis and Visualization (P-ISAV)
- Author
David Pugmire, Jian Huang, Kenneth Moreland, and Scott Klasky
- Published
- 2022
- Full Text
- View/download PDF
16. Maintaining Trust in Reduction: Preserving the Accuracy of Quantities of Interest for Lossy Compression
- Author
Qian Gong, Xin Liang, Ben Whitney, Jong Youl Choi, Jieyang Chen, Lipeng Wan, Stéphane Ethier, Seung-Hoe Ku, R. Michael Churchill, C. -S. Chang, Mark Ainsworth, Ozan Tugluk, Todd Munson, David Pugmire, Richard Archibald, and Scott Klasky
- Published
- 2022
- Full Text
- View/download PDF
17. Error-controlled, progressive, and adaptable retrieval of scientific data with multilevel decomposition
- Author
Lipeng Wan, Jieyang Chen, Qian Gong, Norbert Podhorszki, Scott Klasky, Qing Liu, Ben Whitney, Rick Archibald, David Pugmire, and Xin Liang
- Subjects
Computer science, Reading (computer), Code refactoring, Data retrieval, Computer data storage, Range (statistics), Overhead (computing), Data mining, Error detection and correction, Data compression
- Abstract
Extreme-scale simulations and high-resolution instruments have been generating an increasing amount of data, which poses significant challenges not only to data storage during the run, but also to post-processing, where data will be repeatedly retrieved and analyzed over a long period of time. The challenge of satisfying a wide range of post-hoc analysis needs while minimizing the I/O overhead caused by inappropriate and/or excessive data retrieval cannot be left unmanaged. In this paper, we propose a data refactoring, compressing, and retrieval framework capable of 1) fine-grained data refactoring with regard to precision; 2) incrementally retrieving and recomposing the data in terms of various error bounds; and 3) adaptively retrieving data in multi-precision and multi-resolution with respect to different analyses. With the progressive data re-composition and the adaptable retrieval algorithms, our framework significantly reduces the amount of data retrieved when multiple incremental precisions are requested and/or the downstream analysis time when coarse resolution is used. Experiments show that the amount of data retrieved under the same progressively requested error bound using our framework is 64% less than that using state-of-the-art single-error-bounded approaches. Parallel experiments with up to 1,024 cores and ~600 GB of data in total show that our approach yields 1.36× and 2.52× performance over existing approaches when writing to and reading from persistent storage systems, respectively.
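A minimal sketch of the progressive idea, using a Haar-style hierarchy as a stand-in for the paper's multilevel decomposition (the signal, level count, and error bound below are invented; a real system would consult stored per-level error metadata instead of the reference data):

```python
import numpy as np

def haar_decompose(data, num_levels):
    """Split data into a coarse approximation plus per-level detail coefficients."""
    approx, details = data.astype(float), []
    for _ in range(num_levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / 2.0)  # details[0] is the finest level
        approx = (even + odd) / 2.0
    return approx, details

def refine(recon, detail):
    """Invert one Haar level: recover the finer signal from averages + details."""
    out = np.empty(recon.size * 2)
    out[0::2] = recon + detail
    out[1::2] = recon - detail
    return out

def progressive_retrieve(approx, details, err_bound, reference):
    """Fetch detail levels coarse-to-fine until the requested error bound is met."""
    recon, levels_used = approx, 0
    for detail in reversed(details):  # coarsest detail level first
        # Emulate the stored error estimate by comparing against the reference.
        full = np.repeat(recon, reference.size // recon.size)
        if np.max(np.abs(reference - full)) <= err_bound:
            break
        recon = refine(recon, detail)
        levels_used += 1
    return recon, levels_used

rng = np.random.default_rng(2)
signal = np.cumsum(rng.standard_normal(1024))   # smooth-ish test signal
coarse, details = haar_decompose(signal, num_levels=5)
recon, used = progressive_retrieve(coarse, details, err_bound=1.0, reference=signal)
print(f"retrieved {used}/5 detail levels")
```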
- Published
- 2021
- Full Text
- View/download PDF
18. Unbalanced Parallel I/O: An Often-Neglected Side Effect of Lossy Scientific Data Compression
- Author
Xinying Wang, Lipeng Wan, Jieyang Chen, Qian Gong, Ben Whitney, Jinzhen Wang, Ana Gainaru, Qing Liu, Norbert Podhorszki, Dongfang Zhao, Feng Yan, and Scott Klasky
- Published
- 2021
- Full Text
- View/download PDF
19. A codesign framework for online data analysis and reduction
- Author
Keichi Takahashi, Igor Yakushin, Kevin Huck, Swati Singhal, Bryce Allen, Jeremy Logan, Todd Munson, Kshitij Mehta, Alan Sussman, E. Suchyta, Jong Youl Choi, Matthew Wolf, Ian Foster, and Scott Klasky
- Subjects
Reduction (complexity), Workflow, Computational Theory and Mathematics, Computer Networks and Communications, Computer science, Embedded system, Software, Computer Science Applications, Theoretical Computer Science
- 2021
- Full Text
- View/download PDF
20. Multilevel Techniques for Compression and Reduction of Scientific Data—The Unstructured Case
- Author
Mark Ainsworth, Ozan Tugluk, Ben Whitney, and Scott Klasky
- Subjects
Reduction (complexity), Computational Mathematics, Applied Mathematics, Compression (functional analysis), Polygon mesh, Unstructured data, Lossy compression, Mathematics, Computational science, Data reduction
- Abstract
Previous work on multilevel techniques for compression and reduction of scientific data is extended to the case of data given on unstructured meshes in two and three dimensions. The centerpiece of ...
- Published
- 2020
- Full Text
- View/download PDF
21. Estimating Lossy Compressibility of Scientific Data Using Deep Neural Networks
- Author
Jieyang Chen, Jinzhen Wang, Scott Klasky, Dave Pugmire, Qing Liu, Norbert Podhorszki, and Zhenlu Qin
- Subjects
Artificial neural network, Computer science, Deep learning, Sampling (statistics), Lossy compression, Supercomputer, Overhead (computing), Data mining, Artificial intelligence, Volume (compression), Data compression
- Abstract
Simulation-based scientific applications generate increasingly large amounts of data on high-performance computing (HPC) systems. To allow data to be stored and analyzed efficiently, data compression is often utilized to reduce the volume and velocity of data. However, a question often raised by domain scientists is what level of compression can be expected, so that they can make more informed decisions balancing accuracy and performance. In this letter, we propose a deep neural network based approach for estimating the compressibility of scientific data. To train the neural network, we build both general and compressor-specific features so that the characteristics of both the data and the lossy compressors are captured in training. Our approach is demonstrated to outperform a prior analytical model as well as a sampling-based approach in the case of biased estimation, i.e., for SZ. For unbiased estimation (i.e., ZFP), however, the sampling-based approach yields the best accuracy, despite the high overhead involved in sampling the target dataset.
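A toy version of the estimation pipeline, with invented features and placeholder labels (a real setup would label each training block with the ratio measured by actually running the target compressor offline):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def features(block):
    """Cheap general statistics standing in for the paper's feature sets."""
    d = np.diff(block)
    return [block.mean(), block.std(), np.abs(d).mean(), d.std(),
            np.percentile(np.abs(block), 99)]

rng = np.random.default_rng(3)
blocks = [np.cumsum(rng.standard_normal(4096)) * s
          for s in rng.uniform(0.1, 10.0, size=200)]
X = np.array([features(b) for b in blocks])
y = rng.uniform(2.0, 50.0, size=200)  # placeholder for measured compression ratios

model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
model.fit(X, y)
predicted_ratio = model.predict(X[:1])[0]  # estimate without running the compressor
```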
- Published
- 2020
- Full Text
- View/download PDF
22. Characterizing Output Bottlenecks of a Production Supercomputer
- Author
David A. Dillow, Scott Klasky, Bing Xie, Jong Youl Choi, Christopher Zimmer, Sarp Oral, Jay Lofstead, Norbert Podhorszki, and Jeffrey S. Chase
- Subjects
File system, Statistical benchmarking, Computer science, Benchmarking, Parallel computing, Supercomputer, Software, Titan (supercomputer), Hardware and Architecture, Lustre (file system), Data striping
- Abstract
This article studies the I/O write behaviors of the Titan supercomputer and its Lustre parallel file stores under production load. The results can inform the design, deployment, and configuration of file systems along with the design of I/O software in the application, operating system, and adaptive I/O libraries. We propose a statistical benchmarking methodology to measure write performance across I/O configurations, hardware settings, and system conditions. Moreover, we introduce two relative measures to quantify the write-performance behaviors of hardware components under production load. In addition to designing experiments and benchmarking on Titan, we verify the experimental results on one real application and one real application I/O kernel, XGC and HACC IO, respectively. These two are representative and widely used to address the typical I/O behaviors of applications. In summary, we find that Titan’s I/O system is variable across the machine at fine time scales. This variability has two major implications. First, stragglers lessen the benefit of coupled I/O parallelism (striping). Peak median output bandwidths are obtained with parallel writes to many independent files, with no striping or write sharing of files across clients (compute nodes). I/O parallelism is most effective when the application—or its I/O libraries—distributes the I/O load so that each target stores files for multiple clients and each client writes files on multiple targets in a balanced way with minimal contention. Second, our results suggest that the potential benefit of dynamic adaptation is limited. In particular, it is not fruitful to attempt to identify “good locations” in the machine or in the file system: component performance is driven by transient load conditions and past performance is not a useful predictor of future performance. For example, we do not observe diurnal load patterns that are predictable.
- Published
- 2019
- Full Text
- View/download PDF
23. Can I/O Variability Be Reduced on QoS-Less HPC Storage Systems?
- Author
Norbert Podhorszki, Jeremy Logan, George Ostrouchov, Qing Liu, Jong Choi, Xubin He, Matthew Wolf, Dan Huang, and Scott Klasky
- Subjects
Input/output, Computer science, Quality of service, Distributed computing, Bandwidth throttling, Theoretical Computer Science, Computational Theory and Mathematics, Hardware and Architecture, Computer data storage, Software
- Abstract
For a production high-performance computing (HPC) system, where storage devices are shared between multiple applications and managed in a best-effort manner, I/O contention is often a major problem. In this paper, we propose a balanced messaging-based re-routing scheme in conjunction with throttling at the middleware level. This work tackles two key challenges that have not been fully resolved in the past: whether I/O variability can be reduced on a QoS-less HPC storage system, and how to design a runtime scheduling system that can scale to a large number of cores. The proposed scheme uses a two-level messaging system to re-route I/O requests to a less congested storage location so that write performance is improved, while limiting the impact on reads by throttling re-routing. An analytical model is derived to guide the choice of the optimal throttling factor. We thoroughly analyze the virtual messaging layer overhead and explore whether in-transit buffering is effective in managing I/O variability. Contrary to intuition, in-transit buffering cannot completely solve the problem: it can reduce the absolute variability but not the relative variability. The proposed scheme is verified against a synthetic benchmark and is also used by production applications.
- Published
- 2019
- Full Text
- View/download PDF
24. Harnessing Data Movement in Virtual Clusters for In-Situ Execution
- Author
Dan Huang, Qing Liu, Jun Wang, Norbert Podhorszki, Jeremy Logan, Scott Klasky, and Jong Youl Choi
- Subjects
Computer science, Distributed computing, Big data, Network virtualization, Provisioning, Virtualization, Bottleneck, Data modeling, Computational Theory and Mathematics, Hardware and Architecture, Asynchronous communication, Analytics, Signal Processing, Virtual network
- Abstract
As a result of increasing data volume and velocity, Big Data science at exascale has shifted towards the in-situ paradigm, where large scale simulations run concurrently alongside data analytics. With in-situ, data generated from simulations can be processed while still in memory, thereby avoiding the slow storage bottleneck. However, running simulations and analytics together on shared resources will likely result in substantial contention if left unmanaged, as demonstrated in this work, leading to much reduced efficiency of simulations and analytics. Recently, virtualization technologies such as Linux containers have been widely applied to data centers and physical clusters to provide highly efficient and elastic resource provisioning for consolidated workloads including scientific simulations and data analytics. In this paper, we investigate how to facilitate network traffic manipulation and reduce mutual interference on the network for in-situ applications in virtual clusters. To dynamically allocate network bandwidth where it is needed, we adopt SARIMA-based techniques to analyze and predict MPI traffic issued from simulations. Although this can be an effective technique, naive use of network virtualization can lead to performance degradation for bursty asynchronous transmissions within an MPI job. We analyze and resolve this performance degradation in virtual clusters.
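The SARIMA step might look like the following sketch, using statsmodels' SARIMAX with an invented traffic trace and untuned orders:

```python
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Synthetic stand-in for observed per-interval MPI traffic (bytes), with a
# periodic burst every 12 intervals mimicking an iterative solver's cadence.
rng = np.random.default_rng(4)
t = np.arange(240)
traffic = 5e6 + 4e6 * (t % 12 == 0) + 2e5 * rng.standard_normal(t.size)

# Seasonal ARIMA with period 12; the orders here are illustrative, not tuned.
model = SARIMAX(traffic, order=(1, 0, 1), seasonal_order=(1, 0, 1, 12))
fit = model.fit(disp=False)

forecast = fit.forecast(steps=12)  # predicted traffic for the next 12 intervals
bursty = forecast > 7e6            # pre-allocate bandwidth where bursts loom
print(np.nonzero(bursty)[0])
```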
- Published
- 2019
- Full Text
- View/download PDF
25. Transitioning from file-based HPC workflows to streaming data pipelines with openPMD and ADIOS2
- Author
Franz Poeschel, Juncheng E, William F. Godoy, Norbert Podhorszki, Scott Klasky, Greg Eisenhauer, Philip E. Davis, Lipeng Wan, Ana Gainaru, Junmin Gu, Fabian Koller, René Widera, Michael Bussmann, and Axel Huebl
- Subjects
high performance computing, FOS: Computer and information sciences, openPMD, ADIOS, Computer Science - Distributed, Parallel, and Cluster Computing, big data, RDMA, streaming, Distributed, Parallel, and Cluster Computing (cs.DC)
- Abstract
This paper aims to create a transition path from file-based IO to streaming-based workflows for scientific applications in an HPC environment. By using the openPMD-api, traditional workflows limited by filesystem bottlenecks can be overcome and flexibly extended for in situ analysis. The openPMD-api is a library for the description of scientific data according to the Open Standard for Particle-Mesh Data (openPMD). Its approach towards recent challenges posed by hardware heterogeneity lies in the decoupling of data description in domain sciences, such as plasma physics simulations, from concrete implementations in hardware and IO. The streaming backend is provided by the ADIOS2 framework, developed at Oak Ridge National Laboratory. This paper surveys two openPMD-based loosely-coupled setups to demonstrate flexible applicability and to evaluate performance. In loose coupling, as opposed to tight coupling, two (or more) applications are executed separately, e.g. in individual MPI contexts, yet cooperate by exchanging data. This way, a streaming-based workflow allows for standalone codes instead of tightly-coupled plugins, using a unified streaming-aware API and leveraging high-speed communication infrastructure available in modern compute clusters for massive data exchange. We determine new challenges in resource allocation and in the need of strategies for a flexible data distribution, demonstrating their influence on efficiency and scaling on the Summit compute system. The presented setups show the potential for a more flexible use of compute resources brought by streaming IO as well as the ability to increase throughput by avoiding filesystem bottlenecks.
- Published
- 2021
26. zMesh: Exploring Application Characteristics to Improve Lossy Compression Ratio for Adaptive Mesh Refinement
- Author
Huizhang Luo, Junqi Wang, Norbert Podhorszki, Scott Klasky, Jieyang Chen, and Qing Liu
- Subjects
Tree (data structure), Tree structure, Redundancy (information theory), Adaptive mesh refinement, Computer science, Compression ratio, Overhead (computing), Coding and information theory, Lossy compression, Algorithm, Volume (compression)
- Abstract
Scientific simulations on high-performance computing systems produce vast amounts of data that need to be stored and analyzed efficiently. Lossy compression significantly reduces the data volume by trading accuracy for performance. Despite the recent success of lossy compressors such as ZFP and SZ, compression performance is still far from being able to keep up with the exponential growth of data. This paper aims to further take advantage of application characteristics, an area that is often under-explored, to improve the compression ratios of adaptive mesh refinement (AMR) - a widely used numerical solver that allows for an improved resolution in limited regions. We propose a level reordering technique, zMesh, to reduce the storage footprint of AMR applications. In particular, we group the data points that are mapped to the same or adjacent geometric coordinates such that the dataset is smoother and more compressible. Unlike prior work, where compression performance is affected by the overhead of metadata, this work re-generates the restore recipe using a chained tree structure, thus incurring no extra storage overhead for compressed data, which substantially improves the compression ratios. The results demonstrate that zMesh can improve the smoothness of data by 67.9% and 71.3% for Z-ordering and Hilbert, respectively. Overall, zMesh improves the compression ratios by up to 16.5% and 133.7% for ZFP and SZ, respectively. Although zMesh involves additional compute overhead for tree and restore recipe construction, we show that the cost can be amortized as the number of quantities to be compressed increases.
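The reordering intuition can be seen with a plain Z-order (Morton) sort; note that zMesh's actual scheme groups points across AMR levels via a chained tree, which this sketch does not attempt:

```python
import numpy as np

def morton2d(x, y, bits=16):
    """Interleave the bits of integer coordinates (Z-order / Morton code)."""
    code = 0
    for b in range(bits):
        code |= ((x >> b) & 1) << (2 * b) | ((y >> b) & 1) << (2 * b + 1)
    return code

# Toy point set: (i, j) cell coordinates carrying a smooth field value.
rng = np.random.default_rng(5)
ij = rng.integers(0, 256, size=(10_000, 2))
vals = np.sin(ij[:, 0] / 40.0) * np.cos(ij[:, 1] / 40.0)

order = np.argsort([morton2d(i, j) for i, j in ij])
reordered = vals[order]

# Smoothness proxy: total variation of the 1-D stream seen by the compressor.
tv_before = np.abs(np.diff(vals)).sum()
tv_after = np.abs(np.diff(reordered)).sum()
print(f"total variation: {tv_before:.1f} -> {tv_after:.1f}")
```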
- Published
- 2021
- Full Text
- View/download PDF
27. A Framework for International Collaboration on ITER Using Large-Scale Data Transfer to Enable Near-Real-Time Analysis
- Author
Seung-Hoe Ku, Hyeon K. Park, C.S. Chang, Jong Choi, Ralph Kube, Robert Hager, T. Carroll, S. Kampel, Ruonan Wang, M. J. Choi, K. Silber, Scott Klasky, Eli Dart, B. S. Cho, Randy Michael Churchill, Matthew Wolf, and J. S. Park
- Subjects
Nuclear and High Energy Physics, Energy, Test data generation, Computer science, Mechanical Engineering, Real-time computing, Large scale data, Particle and Plasma Physics, Nuclear Energy and Engineering, Networking and Information Technology R&D (NITRD), Transfer (computing), General Materials Science, Civil and Structural Engineering
- Abstract
The global nature of the ITER project along with its projected approximately petabyte-per-day data generation presents not only a unique challenge but also an opportunity for the fusion community to rethink, optimize, and enhance our scientific discovery process. Recognizing this, collaborative research with computational scientists was undertaken over the past several years to create a framework for large-scale data movement across wide-area networks to enable global near-real-time analysis of fusion data. This would broaden the available computational resources for analysis/simulation and increase the number of researchers actively participating in experiments. An official demonstration of this framework for fast, large data transfer and real-time analysis was carried out between the KSTAR tokamak in Daejeon, Korea, and Princeton Plasma Physics Laboratory (PPPL) in Princeton, New Jersey. Streaming large data transfer, with near-real-time movie creation and analysis of the KSTAR electron cyclotron emission imaging data, was performed using the Adaptable Input Output (I/O) System (ADIOS) framework, and comparisons were made at PPPL with simulation results from the XGC1 code. These demonstrations were made possible utilizing an optimized network configuration at PPPL, which achieved over 8.8 Gbps (88% utilization) in throughput tests from the National Fusion Research Institute to PPPL. This demonstration showed the feasibility for large-scale data analysis of KSTAR data and provides a nascent framework to enable use of globally distributed computational and personnel resources in pursuit of scientific knowledge from the ITER experiment.
- Published
- 2021
28. Fides: A General Purpose Data Model Library for Streaming Data
- Author
David Pugmire, Caitlin Ross, Nicholas Thompson, James Kress, Chuck Atkins, Scott Klasky, and Berk Geveci
- Published
- 2021
- Full Text
- View/download PDF
29. Near real-time analysis of big fusion data on HPC systems
- Author
Minjun Choi, C.S. Chang, Ruonan Wang, Jong Choi, Ralph Kube, Scott Klasky, and R. Michael Churchill
- Subjects
Computer science, Real-time computing, Big data, Cloud computing, Virtualization, Visualization, Workflow, Data visualization, Parallel processing (DSP implementation), Benchmark (computing)
- Abstract
We are developing the Delta framework, which aims to tackle big-data challenges specific to fusion energy sciences. Delta can be used to connect fusion experiments to remote supercomputers. Streaming measurements to distributed compute resources allows high-dimensional data analysis to be performed automatically on a cadence that keeps up with experimental schedules. Making data analysis results available before the next experiment allows scientists to make more informed decisions about the configuration of upcoming experiments. Here we describe how Delta uses database and virtualization facilities, as well as high-performance computing, at the National Energy Research Scientific Computing Center to offer vertically integrated near real-time data analysis and visualization. We also report on ongoing efforts to port the data analysis part of Delta to graphics processing units, which show a reduction of the analysis wall-time for a benchmark workflow by about 35% when compared to a serial implementation.
- Published
- 2020
- Full Text
- View/download PDF
30. Taming I/O Variation on QoS-Less HPC Storage: What Can Applications Do?
- Author
Zhenbo Qiao, Qing Liu, Jieyang Chen, Norbert Podhorszki, and Scott Klasky
- Subjects
Input/output, Computer science, Distributed computing, Quality of service, Supercomputer, Plot (graphics), Bottleneck, Modeling and simulation, Load management, Computer data storage
- Abstract
As high-performance computing (HPC) is scaled up to exascale to accommodate new modeling and simulation needs, I/O has remained a major bottleneck in end-to-end scientific processes. Nevertheless, prior work in this area mostly aimed to maximize average performance, and there has been a lack of studies and solutions that can manage I/O performance variation on HPC systems. This work aims to take advantage of storage characteristics and explore application-level solutions that are interference-aware. In particular, we monitor the performance of data analytics and estimate the state of shared storage resources using the discrete Fourier transform (DFT). If heavy I/O interference is predicted to occur at a given timestep, data analytics can dynamically adapt to the environment by lowering the accuracy and performing partial or no augmentation from the shared storage, dictated by an augmentation-bandwidth plot. We evaluate three data analytics workloads, XGC, GenASiS, and Jet, on Chameleon, and quantitatively demonstrate that both the average and the variation of I/O performance can be vastly improved using our dynamic augmentation, with the mean and variance improved by as much as 67% and 96%, respectively, while maintaining acceptable data analysis outcomes.
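A minimal sketch of the DFT-based estimation, with a synthetic bandwidth trace and illustrative thresholds:

```python
import numpy as np

def dominant_period(bandwidth, dt=1.0):
    """Find the strongest periodic component of an observed bandwidth trace."""
    spectrum = np.fft.rfft(bandwidth - bandwidth.mean())
    freqs = np.fft.rfftfreq(bandwidth.size, d=dt)
    k = np.argmax(np.abs(spectrum[1:])) + 1  # skip the DC bin
    return 1.0 / freqs[k]

# Synthetic trace: shared storage slows down every 30 s due to a noisy neighbor.
rng = np.random.default_rng(6)
t = np.arange(0, 600)
bw = 10.0 - 6.0 * (t % 30 < 5) + 0.5 * rng.standard_normal(t.size)

period = dominant_period(bw)
next_slowdown = period * (np.floor(t[-1] / period) + 1)
print(f"estimated interference period ~{period:.1f}s; next slowdown near t={next_slowdown:.0f}s")
# An interference-aware analysis would skip or shrink its augmentation reads
# in the window around next_slowdown.
```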
- Published
- 2020
- Full Text
- View/download PDF
31. A Comprehensive Study of In-Memory Computing on Large HPC Systems
- Author
Qing Liu, Dan Huang, Scott Klasky, Norbert Podhorszki, and Zhenlu Qin
- Subjects
Computer science, Usability, Supercomputer, Data science, Domain (software engineering), Dataspaces, Software portability, Workflow, In-Memory Processing, Scalability
- Abstract
With the increasing fidelity and resolution enabled by high-performance computing systems, simulation-based scientific discovery is able to model and understand microscopic physical phenomena at a level that was not possible in the past. A grand challenge that the HPC community faces is how to handle the large amounts of analysis data generated from simulations. In-memory computing, among others, is recognized as a viable path forward and has experienced tremendous success in the past decade. Nevertheless, there has been a lack of a complete study and understanding of in-memory computing as a whole on HPC systems. This paper presents a comprehensive study that goes well beyond the typical performance metrics. In particular, we assess in-memory computing with regard to its usability, portability, robustness, and internal design trade-offs, the key factors of interest to domain scientists. We use two realistic scientific workflows, LAMMPS and Laplace, to conduct comprehensive studies of state-of-the-art in-memory computing libraries, including DataSpaces, DIMES, Flexpath, and Decaf. We conduct cross-platform experiments at scale on two leading supercomputers, Titan at ORNL and Cori at NERSC, and summarize our key findings in this critical area.
- Published
- 2020
- Full Text
- View/download PDF
32. A terminology for in situ visualization and analysis systems
- Author
Aaron Knoll, Paul A. Navrátil, Steve Petruzza, Venkatram Vishwanath, Michel Rasquin, Silvio Rizzi, Jeremy S. Meredith, Thomas Fogal, Jay Lofstead, Bernd Hentschel, David Rogers, James Kress, Han-Wei Shen, Franz Sauer, Cyrus Harrison, Tom Peterka, David Pugmire, Sudhanshu Sane, Charles Hansen, Kenneth Moreland, Berk Geveci, Matthew Wolf, Kwan-Liu Ma, Janine C. Bennett, Rhonda Vickery, William F. Godoy, Sean B. Ziegeler, Ingo Wald, Eric Brugger, Christoph Garth, Steffen Frey, Joseph A. Insley, Jean M. Favre, Andrew Bauer, Soumya Dutta, Gunther H. Weber, Sean Ahern, Matthieu Dorier, Ruonan Wang, John Patchett, E. Wes Bethel, Chris R. Johnson, Valerio Pascucci, Patrick O'Leary, Preeti Malakar, Norbert Podhorszki, Hongfeng Yu, Brad Whitlock, Matthew Larsen, James Ahrens, Robert Sisneros, Joseph A. Cottam, Scott Klasky, Manish Parashar, Hank Childs, Peer-Timo Bremer, and Will Usher
- Subjects
Computer science, Scientific visualization, Umbrella term, In situ visualization, Data science, Theoretical Computer Science, Visualization, Term (time), Terminology, In situ processing, Hardware and Architecture, Integration Type, Distributed Computing, Software, Confusion
- Abstract
The term “in situ processing” has evolved over the last decade to mean both a specific strategy for visualizing and analyzing data and an umbrella term for a processing paradigm. The resulting confusion makes it difficult for visualization and analysis scientists to communicate with each other and with their stakeholders. To address this problem, a group of over 50 experts convened with the goal of standardizing terminology. This paper summarizes their findings and proposes a new terminology for describing in situ systems. An important finding from this group was that in situ systems are best described via multiple, distinct axes: integration type, proximity, access, division of execution, operation controls, and output type. This paper discusses these axes, evaluates existing systems within the axes, and explores how currently used terms relate to the axes.
- Published
- 2020
33. Processing Full-Scale Square Kilometre Array Data on the Summit Supercomputer
- Author
Chen Wu, Norbert Podhorszki, E. Suchyta, Rodrigo Tobar, Fred Dulwich, Baoqiang Lao, Andreas Wicenec, Tao An, Scott Klasky, Markus Dolensky, Valentine G. Anantharaj, and Ruonan Wang
- Subjects
Computer science, Pipeline (computing), Real-time computing, Supercomputer, Radio telescope, Pipeline transport, Telescope, Workflow, Reionization, Radio astronomy
- Abstract
This work presents a workflow for simulating and processing the full-scale low-frequency telescope data of the Square Kilometre Array (SKA) Phase 1. The SKA project will enter the construction phase soon, and once completed, it will be the world’s largest radio telescope and one of the world’s largest data generators. The authors used Summit to mimic an end-to-end SKA workflow, simulating a dataset of a typical 6 hour observation and then processing that dataset with an imaging pipeline. This workflow was deployed and run on 4,560 compute nodes, and used 27,360 GPUs to generate 2.6 PB of data. This was the first time that radio astronomical data were processed at this scale. Results show that the workflow has the capability to process one of the key SKA science cases, an Epoch of Reionization observation. This analysis also helps reveal critical design factors for the next-generation radio telescopes and the required dedicated processing facilities.
- Published
- 2020
- Full Text
- View/download PDF
34. Comparing Time-to-Solution for In Situ Visualization Paradigms at Scale
- Author
Matthew Wolf, Hank Childs, Jong Choi, Matthew Larsen, Norbert Podhorszki, Scott Klasky, Mark Kim, James Kress, and David Pugmire
- Subjects
Scale (ratio), Total cost, Computer science, Volume rendering, Data modeling, Data visualization, Concurrent computing, Node (circuits), Resource management (computing), Algorithm
- Abstract
This short paper considers time-to-solution for two in situ visualization paradigms: in-line and in-transit. It is a follow-on work to two previous studies. The first study [10] considered time-to-solution (wall clock time) and total cost (total node seconds incurred) for a single visualization algorithm (isosurfacing). The second study [11] considered only total cost and added a second algorithm (volume rendering). This short paper completes the evaluation, considering time-to-solution for both algorithms. In particular, it extends the first study by adding additional insights from including a second algorithm at larger scale and by doing more extended and formal analysis regarding time-to-solution. Further, it complements the second study as the best in situ configuration to choose can vary when considering time-to-solution over cost. It also makes use of the same data corpus used in the second study, although that data corpus has been refactored with time-to-solution in mind.
- Published
- 2020
- Full Text
- View/download PDF
35. Feature-preserving Lossy Compression for In Situ Data Analysis
- Author
Scott Klasky, Matthew Wolf, Kshitij Mehta, Igor Yakushin, Jieyang Chen, Ian Foster, and Todd Munson
- Subjects
Computer science, Middleware (distributed applications), Distributed computing, Bandwidth (computing), Lossy compression, Pipeline (software)
- Abstract
The traditional model of having simulations write data to disk for offline analysis can be prohibitively expensive on computers with limited storage capacity or I/O bandwidth. In situ data analysis has emerged as a necessary paradigm to address this issue and is expected to play an important role in exascale computing. We demonstrate the various aspects and challenges involved in setting up a comprehensive in situ data analysis pipeline that consists of a simulation coupled with compression and feature tracking routines, a framework for assessing compression quality, a middleware library for I/O and data management, and a workflow tool for composing and running the pipeline. We perform studies of compression mechanisms and parameters on two supercomputers, Summit at Oak Ridge National Laboratory and Theta at Argonne National Laboratory, for two example application pipelines. We show that the optimal choice of compression parameters varies with data, time, and analysis, and that periodic retuning of the in situ pipeline can improve compression quality. Finally, we discuss our perspective on the wider adoption of in situ data analysis and management practices and technologies in the HPC community.
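A compact illustration of the periodic retuning idea, using a toy error-bounded compressor (uniform quantization plus zlib) as a stand-in for the production compressors studied in the paper; tolerances and the PSNR target are invented:

```python
import zlib
import numpy as np

def lossy_compress(data, tol):
    """Toy error-bounded compressor: uniform quantization + zlib."""
    q = np.round(data / (2.0 * tol)).astype(np.int32)
    return zlib.compress(q.tobytes()), data.size

def lossy_decompress(blob, n, tol):
    q = np.frombuffer(zlib.decompress(blob), dtype=np.int32, count=n)
    return q * (2.0 * tol)

def retune(data, tolerances, min_psnr):
    """Pick the loosest tolerance whose reconstruction still meets a PSNR target."""
    for tol in sorted(tolerances, reverse=True):  # loosest first
        blob, n = lossy_compress(data, tol)
        recon = lossy_decompress(blob, n, tol)
        mse = np.mean((data - recon) ** 2)
        psnr = 10 * np.log10((data.max() - data.min()) ** 2 / mse)
        if psnr >= min_psnr:
            return tol, len(blob)
    return tol, len(blob)  # fall back to the tightest tolerance

rng = np.random.default_rng(7)
for step in range(0, 100, 20):  # periodic in situ retuning as the data evolves
    field = np.cumsum(rng.standard_normal(65536)) + step
    tol, size = retune(field, [1e-1, 1e-2, 1e-3], min_psnr=60.0)
    print(f"step {step}: tolerance {tol}, compressed {size} bytes")
```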
- Published
- 2020
- Full Text
- View/download PDF
36. ADIOS 2: The Adaptable Input Output System. A framework for high-performance data management
- Author
Mark Kim, Seiji Tsutsumi, George Ostrouchov, James Kress, Keichi Takahashi, Lipeng Wan, Kesheng Wu, Norbert Podhorszki, Kshitij Mehta, Kai Germaschewski, Franz Poeschel, Scott Klasky, Ruonan Wang, Chuck Atkins, Jong Choi, Matthew Wolf, Qing Liu, David Pugmire, Jeremy Logan, William F. Godoy, Philip E. Davis, Manish Parashar, Junmin Gu, Nicholas Thompson, E. Suchyta, Kevin Huck, Greg Eisenhauer, Axel Huebl, and Tahsin Kurc
- Subjects
Staging, Computer science, Fortran, Data management, Scalable I/O, Data science, Exascale computing, Lustre and GPFS file systems, MATLAB, Application programming interface, Programming language, In-situ, Python (programming language), Supercomputer, Computer Science Applications, Personal computer, RDMA, High-performance computing (HPC), Software
- Abstract
We present ADIOS 2, the latest version of the Adaptable Input Output (I/O) System. ADIOS 2 addresses scientific data management needs ranging from scalable I/O in supercomputers, to data analysis in personal computer and cloud systems. Version 2 introduces a unified application programming interface (API) that enables seamless data movement through files, wide-area-networks, and direct memory access, as well as high-level APIs for data analysis. The internal architecture provides a set of reusable and extendable components for managing data presentation and transport mechanisms for new applications. ADIOS 2 bindings are available in C++11, C, Fortran, Python, and Matlab and are currently used across different scientific communities. ADIOS 2 provides a communal framework to tackle data management challenges as we approach the exascale era of supercomputing.
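A minimal serial example in the spirit of the high-level API described above (shown with the `adios2.open` file API of the 2.x Python bindings; the exact calls vary across ADIOS 2 versions, so treat the signatures as version-dependent):

```python
import numpy as np
import adios2

nx = 100
temperature = np.linspace(0.0, 1.0, nx)

# Write one variable over two steps to a BP dataset.
with adios2.open("sim_output.bp", "w") as fw:
    for step in range(2):
        fw.write("temperature", temperature + step,
                 [nx], [0], [nx], end_step=True)  # global shape, start, count

# Read it back step by step.
with adios2.open("sim_output.bp", "r") as fr:
    for fstep in fr:
        print(fstep.current_step(), fstep.read("temperature")[:3])
```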
- Published
- 2020
37. Orchestrating Fault Prediction with Live Migration and Checkpointing
- Author
Lipeng Wan, Frank Mueller, Subhendu Behera, Matthew Wolf, and Scott Klasky
- Subjects
File system, Computer science, Distributed computing, Fault tolerance, Fault (power engineering), Supercomputer, Reduction (complexity), Overhead (business), Live migration
- Abstract
Checkpoint/Restart (C/R) is widely used to provide fault tolerance on High-Performance Computing (HPC) systems. However, Parallel File System (PFS) overhead and failure uncertainty cause significant application overhead. This paper develops an adaptive multi-level C/R model that incorporates a failure prediction and analysis model, which orchestrates failure prediction, checkpointing, checkpoint frequency, and proactive live migration along with the additional benefit of Burst Buffers (BB). It effectively reduces the overheads due to failures, checkpointing, and recovery. Simulation results for the Summit supercomputer yield a reduction of ~20%-86% in application overhead due to BBs, orchestrated failure prediction, and migration. We also observe a ~29% decrease in checkpoint writes to BBs, which can increase the longevity of the BB storage devices.
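For background on the checkpoint-frequency side of this orchestration, the classical Young/Daly estimate (a textbook result, not necessarily the model developed in this paper) balances checkpoint cost against expected rework:

```latex
\[
\tau_{\mathrm{opt}} \approx \sqrt{2\,\delta\,M},
\]
% \delta: time to write one checkpoint; M: mean time between failures.
```

A failure predictor that proactively migrates work effectively raises $M$ for the remaining unpredicted failures, lengthening $\tau_{\mathrm{opt}}$ and reducing checkpoint I/O.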
- Published
- 2020
- Full Text
- View/download PDF
38. Multilevel Techniques for Compression and Reduction of Scientific Data-Quantitative Control of Accuracy in Derived Quantities
- Author
Mark Ainsworth, Scott Klasky, Ozan Tugluk, and Ben Whitney
- Subjects
Pointwise, Applied Mathematics, Big data, Reduction (complexity), Computational Mathematics, Compression (functional analysis), Algorithm, Data compression, Data reduction, Mathematics
- Abstract
Although many compression algorithms are focused on preserving pointwise values of the data, application scientists are generally more concerned with derived quantities. Equally well, the user may ...
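The flavor of the result, in the simplest linear case (an illustrative textbook bound, not the paper's sharper estimates): if the derived quantity $Q$ is a bounded linear functional on the data space $V$, then

```latex
\[
|Q(u) - Q(\tilde u)| = |Q(u - \tilde u)| \le \|Q\|_{V^{*}}\,\|u - \tilde u\|_{V},
\]
```

so controlling the primary-data error $\|u - \tilde u\|_{V}$ during compression yields an explicit, quantitative bound on the error in the derived quantity.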
- Published
- 2019
- Full Text
- View/download PDF
39. Multilevel Techniques for Compression and Reduction of Scientific Data—The Multivariate Case
- Author
Mark Ainsworth, Scott Klasky, Ozan Tugluk, and Ben Whitney
- Subjects
Multivariate statistics, Applied Mathematics, Reduction (complexity), Computational Mathematics, Multigrid method, Tensor product, Compression (functional analysis), Algorithm, Data reduction, Mathematics, Data compression
- Abstract
We develop a technique for multigrid adaptive reduction of data (MGARD). Special attention is given to the case of tensor product grids, where our approach permits the use of nonuniformly spaced gr...
- Published
- 2019
- Full Text
- View/download PDF
40. Organizing Large Data Sets for Efficient Analyses on HPC Systems
- Author
Junmin Gu, Philip Davis, Greg Eisenhauer, William Godoy, Axel Huebl, Scott Klasky, Manish Parashar, Norbert Podhorszki, Franz Poeschel, Jean-Luc Vay, Lipeng Wan, Ruonan Wang, and Kesheng Wu
- Subjects
History, Computer Science Applications, Education
- Abstract
Upcoming exascale applications could introduce significant data management challenges due to their large sizes, dynamic work distribution, and involvement of accelerators such as graphics processing units (GPUs). In this work, we explore the performance of reading and writing operations involving one such scientific application on two different supercomputers. Our tests showed that the Adaptable Input and Output System, ADIOS, was able to achieve speeds over 1 TB/s, a significant fraction of the peak I/O performance on Summit. We also demonstrated that the querying functionality in ADIOS could effectively support common selective data analysis operations, such as conditional histograms. In tests, this query mechanism was able to reduce the execution time by a factor of five. More importantly, the ADIOS data management framework allows us to achieve these performance improvements with only a minimal amount of coding effort.
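The block min/max indexing behind such conditional queries can be sketched in a few lines (a simplified stand-in for the ADIOS query mechanism, with invented sizes and thresholds):

```python
import numpy as np

def build_index(data, block_size):
    """Per-block min/max index, the kind of metadata selective queries consult."""
    blocks = data.reshape(-1, block_size)
    return blocks.min(axis=1), blocks.max(axis=1)

def conditional_histogram(data, block_size, lo, hi, bins=20):
    """Histogram of values in [lo, hi], scanning only blocks that can match."""
    mins, maxs = build_index(data, block_size)
    edges = np.linspace(lo, hi, bins + 1)
    hist = np.zeros(bins, dtype=np.int64)
    touched = 0
    for b in np.nonzero((maxs >= lo) & (mins <= hi))[0]:  # skip non-matching blocks
        vals = data[b * block_size:(b + 1) * block_size]
        vals = vals[(vals >= lo) & (vals <= hi)]
        hist += np.histogram(vals, bins=edges)[0]
        touched += 1
    return hist, touched

rng = np.random.default_rng(8)
field = rng.normal(300.0, 10.0, size=1 << 20)  # e.g., a temperature field
hist, touched = conditional_histogram(field, 4096, 340.0, 360.0)
print(f"scanned {touched}/{field.size // 4096} blocks")
```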
- Published
- 2022
- Full Text
- View/download PDF
41. Multilevel techniques for compression and reduction of scientific data—the univariate case
- Author
Ozan Tugluk, Ben Whitney, Mark Ainsworth, and Scott Klasky
- Subjects
Flexibility (engineering), Scale (ratio), Computer science, General Engineering, Univariate, Theoretical Computer Science, Visualization, Reduction (complexity), Computational Theory and Mathematics, Modeling and Simulation, Compression (functional analysis), Range (statistics), Computer Vision and Pattern Recognition, Representation (mathematics), Algorithm, Software
- Abstract
We present a multilevel technique for the compression and reduction of univariate data and give an optimal complexity algorithm for its implementation. A hierarchical scheme offers the flexibility to produce multiple levels of partial decompression of the data so that each user can work with a reduced representation that requires minimal storage whilst achieving the required level of tolerance. The algorithm is applied to the case of turbulence modelling in which the datasets are traditionally not only extremely large but inherently non-smooth and, as such, rather resistant to compression. We decompress the data for a range of relative errors, carry out the usual analysis procedures for turbulent data, and compare the results of the analysis on the reduced datasets to the results that would be obtained on the full dataset. The results obtained demonstrate the promise of multilevel compression techniques for the reduction of data arising from large scale simulations of complex phenomena such as turbulence modelling.
- Published
- 2018
- Full Text
- View/download PDF
42. SIRIUS: Enabling Progressive Data Exploration for Extreme-Scale Scientific Data
- Author
Tao Lu, Scott Klasky, Norbert Podhorszki, Huizhang Luo, Jinzhen Wang, Zhenbo Qiao, and Qing Liu
- Subjects
Decimation, Computer science, Data management, Feature extraction, Supercomputer, Data modeling, Data model, Hardware and Architecture, Control and Systems Engineering, Computer data storage, Data analysis, Data mining, Information Systems
- Abstract
Scientific simulations on high performance computing (HPC) platforms generate large quantities of data. To bridge the widening gap between compute and I/O, and to enable data to be more efficiently stored and analyzed, simulation outputs need to be refactored, reduced, and appropriately mapped to storage tiers. However, a systematic solution to support these steps has been lacking in the current HPC software ecosystem. To that end, this paper develops SIRIUS, a progressive, JPEG-like data management scheme for storing and analyzing big scientific data. It co-designs data decimation, compression, and data storage, taking the hardware characteristics of each storage tier into consideration. With reasonably low overhead, our approach refactors simulation data, using either topological or uniform decimation, into a much smaller, reduced-accuracy base dataset and a series of deltas that are used to augment the accuracy if needed. The base dataset and deltas are compressed and written to multiple storage tiers. Data saved on different tiers can then be selectively retrieved to restore the level of accuracy that satisfies data analytics. Thus, SIRIUS provides a paradigm shift towards elastic data analytics and enables end users to make trade-offs between analysis speed and accuracy on the fly. This paper further develops algorithms to preserve statistics during data decimation, a common requirement for reducing data. We assess the impact of SIRIUS on unstructured triangular meshes, a pervasive data model used in scientific simulations. In particular, we evaluate two realistic use cases: blob detection in fusion and high-pressure area extraction in computational fluid dynamics.
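A two-tier caricature of the base-plus-delta refactoring (the precision split and tier assignment are invented; SIRIUS also supports topological and uniform decimation, which this sketch omits):

```python
import numpy as np

def refactor(data):
    """Split float64 data into a small float16 base plus a float32 delta."""
    base = data.astype(np.float16)                                # fast tier
    delta = (data - base.astype(np.float64)).astype(np.float32)  # accuracy top-up
    return base, delta

def restore(base, delta=None):
    """Restore at base accuracy, or augment with the delta when analysis needs it."""
    out = base.astype(np.float64)
    if delta is not None:
        out += delta
    return out

rng = np.random.default_rng(9)
field = rng.normal(0.0, 1.0, size=1 << 16)
base, delta = refactor(field)        # base -> burst buffer/SSD, delta -> disk/tape
coarse = restore(base)               # quick-look analysis
precise = restore(base, delta)       # full-accuracy analysis
print(np.max(np.abs(field - coarse)), np.max(np.abs(field - precise)))
```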
- Published
- 2018
- Full Text
- View/download PDF
43. Personalized Search Inspired Fast Interactive Estimation of Distribution Algorithm and Its Application
- Author
-
Dunwei Gong, Yang Chen, Yong Zhang, Jong Choi, Xiaoyan Sun, and Scott Klasky
- Subjects
0209 industrial biotechnology, Computer science, Probabilistic logic, Evolutionary algorithm, Statistical model, 02 engineering and technology, Bayesian inference, Machine learning, Theoretical Computer Science, Personalized search, 020901 industrial engineering & automation, Computational Theory and Mathematics, Estimation of distribution algorithm, 0202 electrical engineering, electronic engineering, information engineering, Domain knowledge, 020201 artificial intelligence & image processing, Artificial intelligence, Data mining, Software, Subspace topology - Abstract
Interactive evolutionary algorithms have been applied to personalized search, where reduced user fatigue and efficient search are pursued. Motivated by this, we present a fast interactive estimation of distribution algorithm (IEDA) that exploits the domain knowledge of personalized search. We first induce a Bayesian model describing the distribution of a new user's preferences over the decision variables from the social knowledge of personalized search. We then employ this model to enhance IEDA in two ways: 1) dramatically reducing the initially huge search space to a preferred subspace and 2) using it as the probabilistic model from which the estimation of distribution algorithm (EDA) generates individuals. The Bayesian model is updated as the EDA runs. To evaluate individuals effectively, we further present a method for quantitatively expressing the user's preference based on human-computer interactions, and we train a radial basis function neural network as a fitness surrogate. The proposed algorithm is applied to a laptop search, and its advantages in alleviating user fatigue and accelerating the search are empirically demonstrated.
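As a toy illustration of the preference-seeded EDA loop, the sketch below uses a univariate binary distribution model; the hidden preference vector and the simple agreement score are placeholders standing in for the paper's Bayesian prior and RBF-network fitness surrogate.

```python
import numpy as np

rng = np.random.default_rng(0)
n_attrs = 12                                # binary item attributes
hidden_pref = rng.random(n_attrs) < 0.5     # the user's true (unknown) taste

def surrogate_fitness(pop):
    """Placeholder for the trained RBF surrogate: fraction of
    attributes agreeing with the hidden preference."""
    return (pop == hidden_pref).mean(axis=1)

p = np.full(n_attrs, 0.5)  # prior probabilities (the paper induces these
                           # from social knowledge with a Bayesian model)
for _ in range(30):
    pop = rng.random((50, n_attrs)) < p     # sample individuals from model
    elite = pop[np.argsort(surrogate_fitness(pop))[-10:]]
    p = 0.7 * p + 0.3 * elite.mean(axis=0)  # update the distribution

print("learned preference:", (p > 0.5).astype(int))
```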
- Published
- 2017
- Full Text
- View/download PDF
44. Leading magnetic fusion energy science into the big-and-fast data lane
- Author
-
Minjun Choi, Jong Youl Choi, Ruonan Wang, Choong-Seock Chang, Ralph Kube, Scott Klasky, R Michael Churchill, and Jinseop Park
- Subjects
Nuclear physics, Magnetic fusion, Computer science, Energy (signal processing) - Published
- 2020
- Full Text
- View/download PDF
45. Machine Learning for the Complex, Multi-scale Datasets in Fusion Energy
- Author
-
Jong Choi, R. Michael Churchill, C. S. Chang, Scott Klasky, and Ralph Kube
- Subjects
Sequence, Deep learning, Scale (chemistry), Markov chain Monte Carlo, Fusion power, Machine learning, Acceleration, Range (mathematics), Key (cryptography), Artificial intelligence - Abstract
ML/AI techniques, particularly those based on deep learning, will increasingly be used to accelerate scientific discovery in fusion experiment and simulation. Fusion energy devices have many disparate diagnostic instruments, capturing a broad range of interacting physics phenomena over multiple time and spatial scales. Fusion experiments are also increasingly built to run longer pulses, with the eventual goal of running a reactor continuously. The confluence of these facts leads to large, complex datasets in which phenomena manifest over long sequences. A key challenge is enabling scientists and engineers to utilize these datasets, for example to automatically catalog events of interest, predict the onset of phenomena such as tokamak disruptions, and enable comparisons to models and simulations. Given the size, multiple modalities, and multi-scale nature of fusion data, deep learning models are attractive, but at these scales they require HPC resources. Many ML/AI techniques not yet fully exploited will demand even more HPC resources, such as self-supervised learning, which helps fusion scientists create models with less labelled data, and advanced sequence models, which use less GPU memory at the expense of increased compute. Deep learning will also enable faster, more in-depth analysis than previously available, such as extracting physics model parameters from data with conditional variational autoencoders instead of slower techniques such as Markov chain Monte Carlo (MCMC). Comparison to simulation will likewise be enhanced by directly accelerating simulation kernels with deep learning. These ML/AI techniques will give fusion scientists faster results, allowing more efficient machine use and faster scientific discovery.
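The MCMC baseline that the abstract contrasts with amortized deep-learning inference is easy to sketch. The toy decay model and step size below are assumptions; the point is the thousands of forward-model evaluations each posterior costs, which a trained network would avoid.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0, 5, 100)                   # toy 'diagnostic' time base
true_rate = 1.3                              # hypothetical physics parameter
data = np.exp(-true_rate * t) + 0.05 * rng.standard_normal(t.size)

def log_likelihood(rate):
    resid = data - np.exp(-rate * t)         # one forward-model evaluation
    return -0.5 * np.sum(resid ** 2) / 0.05 ** 2

# Metropolis-Hastings: accept uphill moves, sometimes accept downhill ones.
rate, ll, samples = 1.0, log_likelihood(1.0), []
for _ in range(20000):
    prop = rate + 0.05 * rng.standard_normal()
    ll_prop = log_likelihood(prop)
    if np.log(rng.random()) < ll_prop - ll:
        rate, ll = prop, ll_prop
    samples.append(rate)

print(f"posterior mean rate: {np.mean(samples[5000:]):.3f}")
```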
- Published
- 2020
- Full Text
- View/download PDF
46. Visualization as a Service for Scientific Data
- Author
-
Scott Klasky, Matthew Wolf, Berk Geveci, Lipeng Wan, Dmitry Ganyushin, Jong Choi, Jeremy Logan, E. Suchyta, Jieyang Chen, Kshitij Mehta, Norbert Podhorszki, Nicholas Thompson, Hank Childs, Steven Walton, Xin Liang, David Pugmire, James Kress, Caitlin Ross, Nicole Marsaglia, and Mark Kim
- Subjects
Flexibility (engineering), Service (systems architecture), Workflow, Process (engineering), Computer science, Interoperability, Scientific visualization, Use case, Software engineering, Visualization - Abstract
One of the primary challenges facing scientists is extracting understanding from the large amounts of data produced by simulations, experiments, and observational facilities. The use of data across its entire lifetime, from real-time to post-hoc analysis, is complex and varied, typically requiring a collaborative effort across multiple teams of scientists. Over time, three sets of tools have emerged: one for analysis, another for visualization, and a third for orchestrating the tasks. This trifurcated tool set often results in manually assembled analysis and visualization workflows, one-off solutions that are fragile and difficult to generalize. To address these challenges, we propose a service-based paradigm and a set of abstractions to guide its design. These abstractions allow for the creation of services that can access and interpret data, and they enable the interoperability needed for intelligent scheduling by workflow systems. This work results from a codesign process across analysis, visualization, and workflow tools to provide the flexibility required for production use. Finally, the paper describes a forward-looking research and development plan centered on the concept of visualization and analysis technology as reusable services, along with several real-world use cases that implement these concepts.
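One way to read the proposed abstractions is as a common service interface that a workflow system can schedule against. The sketch below is hypothetical, with invented names, and is not the paper's actual API:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field

@dataclass
class DataDescriptor:
    """Enough metadata for a service to interpret data it did not produce."""
    source: str            # e.g. a staging endpoint or file path
    mesh_type: str         # e.g. "uniform" or "unstructured"
    variables: list = field(default_factory=list)

class VisService(ABC):
    """A reusable visualization/analysis task a scheduler can compose."""
    @abstractmethod
    def accepts(self, desc: DataDescriptor) -> bool: ...
    @abstractmethod
    def run(self, desc: DataDescriptor) -> DataDescriptor: ...

class ContourService(VisService):
    def accepts(self, desc):
        return desc.mesh_type in ("uniform", "unstructured")
    def run(self, desc):
        # ...extract isosurfaces and publish a derived dataset...
        return DataDescriptor(desc.source + ".contours",
                              desc.mesh_type, desc.variables)

# A workflow engine can now chain any services whose descriptors match,
# rather than hand-wiring one-off analysis/visualization pipelines.
```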
- Published
- 2020
- Full Text
- View/download PDF
47. Spatial core-edge coupling of the particle-in-cell gyrokinetic codes GEM and XGC
- Author
-
Frank Jenko, Gabriele Merlo, Scott Klasky, Haotian Chen, Amitava Bhattacharjee, Junyi Cheng, Sarat Sreepathi, Stephane Ethier, Seung-Hoe Ku, Robert Hager, Choong-Seock Chang, E. Suchyta, Yang Chen, Scott Parker, Eduardo D'Azevedo, and Julien Dominski
- Subjects
Coupling, Physics, Physics::Instrumentation and Detectors, Interface (Java), Edge region, Graphics processing unit, Edge (geometry), Condensed Matter Physics, 01 natural sciences, 010305 fluids & plasmas, Computational science, Core (optical fiber), 0103 physical sciences, Polygon mesh, Particle-in-cell, 010306 general physics - Abstract
Two existing particle-in-cell gyrokinetic codes, GEM for the core region and XGC for the edge region, have been successfully coupled through a spatial coupling scheme at the interface in toroidal geometry. A mapping technique is developed for transferring data between GEM's structured and XGC's unstructured meshes. Two examples of coupled simulations demonstrate the coupling scheme. The optimization of GEM for graphics processing units (GPUs) is also presented.
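For the structured-to-unstructured direction, the transfer can be sketched as interpolation of a gridded field onto arbitrary mesh nodes. The sketch below assumes simple bilinear interpolation on an (R, Z) overlap region; the actual GEM-XGC mapping also handles the reverse direction and the gyrokinetic field geometry.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Structured core grid (GEM-like): a field phi on a regular (R, Z) mesh.
R = np.linspace(1.0, 2.0, 64)
Z = np.linspace(-0.5, 0.5, 64)
RR, ZZ = np.meshgrid(R, Z, indexing="ij")
phi = np.exp(-((RR - 1.5) ** 2 + ZZ ** 2) / 0.02)

# Unstructured edge nodes (XGC-like): arbitrary points in the overlap.
rng = np.random.default_rng(2)
nodes = np.column_stack([rng.uniform(1.0, 2.0, 1000),
                         rng.uniform(-0.5, 0.5, 1000)])

# Bilinear interpolation carries the field across the coupling interface.
interp = RegularGridInterpolator((R, Z), phi)
phi_on_nodes = interp(nodes)
```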
- Published
- 2020
48. Data Federation Challenges in Remote Near-Real-Time Fusion Experiment Data Processing
- Author
-
Ruonan Wang, Kshitij Mehta, Greg Eisenhauer, Jong Choi, Ralph Kube, Minjun Choi, Norbert Podhorszki, Jeremy Logan, Scott Klasky, C. S. Chang, Matthew Wolf, R. Michael Churchill, and Jinseop Park
- Subjects
Range (mathematics), Data processing, Workflow, Computer science, Data stream mining, Real-time computing, Volume (computing), Enhanced Data Rates for GSM Evolution, Fusion power, Variety (cybernetics) - Abstract
Fusion energy experiments and simulations provide critical information needed to plan future fusion reactors. As next-generation devices like ITER move toward long-pulse experiments, analyses, including AI and ML, must be performed under a wide range of time and computing constraints, from near-real-time and between-shot analysis to campaign-wide long-term analysis. However, the data volume, velocity, and variety make such analyses extremely challenging with only local computational resources. Researchers need the ability to compose and execute workflows spanning edge resources to large-scale high-performance computing facilities.
- Published
- 2020
- Full Text
- View/download PDF
49. Opportunities for Cost Savings with In-Transit Visualization
- Author
-
Hank Childs, Jong Choi, James Kress, Norbert Podhorszki, David Pugmire, Matthew Larsen, Matthew Wolf, Mark Kim, and Scott Klasky
- Subjects
Computer engineering, Computer science, 0202 electrical engineering, electronic engineering, information engineering, 020207 software engineering, 020201 artificial intelligence & image processing, 02 engineering and technology, Cost savings, Visualization - Abstract
We analyze the opportunities for in-transit visualization to provide cost savings compared with in-line visualization. We begin by developing a cost model that includes factors for both approaches, allowing direct comparison between the two methods. We then run a series of studies to create a corpus of data for the model, using two visualization algorithms, one computation-heavy and one communication-heavy, at concurrencies of up to 32,768 cores. Our primary results come from exploring the cost model in the context of this corpus. The findings show that in-transit visualization consistently achieves significant cost efficiencies by running visualization algorithms at lower concurrency, and that in many cases these efficiencies are enough to offset the other costs (transfer, blocking, and additional nodes) and be cost effective overall. Finally, this work informs future studies, which can focus on choosing ideal configurations for in-transit processing that consistently achieve cost efficiencies.
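A simplified version of such a cost model can be written down directly. The formula and numbers below are illustrative assumptions, though the terms mirror the costs the study names (transfer, blocking, and additional nodes):

```python
def node_seconds(n_sim, n_vis, t_sim, t_vis_inline, t_vis_transit,
                 t_transfer, t_block):
    """Total node-seconds per cycle for in-line vs. in-transit.
    In-line ties up all simulation nodes during visualization;
    in-transit pays transfer/blocking plus a small extra allocation."""
    inline = n_sim * (t_sim + t_vis_inline)
    transit = (n_sim * (t_sim + t_transfer + t_block)
               + n_vis * t_vis_transit)
    return inline, transit

# Hypothetical cycle: vis at 1/16 concurrency runs 8x longer per step.
inline, transit = node_seconds(n_sim=32768, n_vis=2048, t_sim=60.0,
                               t_vis_inline=5.0, t_vis_transit=40.0,
                               t_transfer=1.0, t_block=0.5)
print(f"in-line:    {inline:.3e} node-seconds")
print(f"in-transit: {transit:.3e} node-seconds")
```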
- Published
- 2020
- Full Text
- View/download PDF
50. Understanding Performance-Quality Trade-offs in Scientific Visualization Workflows with Lossy Compression
- Author
-
Ben Whitney, Jong Youl Choi, David Pugmire, Nicholas Thompson, Jeremy Logan, Scott Klasky, Kshitij Mehta, Jieyang Chen, Matthew Wolf, and Lipeng Wan
- Subjects
Workflow, Computer science, Trade-offs, Scientific visualization, Lossy compression, Data science, Storage efficiency, Visualization, Performance quality - Abstract
The cost of I/O is a significant challenge on current supercomputers, and the trend is likely to continue for the foreseeable future. This challenge is amplified in scientific visualization by the requirement to consume large amounts of data before processing can begin. Lossy compression has become an important technique for reducing the cost of I/O. In this paper we consider the implications of using compressed data for visualization within a scientific workflow. We apply visualization operations to simulation data that has been reduced using three different state-of-the-art compression techniques, study the storage efficiency and the preservation of visualization features in the resulting compressed data, and draw comparisons between the three techniques. Our contributions can help inform both scientists and researchers in the use and design of compression techniques that preserve important visualization details.
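The storage-versus-error measurement at the heart of such a study is simple to reproduce in miniature. The uniform quantizer plus zlib below is only a stand-in for the actual compressors evaluated, but it shows how compression ratio and maximum error move with the error bound:

```python
import zlib
import numpy as np

def compress_with_bound(field, abs_err):
    """Toy error-bounded compressor: quantize within the bound,
    then entropy-code the integer codes."""
    q = np.round(field / (2 * abs_err)).astype(np.int32)
    return zlib.compress(q.tobytes()), q.shape

def decompress(blob, shape, abs_err):
    q = np.frombuffer(zlib.decompress(blob), dtype=np.int32).reshape(shape)
    return q * (2 * abs_err)

field = np.sin(np.linspace(0, 20, 1 << 20))  # stand-in simulation field
for bound in (1e-2, 1e-4, 1e-6):
    blob, shape = compress_with_bound(field, bound)
    ratio = field.nbytes / len(blob)
    max_err = np.abs(decompress(blob, shape, bound) - field).max()
    print(f"bound={bound:.0e}  ratio={ratio:6.1f}x  max_err={max_err:.2e}")
```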
- Published
- 2019
- Full Text
- View/download PDF