Author: "Travis Johnston" - Searchworks@Jio Institute Digital Library Search Results

Your search for author "Travis Johnston" returned 56 results

1. High frequency accuracy and loss data of random neural networks trained on image datasets

Author: Ariel Keller Rorabaugh, Silvina Caíno-Lores, Travis Johnston, and Michela Taufer
Subjects: Loss curve, Accuracy curve, Classification, Performance prediction, Early stopping, Neural architecture search, Computer applications to medicine. Medical informatics, R858-859.7, Science (General), Q1-390
Abstract: Neural Networks (NNs) are increasingly used across scientific domains to extract knowledge from experimental or computational data. An NN is composed of natural or artificial neurons that serve as simple processing units and are interconnected into a model architecture; it acquires knowledge from the environment through a learning process and stores this knowledge in its connections. The learning process is conducted by training. During NN training, the learning process can be tracked by periodically validating the NN and calculating its fitness. The resulting sequence of fitness values (i.e., validation accuracy or validation loss) is called the NN learning curve. The development of tools for NN design requires knowledge of diverse NNs and their complete learning curves. Generally, only final fully-trained fitness values for highly accurate NNs are made available to the community, hampering efforts to develop tools for NN design and leaving unaddressed aspects such as explaining the generation of an NN and reproducing its learning process. Our dataset fills this gap by fully recording the structure, metadata, and complete learning curves for a wide variety of random NNs throughout their training. Our dataset captures the lifespan of 6000 NNs throughout generation, training, and validation stages. It consists of a suite of 6000 tables, each table representing the lifespan of one NN. We generate each NN with randomized parameter values and train it for 40 epochs on one of three diverse image datasets (i.e., CIFAR-100, FashionMNIST, SVHN). We calculate and record each NN’s fitness with high frequency (every half epoch) to capture the evolution of the training and validation process.
As a result, for each NN, we record the generated parameter values describing the structure of that NN, the image dataset on which the NN trained, and all loss and accuracy values for the NN every half epoch. We put our dataset at the service of researchers studying NN performance and its evolution throughout training and validation. Statistical methods can be applied to our dataset to analyze the shape of learning curves in diverse NNs, and the relationship between an NN’s structure and its fitness. Additionally, the structural data and metadata that we record enable the reconstruction and reproducibility of the associated NN.
Published: 2022
Full Text: View/download PDF

2. Accurate and Accelerated Neuromorphic Network Design Leveraging A Bayesian Hyperparameter Pareto Optimization Approach.

Author: Maryam Parsa, Catherine D. Schuman, Nitin Rathi, Amirkoushyar Ziabari, Derek C. Rose, J. Parker Mitchell, J. Travis Johnston, Bill Kay, Steven R. Young, and Kaushik Roy
Published: 2021
Full Text: View/download PDF

3. Low Size, Weight, and Power Neuromorphic Computing to Improve Combustion Engine Efficiency.

Author: Catherine D. Schuman, Steven R. Young, J. Parker Mitchell, J. Travis Johnston, Derek C. Rose, Bryan P. Maldonado, and Brian C. Kaul
Published: 2020
Full Text: View/download PDF

4. Structure Prediction from Neutron Scattering Profiles: A Data Sciences Approach.

Author: Cristina Garcia-Cardona, Ramakrishnan Kannan, Travis Johnston, Thomas Proffen, and Sudip K. Seal
Published: 2020
Full Text: View/download PDF

5. Resilience and Robustness of Spiking Neural Networks for Neuromorphic Systems.

Author: Catherine D. Schuman, J. Parker Mitchell, J. Travis Johnston, Maryam Parsa, Bill Kay, Prasanna Date, and Robert M. Patton
Published: 2020
Full Text: View/download PDF

6. Exascale Deep Learning to Accelerate Cancer Research.

Author: Robert M. Patton, Shahira Abousamra, Dimitris Samaras, Joel H. Saltz, J. Travis Johnston, Steven R. Young, Catherine D. Schuman, Thomas E. Potok, Derek C. Rose, Seung-Hwan Lim, Junghoon Chae, and Le Hou
Published: 2019
Full Text: View/download PDF

7. Evolving Energy Efficient Convolutional Neural Networks.

Author: Steven R. Young, Pravallika Devineni, Maryam Parsa, J. Travis Johnston, Bill Kay, Robert M. Patton, Catherine D. Schuman, Derek C. Rose, and Thomas E. Potok
Published: 2019
Full Text: View/download PDF

8. Visualization System for Evolutionary Neural Networks for Deep Learning.

Author: Junghoon Chae, Catherine D. Schuman, Steven R. Young, J. Travis Johnston, Derek C. Rose, Robert M. Patton, and Thomas E. Potok
Published: 2019
Full Text: View/download PDF

9. Learning to Predict Material Structure from Neutron Scattering Data.

Author: Cristina Garcia-Cardona, Ramakrishnan Kannan, Travis Johnston, Thomas Proffen, Katharine Page, and Sudip K. Seal
Published: 2019
Full Text: View/download PDF

10. A Novel Pruning Method for Convolutional Neural Networks Based off Identifying Critical Filters.

Author: Mihaela Dimovska and Travis Johnston
Published: 2019
Full Text: View/download PDF

11. Multi-Objective Optimization for Size and Resilience of Spiking Neural Networks.

Author: Mihaela Dimovska, Travis Johnston, Catherine D. Schuman, J. Parker Mitchell, and Thomas E. Potok
Published: 2019
Full Text: View/download PDF

12. 167-PFlops deep learning for electron microscopy: from learning physics to atomic manipulation.

Author: Robert M. Patton, J. Travis Johnston, Steven R. Young, Catherine D. Schuman, Don D. March, Thomas E. Potok, Derek C. Rose, Seung-Hwan Lim, Thomas P. Karnowski, Maxim A. Ziatdinov, and Sergei V. Kalinin
Published: 2018

13. Evolving Deep Networks Using HPC.

Author: Steven R. Young, Derek C. Rose, J. Travis Johnston, William T. Heller, Thomas P. Karnowski, Thomas E. Potok, Robert M. Patton, Gabriel N. Perdue, and Jonathan A. Miller
Published: 2017
Full Text: View/download PDF

14. Optimizing Convolutional Neural Networks for Cloud Detection.

Author: J. Travis Johnston, Steven R. Young, David Hughes, Robert M. Patton, and Devin White
Published: 2017
Full Text: View/download PDF

15. HYPPO: A Hybrid, Piecewise Polynomial Modeling Technique for Non-Smooth Surfaces.

Author: Travis Johnston, Connor Zanin, and Michela Taufer
Published: 2016
Full Text: View/download PDF

16. Development of a Scalable Method for Creating Food Groups Using the NHANES Dataset and MapReduce.

Author: Michael R. Wyatt II, Travis Johnston, Mia Papas, and Michela Taufer
Published: 2016
Full Text: View/download PDF

17. On the Need for Reproducible Numerical Accuracy through Intelligent Runtime Selection of Reduction Algorithms at the Extreme Scale.

Author: Dylan Chapp, Travis Johnston, and Michela Taufer
Published: 2015
Full Text: View/download PDF

18. Performance Tuning of MapReduce Jobs Using Surrogate-based Modeling.

Author: Travis Johnston, Mohammad Alsulmi, Pietro Cicotti, and Michela Taufer
Published: 2015
Full Text: View/download PDF

19. Co-design Center for Exascale Machine Learning Technologies (ExaLearn)

Author: Shinjae Yoo, Logan Ward, Nikoli Dryden, Ramakrishnan Kannan, Rajeev Thakur, Bert Debusschere, Ganesh Sivaraman, Sutanay Choudhury, Zhengchun Liu, Neeraj Kumar, Peter Nugent, Francis J. Alexander, Sudip K. Seal, Shantenu Jha, James A. Ang, David Pugmire, Li Tan, Ian Foster, Yunzhi Huang, Paul M. Welch, Cristina Garcia Cardona, Sivasankaran Rajamanickam, Thomas Proffen, Ai Kagawa, Malachi Schram, Byung-Jun Yoon, Jamaludin Mohd-Yusof, Erin McCarthy, Tiernan Casey, Sotiris S. Xantheas, Vinay Ramakrishniah, Jan Balewski, Sayan Ghosh, Brian Van Essen, Michael M. Wolf, Christine Sweeney, J. Austin Ellis, Peter Harrington, Jong Choi, Yosuke Oyama, Naoya Maruyama, Satoshi Matsuoka, Jenna A. Bilbrey, Kevin G. Yager, Anthony M. DeGennaro, Travis Johnston, and Ryan Chard
Subjects: Co-design, ComputerSystemsOrganization_COMPUTERSYSTEMIMPLEMENTATION, Active learning (machine learning), Statistical learning, Computer science, business.industry, Machine learning, computer.software_genre, Exascale computing, Theoretical Computer Science, Hardware and Architecture, Reinforcement learning, Center (algebra and category theory), Artificial intelligence, business, computer, Software
Abstract: Rapid growth in data, computational methods, and computing power is driving a remarkable revolution in what variously is termed machine learning (ML), statistical learning, computational learning, and artificial intelligence. In addition to highly visible successes in machine-based natural language translation, playing the game Go, and self-driving cars, these new technologies also have profound implications for computational and experimental science and engineering, as well as for the exascale computing systems that the Department of Energy (DOE) is developing to support those disciplines. Not only do these learning technologies open up exciting opportunities for scientific discovery on exascale systems, they also appear poised to have important implications for the design and use of exascale computers themselves, including high-performance computing (HPC) for ML and ML for HPC. The overarching goal of the ExaLearn co-design project is to provide exascale ML software for use by Exascale Computing Project (ECP) applications, other ECP co-design centers, and DOE experimental facilities and leadership class computing facilities.
Published: 2021

20. Structure Prediction from Scattering Profiles: A Neutron-Scattering Use-Case

Author: Cristina Garcia-Cardona, Ramakrishnan Kannan, Travis Johnston, Thomas Proffen, and Sudip K. Seal
Published: 2022

21. Classifying and analyzing small-angle scattering data using weighted k nearest neighbors machine learning techniques

Author: William T. Heller, Erika Yang, Steven R. Young, Travis Johnston, Rick Archibald, and Mathieu Doucet
Subjects: business.industry, Computer science, Data classification, 010403 inorganic & nuclear chemistry, Machine learning, computer.software_genre, 01 natural sciences, General Biochemistry, Genetics and Molecular Biology, 0104 chemical sciences, k-nearest neighbors algorithm, 010104 statistics & probability, symbols.namesake, Surrogate model, Stochastic gradient descent, symbols, Leverage (statistics), Artificial intelligence, 0101 mathematics, Small-angle scattering, business, Gaussian process, computer, Test data
Abstract: A consistent challenge for both new and expert practitioners of small-angle scattering (SAS) lies in determining how to analyze the data, given the limited information content of said data and the large number of models that can be employed. Machine learning (ML) methods are powerful tools for classifying data that have found diverse applications in many fields of science. Here, ML methods are applied to the problem of classifying SAS data for the most appropriate model to use for data analysis. The approach employed is built around the method of weighted k nearest neighbors (wKNN), and utilizes a subset of the models implemented in the SasView package (https://www.sasview.org/) for generating a well defined set of training and testing data. The prediction rate of the wKNN method implemented here using a subset of SasView models is reasonably good for many of the models, but has difficulty with others, notably those based on spherical structures. A novel expansion of the wKNN method was also developed, which uses Gaussian processes to produce local surrogate models for the classification, and this significantly improves the classification accuracy. Further, by integrating a stochastic gradient descent method during post-processing, it is possible to leverage the local surrogate model both to classify the SAS data with high accuracy and to predict the structural parameters that best describe the data. The linking of data classification and model fitting has the potential to facilitate the translation of measured data into results for both novice and expert practitioners of SAS.
Published: 2020
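
Note: the weighted k nearest neighbors (wKNN) idea named in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the inverse-distance weighting, the function name `wknn_predict`, the two-dimensional feature vectors, and the labels `model_a` and `model_b` are all hypothetical, and the Gaussian-process extension is not shown.

```python
import math
from collections import defaultdict


def wknn_predict(train, query, k=3, eps=1e-12):
    """Classify `query` by an inverse-distance-weighted vote of its k nearest neighbors.

    train: list of (feature_vector, label) pairs.
    The 1/distance weighting is an assumption made for this sketch.
    """
    # Euclidean distance from the query to every training point, nearest first.
    neighbors = sorted((math.dist(x, query), label) for x, label in train)
    votes = defaultdict(float)
    for distance, label in neighbors[:k]:
        # Closer neighbors contribute more; eps avoids division by zero.
        votes[label] += 1.0 / (distance + eps)
    return max(votes, key=votes.get)


# Hypothetical training set: one "model_a" example and two "model_b" examples.
train = [
    ((0.0, 0.0), "model_a"),
    ((1.0, 0.0), "model_b"),
    ((1.0, 0.1), "model_b"),
]

# The query sits right next to the single model_a example. An unweighted
# 3-neighbor majority vote would pick model_b; the weighted vote picks model_a.
print(wknn_predict(train, (0.1, 0.0), k=3))  # prints "model_a"
```

The example is chosen so that weighting changes the outcome, which is the property that distinguishes wKNN from a plain majority vote.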

22. True Blues: The Contentious Transformation of the Democratic Party. By Adam Hilton. Philadelphia: University of Pennsylvania Press, 2021. 280p. $55.00 cloth

Author: Travis Johnston
Subjects: Political Science and International Relations
Published: 2022

23. Accurate and Accelerated Neuromorphic Network Design Leveraging A Bayesian Hyperparameter Pareto Optimization Approach

Author: Kaushik Roy, Bill Kay, Amir Ziabari, Steven R. Young, J. Parker Mitchell, Nitin Rathi, Catherine D. Schuman, Derek C. Rose, Maryam Parsa, and Travis Johnston
Subjects: Hyperparameter, Spiking neural network, Optimization problem, Artificial neural network, business.industry, Computer science, Bayesian optimization, Machine learning, computer.software_genre, Multi-objective optimization, Neuromorphic engineering, Hyperparameter optimization, Artificial intelligence, business, computer
Abstract: Neuromorphic systems allow for extremely efficient hardware implementations for neural networks (NNs). In recent years, several algorithms have been presented to train spiking NNs (SNNs) for neuromorphic hardware. However, SNNs often provide lower accuracy than their artificial NNs (ANNs) counterparts or require computationally expensive and slow training/inference methods. To close this gap, designers typically rely on reconfiguring SNNs through adjustments in the neuron/synapse model or training algorithm itself. Nevertheless, these steps incur significant design time, while still lacking the desired improvement in terms of training/inference times (latency). Designing SNNs that can mimic the accuracy of ANNs with reasonable training times is an exigent challenge in neuromorphic computing. In this work, we present an alternative approach that looks at such designs as an optimization problem rather than algorithm or architecture redesign. We develop a versatile multiobjective hyperparameter optimization (HPO) for automatically tuning HPs of two state-of-the-art SNN training algorithms, SLAYER and HYBRID. We emphasize that, to the best of our knowledge, this is the first work trying to improve SNNs’ computational efficiency, accuracy, and training time using an efficient HPO. We demonstrate significant performance improvements for SNNs on several datasets without the need to redesign or invent new training algorithms/architectures. Our approach results in more accurate networks with lower latency and, in turn, higher energy efficiency than previous implementations. In particular, we demonstrate improvement in accuracy and more than 5 × reduction in the training/inference time for the SLAYER algorithm on the DVS Gesture dataset. In the case of HYBRID, we demonstrate 30% reduction in timesteps while surpassing the accuracy of the state-of-the-art networks on CIFAR10. 
Further, our analysis suggests that even a seemingly minor change in HPs could change the accuracy by 5–6×.
Published: 2021
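
Note: the multiobjective framing in the abstract above rests on Pareto optimality, which can be illustrated with a small filter. This sketch shows only the dominance test, not the Bayesian search the paper describes. The objective pair (error, latency), the candidate values, and the function names are assumptions made for illustration.

```python
def dominates(a, b):
    """Return True if a is no worse than b in every objective and strictly
    better in at least one. All objectives are minimized."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical hyperparameter configurations scored as (error, latency).
candidates = [
    (0.10, 50),
    (0.08, 80),
    (0.12, 40),
    (0.10, 90),  # dominated by (0.10, 50): same error, higher latency
    (0.15, 45),  # dominated by (0.12, 40): worse on both objectives
]

print(pareto_front(candidates))  # prints [(0.1, 50), (0.08, 80), (0.12, 40)]
```

Each surviving configuration represents a different trade-off between the two objectives; none can be improved on one axis without giving something up on the other.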

24. Data Fusion: A Project Update & Pathway Forward

Author: Christopher Perullo, Travis Johnston, You-Hai Wen, David Alman, Yong Liu, Sangkeun Lee, Salvatore Della Villa, Robert Steele, and Dongwon Shin
Subjects: Electricity generation, Power station, Turbomachinery, Profitability index, Sensor fusion, Combustion, Reliability (statistics), Turbocharger, Reliability engineering
Abstract: At the Turbo Expo 2018: Turbomachinery Conference & Expedition, in Oslo, Norway, an innovative approach for assessing operating and near real-time data from power generating assets with meaningful predictive analytics was presented and discussed. GT2018-75030, entitled; Energy Innovation: A Focus on Power Generation Data Capture & Analytics in a Competitive Market established a challenging objective for the industry: “To advance the notion that the fusion of total plant data, from three primary sources, with the ability to transform, analyze, and act based on integrating subject matter expertise is essential for effectively managing assets for optimum performance and profitability; executing and delivering on the promise of “Big Data” and advanced analytics.” Throughout 2019 and 2020, a team comprised of members from Strategic Power Systems, Inc. ® (SPS), Turbine Logic (TL), and two National Labs; National Energy Technology Laboratory (NETL) and Oak Ridge National Laboratory (ORNL), collaborated on the paper’s hypothesis. The team worked with the support of funding from DOE’s Fossil Energy Program through its HPC4 Materials Program, which provided access to the High-Performance Computing assets at both laboratories. The team brought unique skills, strengths, and capabilities that would serve as the basis for an effective, open, and challenging collaboration. The engineering and data science disciplines that converged on this project provided the back-bone for the unbiased analysis and model building that took place; relying on a unique and up-to-date source of plant operating and design data essential for performing the engineering scope of work. A key objective was to use the data and the modeling to be predictive; to characterize remaining life, expended life, and to determine the “next failure” for critical systems and components. 
Proof-of-concepts were tested for longer term, data-driven reliability prediction for fleets of power generating assets, near real-time prediction of power plant faults which could lead to imminent failure, and physics-based model prediction of life consumption of critical parts. Each of these pilot scale projects is summarized with key results presented.
Published: 2021

25. The Securitization of Refugees: A Critical Media Discourse Analysis of the Reporting on Syrian Refugees in Canada

Author: Travis Johnston
Subjects: National security, business.industry, media_common.quotation_subject, Discourse analysis, Refugee, Immigration, Criminology, Newspaper, Spanish Civil War, Political science, Securitization, business, media_common, Moral panic
Abstract: While immigration had become securitized pre-9/11, the terror attacks on that day accelerated the moral panic in society to new levels, creating greater fear of mobility and its perceived relation to threats against national security. Following the outbreak of civil war in Syria in 2011 we have seen one of history’s largest movements of externally displaced individuals seeking asylum globally. Canada has become a destination for a great number of individuals claiming refugee protection from the threats or perceived threats they face at home. This work seeks to examine, through a critical media discourse analysis, the extent to which reporting on the issue of Syrian refugees in Canada within two national newspapers has contributed to either the further securitization or desecuritization of this issue.
Published: 2021

26. DeepMerge II: Building Robust Deep Learning Algorithms for Merging Galaxy Identification Across Domains

Author: Sandeep Madireddy, Gregory F. Snyder, K. Downey, A. Ćiprijanović, Brian Nord, Sydney Jenkins, Gabriel Perdue, Diana Kafkes, and Travis Johnston
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, FOS: Physical sciences, Context (language use), 02 engineering and technology, 01 natural sciences, Domain (software engineering), Machine Learning (cs.LG), 0103 physical sciences, Classifier (linguistics), 0202 electrical engineering, electronic engineering, information engineering, 010303 astronomy & astrophysics, Instrumentation and Methods for Astrophysics (astro-ph.IM), Physics, Artificial neural network, business.industry, Deep learning, Astronomy and Astrophysics, Astrophysics - Astrophysics of Galaxies, Identification (information), Artificial Intelligence (cs.AI), Space and Planetary Science, Astrophysics of Galaxies (astro-ph.GA), 020201 artificial intelligence & image processing, Artificial intelligence, business, Astrophysics - Instrumentation and Methods for Astrophysics, Algorithm
Abstract: In astronomy, neural networks are often trained on simulation data with the prospect of being used on telescope observations. Unfortunately, training a model on simulation data and then applying it to instrument data leads to a substantial and potentially even detrimental decrease in model accuracy on the new target dataset. Simulated and instrument data represent different data domains, and for an algorithm to work in both, domain-invariant learning is necessary. Here we employ domain adaptation techniques, namely Maximum Mean Discrepancy (MMD) as an additional transfer loss and Domain Adversarial Neural Networks (DANNs), and demonstrate their viability to extract domain-invariant features within the astronomical context of classifying merging and non-merging galaxies. Additionally, we explore the use of Fisher loss and entropy minimization to enforce better in-domain class discriminability. We show that the addition of each domain adaptation technique improves the performance of a classifier when compared to conventional deep learning algorithms. We demonstrate this on two examples: between two Illustris-1 simulated datasets of distant merging galaxies, and between Illustris-1 simulated data of nearby merging galaxies and observed data from the Sloan Digital Sky Survey. The use of domain adaptation techniques in our experiments leads to an increase of target domain classification accuracy of up to ~20%. With further development, these techniques will allow astronomers to successfully implement neural network models trained on simulation data to efficiently detect and study astrophysical objects in current and future large-scale astronomical surveys. (Submitted to MNRAS; 21 pages, 9 figures, 9 tables.)
Published: 2021
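
Note: the Maximum Mean Discrepancy (MMD) named in the abstract above measures how far apart two sets of samples are in a kernel feature space. The sketch below computes the standard biased estimate of squared MMD with an RBF kernel. The kernel choice, the `gamma` value, the function names, and the one-dimensional sample sets are assumptions for illustration; the paper's actual transfer-loss setup is not reproduced here.

```python
import math


def rbf(x, y, gamma=1.0):
    """RBF kernel exp(-gamma * ||x - y||^2) between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between sample sets xs and ys:
    mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)."""
    kxx = sum(rbf(a, b, gamma) for a in xs for b in xs) / len(xs) ** 2
    kyy = sum(rbf(a, b, gamma) for a in ys for b in ys) / len(ys) ** 2
    kxy = sum(rbf(a, b, gamma) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2.0 * kxy


# Hypothetical "source domain" features and two candidate "target domains".
source = [(0.0,), (1.0,), (2.0,)]
near_target = [(0.1,), (1.1,), (2.1,)]  # small domain shift
far_target = [(5.0,), (6.0,), (7.0,)]   # large domain shift

print(mmd2(source, near_target) < mmd2(source, far_target))  # prints True
```

Used as an extra loss term, a quantity like this penalizes a network when its features for the two domains drift apart, which is the sense in which it encourages domain-invariant features.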

27. Domain Adaptation for Cross-Domain Studies of Merging Galaxies

Author: Diana Kafkes, Aleksandra Miodrag Ćiprijanović, Sydney Jenkins, Gabriel Perdue, Brian Nord, K. Downey, Travis Johnston, and Sandeep Madireddy
Subjects: Domain adaptation, Computer science, Galaxy merger, Algorithm, Domain (software engineering)
Published: 2021

28. Structure Prediction from Neutron Scattering Profiles: A Data Sciences Approach

Author: Thomas Proffen, Cristina Garcia-Cardona, Sudip K. Seal, Travis Johnston, and Ramakrishnan Kannan
Subjects: Class (set theory), Computer science, 02 engineering and technology, Neutron scattering, 021001 nanoscience & nanotechnology, 010403 inorganic & nuclear chemistry, 01 natural sciences, 0104 chemical sciences, Neutron spin echo, Data modeling, Set (abstract data type), Entropy (information theory), Neutron, Minification, 0210 nano-technology, Algorithm
Abstract: One of the main goals of neutron data analysis is to determine the internal structure of materials from their neutron scattering profiles. These structures are defined by a crystallographic class label and a set of real-valued parameters specific to that class. Existing structure analysis approaches use computationally expensive loop refinement methods that routinely take days, and even weeks, to complete. Additionally, the outcomes often rely on the fidelity of physical models that are computed during the refinement process.
Published: 2020

29. Low Size, Weight, and Power Neuromorphic Computing to Improve Combustion Engine Efficiency

Author: Bryan P. Maldonado, Derek C. Rose, Brian Kaul, Catherine D. Schuman, J. Parker Mitchell, Steven R. Young, and Travis Johnston
Subjects: Spiking neural network, Quantitative Biology::Neurons and Cognition, Artificial neural network, Computer science, business.industry, Pipeline (computing), Computer Science::Neural and Evolutionary Computation, computer.software_genre, Software framework, Computer Science::Hardware Architecture, Computer Science::Emerging Technologies, Software, Neuromorphic engineering, Engine efficiency, business, Field-programmable gate array, computer, Computer hardware
Abstract: Neuromorphic computing offers one path forward for AI at the edge. However, accessing and effectively utilizing a neuromorphic hardware platform is non-trivial. In this work, we present a complete pipeline for neuromorphic computing at the edge, including a small, inexpensive, low-power, FPGA-based neuromorphic hardware platform, a training algorithm for designing spiking neural networks for neuromorphic hardware, and a software framework for connecting those components. We demonstrate this pipeline on a real-world application, engine control for a spark-ignition internal combustion engine. We illustrate how we connect engine simulations with neuromorphic hardware simulations and training software to produce hardware-compatible spiking neural networks that perform engine control to improve fuel efficiency. We present initial results on the performance of these spiking neural networks and illustrate that they outperform open-loop engine control. We also give size, weight, and power estimates for a deployed solution of this type.
Published: 2020

30. Deep Reinforcement Learning for Residential HVAC Control with Consideration of Human Occupancy

Author: Evan McKee, Jeffrey D Munk, Kadir Amasyali, Travis Johnston, Yan Du, Fangxing Li, Helia Zandi, Kuldeep Kurte, and Olivera Kotevska
Subjects: Computer science, business.industry, 020209 energy, Online machine learning, 02 engineering and technology, 021001 nanoscience & nanotechnology, Hvac control, Reliability engineering, law.invention, Demand response, Smart grid, Control theory, Air conditioning, law, HVAC, Ventilation (architecture), 0202 electrical engineering, electronic engineering, information engineering, Reinforcement learning, 0210 nano-technology, business
Abstract: The Artificial Intelligence (AI) development described herein uses model-free Deep Reinforcement Learning (DRL) to minimize energy cost during residential heating, ventilation, and air conditioning (HVAC) operation. Building cooling loads and HVAC operation are difficult to accurately model due to complexity, lack of measurements and data, and model specific performance, so online machine learning is used to allow for real-time readjustment in performance. Energy costs for the multi-zone cooling unit shown in this work are minimized by scheduling on/off commands around dynamic prices. By taking advantage of precooling events that take place when the price is low, the agent is able to reduce operational cost without violating user comfort. The DRL controller was tested in simulation where the learner achieved a 43.89% cost reduction when compared to traditional, fixed-setpoint operation. The system is now ready for the next phase of testing in a live, real-time home environment.
Published: 2020

31. Resilience and Robustness of Spiking Neural Networks for Neuromorphic Systems

Author: Travis Johnston, Robert M. Patton, Prasanna Date, Bill Kay, Catherine D. Schuman, J. Parker Mitchell, and Maryam Parsa
Subjects: Spiking neural network, Artificial neural network, Computer science, Liquid state machine, business.industry, Reservoir computing, 02 engineering and technology, Memristor, 010501 environmental sciences, 01 natural sciences, 020202 computer hardware & architecture, law.invention, Neuromorphic engineering, Robustness (computer science), law, 0202 electrical engineering, electronic engineering, information engineering, Artificial intelligence, business, 0105 earth and related environmental sciences
Abstract: Though robustness and resilience are commonly quoted as features of neuromorphic computing systems, the expected performance of neuromorphic systems in the face of hardware failures is not clear. In this work, we study the effect of failures on the performance of four different training algorithms for spiking neural networks on neuromorphic systems: two back-propagation-based training approaches (Whetstone and SLAYER), a liquid state machine or reservoir computing approach, and an evolutionary optimization-based approach (EONS). We show that these four different approaches have very different resilience characteristics with respect to simulated hardware failures. We then analyze an approach for training more resilient spiking neural networks using the evolutionary optimization approach. We show how this approach produces more resilient networks and discuss how it can be extended to other spiking neural network training approaches as well.
Published: 2020

32. A survey of algorithms for transforming molecular dynamics data into metadata for

Author: Michela Taufer, Trilce Estrada, and Travis Johnston
Subjects: Articles
Abstract: This paper presents a survey of three algorithms to transform atomic-level molecular snapshots from molecular dynamics (MD) simulations into metadata representations that are suitable for in situ analytics based on machine learning methods. MD simulations studying the classical time evolution of a molecular system at atomic resolution are widely recognized in the fields of chemistry, material sciences, molecular biology and drug design; these simulations are one of the most common simulations on supercomputers. Next-generation supercomputers will have a dramatically higher performance than current systems, generating more data that needs to be analysed (e.g. in terms of number and length of MD trajectories). In the future, the coordination of data generation and analysis can no longer rely on manual, centralized analysis traditionally performed after the simulation is completed or on current data representations that have been defined for traditional visualization tools. Powerful data preparation phases (i.e. phases in which original raw data is transformed to concise and still meaningful representations) will need to precede data analysis phases. Here, we discuss three algorithms for transforming traditionally used molecular representations into concise and meaningful metadata representations. The transformations can be performed locally. The new metadata can be fed into machine learning methods for runtime in situ analysis of larger MD trajectories supported by high-performance computing. In this paper, we provide an overview of the three algorithms and their use for three different applications: protein–ligand docking in drug design; protein folding simulations; and protein engineering based on analytics of protein functions depending on proteins' three-dimensional structures. This article is part of a discussion meeting issue ‘Numerical algorithms for high-performance computational science’.
Published: 2020

33. Building High-throughput Neural Architecture Search Workflows via a Decoupled Fitness Prediction Engine

Author: Ariel Keller Rorabaugh, Silvina Caino-Lores, Travis Johnston, and Michela Taufer
Subjects: Computational Theory and Mathematics, Hardware and Architecture, Signal Processing
Published: 2022

34. Visualization System for Evolutionary Neural Networks for Deep Learning

Author: Thomas E. Potok, Catherine D. Schuman, Junghoon Chae, Steven R. Young, Robert M. Patton, Derek C. Rose, and Travis Johnston
Subjects: Visual analytics, Artificial neural network, business.industry, Computer science, Deep learning, Evolutionary algorithm, 0102 computer and information sciences, 02 engineering and technology, Machine learning, computer.software_genre, 01 natural sciences, Visualization, Evolving networks, 010201 computation theory & mathematics, Genetic algorithm, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Artificial intelligence, business, computer, Interactive visualization
Abstract: Deep learning is actively used in a wide range of fields for scientific discovery. To effectively apply deep learning to a particular problem, it is important to select an appropriate network architecture and other hyper-parameters (at each layer). Evolving architectures and hyper-parameters using a genetic algorithm is one current approach to search the huge space of all possible configurations to find those more optimal for the problem. However, examining an evolutionary process and tuning the genetic algorithm are challenging, pushing most users to treat the process as a black box. To address this challenge, we propose a visualization system for evolutionary neural networks for deep learning. The key feature of our visualization system is to provide a visual analytics environment for evaluating a genetic algorithm in order to improve the underlying operations to reduce time to find good solutions. Our system is able to not only visualize how a genetic algorithm traverses its search space but also allows users to examine evolving networks in-depth to get insights to improve performance through interactive visualization components.
Published: 2019

35. Evolving Energy Efficient Convolutional Neural Networks

Author: Robert M. Patton, Derek C. Rose, Thomas E. Potok, Maryam Parsa, Bill Kay, Pravallika Devineni, Steven R. Young, Travis Johnston, and Catherine D. Schuman
Subjects: Artificial neural network, Edge device, business.industry, Computer science, Distributed computing, Deep learning, 02 engineering and technology, Energy consumption, 010501 environmental sciences, 01 natural sciences, Convolutional neural network, 020202 computer hardware & architecture, Network planning and design, Hyperparameter optimization, Genetic algorithm, 0202 electrical engineering, electronic engineering, information engineering, Artificial intelligence, business, 0105 earth and related environmental sciences, Efficient energy use
Abstract: As deep neural networks have been deployed in more and more applications over the past half decade and are finding their way into an ever increasing number of operational systems, their energy consumption becomes a concern whether running in the datacenter or on edge devices. Hyperparameter optimization and automated network design for deep learning is a quickly growing field, but much of the focus has remained only on optimizing for the performance of the machine learning task. In this work, we demonstrate that the best performing networks created through this automated network design process have radically different computational characteristics (e.g. energy usage, model size, inference time), presenting the opportunity to utilize this optimization process to make deep learning networks more energy efficient and deployable to smaller devices. Optimizing for these computational characteristics is critical as the number of applications of deep learning continues to expand.
Published: 2019

36. Exascale Deep Learning to Accelerate Cancer Research

Author: Catherine D. Schuman, Travis Johnston, Derek C. Rose, Shahira Abousamra, Seung-Hwan Lim, Le Hou, Dimitris Samaras, Thomas E. Potok, Junghoon Chae, Steven R. Young, Joel H. Saltz, and Robert M. Patton
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, 0303 health sciences, Speedup, Training set, Artificial neural network, Computer science, business.industry, Deep learning, Inference, Machine Learning (stat.ML), Machine learning, computer.software_genre, Machine Learning (cs.LG), 03 medical and health sciences, 0302 clinical medicine, Computer Science - Distributed, Parallel, and Cluster Computing, Statistics - Machine Learning, 030220 oncology & carcinogenesis, Benchmark (computing), Distributed, Parallel, and Cluster Computing (cs.DC), Artificial intelligence, business, computer, 030304 developmental biology
Abstract: Deep learning, through the use of neural networks, has demonstrated remarkable ability to automate many routine tasks when presented with sufficient data for training. The neural network architecture (e.g. number of layers, types of layers, connections between layers, etc.) plays a critical role in determining what, if anything, the neural network is able to learn from the training data. The trend for neural network architectures, especially those trained on ImageNet, has been to grow ever deeper and more complex. The result has been ever increasing accuracy on benchmark datasets at the cost of increased computational demands. In this paper we demonstrate that neural network architectures can be automatically generated, tailored for a specific application, with dual objectives: accuracy of prediction and speed of prediction. Using MENNDL--an HPC-enabled software stack for neural architecture search--we generate a neural network with comparable accuracy to state-of-the-art networks on a cancer pathology dataset that is also 16× faster at inference. The speedup in inference is necessary because of the volume and velocity of cancer pathology data; specifically, the previous state-of-the-art networks are too slow for individual researchers without access to HPC systems to keep pace with the rate of data generation. Our new model enables researchers with modest computational resources to analyze newly generated data faster than it is collected. Comment: Submitted to IEEE Big Data
Published: 2019

37. In situ data analytics and indexing of protein trajectories

Author: Boyu Zhang, Silvia Crivelli, Adam Liwo, Michela Taufer, and Travis Johnston
Subjects: 0301 basic medicine, Theoretical computer science, 010304 chemical physics, Computer science, Data Science, Search engine indexing, Frame (networking), Proteins, General Chemistry, Models, Theoretical, Molecular Dynamics Simulation, Supercomputer, 01 natural sciences, Protein Structure, Secondary, Exascale computing, Computational science, Metadata, 03 medical and health sciences, Computational Mathematics, 030104 developmental biology, Face (geometry), 0103 physical sciences, Trajectory, Data analysis
Abstract: The transition toward exascale computing will be accompanied by a performance dichotomy. Computational peak performance will rapidly increase; I/O performance will either grow slowly or be completely stagnant. Essentially, the rate at which data are generated will grow much faster than the rate at which data can be read from and written to the disk. MD simulations will soon face the I/O problem of efficiently writing to and reading from disk on the next generation of supercomputers. This article targets MD simulations at the exascale and proposes a novel technique for in situ data analysis and indexing of MD trajectories. Our technique maps individual trajectories' substructures (i.e., α-helices, β-strands) to metadata frame by frame. The metadata captures the conformational properties of the substructures. The ensemble of metadata can be used for automatic, strategic analysis within a trajectory or across trajectories, without manually identifying those portions of trajectories in which critical changes take place. We demonstrate our technique's effectiveness by applying it to 26.3k helices and 31.2k strands from 9917 PDB proteins and by providing three empirical case studies. © 2017 Wiley Periodicals, Inc.
Published: 2017

38. Multi-Objective Optimization for Size and Resilience of Spiking Neural Networks

Author: J. Parker Mitchell, Mihaela Dimovska, Catherine D. Schuman, Travis Johnston, and Thomas E. Potok
Subjects: FOS: Computer and information sciences, Spiking neural network, Computer Science - Machine Learning, Fitness function, Computer science, Distributed computing, 020208 electrical & electronic engineering, Evolutionary algorithm, Computer Science - Neural and Evolutionary Computing, Computer Science - Emerging Technologies, Fault tolerance, 02 engineering and technology, Multi-objective optimization, Machine Learning (cs.LG), 020202 computer hardware & architecture, Emerging Technologies (cs.ET), Neuromorphic engineering, 0202 electrical engineering, electronic engineering, information engineering, Neural and Evolutionary Computing (cs.NE)
Abstract: Inspired by the connectivity mechanisms in the brain, neuromorphic computing architectures model Spiking Neural Networks (SNNs) in silicon. As such, neuromorphic architectures are designed and developed with the goal of having small, low power chips that can perform control and machine learning tasks. However, the power consumption of the developed hardware can greatly depend on the size of the network that is being evaluated on the chip. Furthermore, the accuracy of a trained SNN that is evaluated on chip can change due to voltage and current variations in the hardware that perturb the learned weights of the network. While efforts are made on the hardware side to minimize those perturbations, a software based strategy to make the deployed networks more resilient can help further alleviate that issue. In this work, we study Spiking Neural Networks in two neuromorphic architecture implementations with the goal of decreasing their size, while at the same time increasing their resiliency to hardware faults. We leverage an evolutionary algorithm to train the SNNs and propose a multiobjective fitness function to optimize the size and resiliency of the SNN. We demonstrate that this strategy leads to well-performing, small-sized networks that are more resilient to hardware faults. Will appear in proceedings of 2019 IEEE 10th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON). IEEE Catalog Number: CFP19G31-USB ISBN: 978-1-7281-3884-8 pg. 431-438
Published: 2019

39. A Novel Pruning Method for Convolutional Neural Networks Based off Identifying Critical Filters

Author: Travis Johnston and Mihaela Dimovska
Subjects: Class (computer programming), business.industry, Computer science, Inference, Artificial intelligence, Filter (signal processing), Pruning (decision trees), business, Machine learning, computer.software_genre, Convolutional neural network, computer
Abstract: Convolutional Neural Networks (CNNs) are one of the most extensively used tools in machine learning, but they are still not well understood and in many cases they are over-parameterized, leading to slow inference and impeding their deployment on low-power devices. In the last few years, many methods for decreasing the number of parameters in a network by pruning its output channels have been suggested, but a very recent work has argued that random pruning of channels performs on-par with state-of-the-art pruning methods. While random and other pruning methods might be effectively used for lowering the number of parameters in a CNN, none of these methods can be used to gain any further understanding of the model that the CNN has built. In this work, we propose a novel method for pruning a network, that at the same time can lead to a better understanding of what the individual filters of the network learn about the data. The method proposed aims to keep only the filters that are "important" for a class. We define a filter as important for a class if its removal has the highest negative impact on the accuracy for that class. We demonstrate that our method is better than random pruning on two networks used on the EMNIST and CIFAR10 datasets. By analyzing the important filters, we find that the important filters in the pruned networks learn features which are more general across classes. We demonstrate the importance and applicability of that observation in two transfer-learning tasks.
Published: 2019

40. Object recognition memory in zebrafish

Author: Adam Holcombe, Zacnicte May, Melike Schalomon, Trevor J. Hamilton, Travis Johnston, Joshua Gallup, Karim Fouad, and Adam Morrill
Subjects: Male, 0301 basic medicine, Nicotine, animal structures, genetic structures, Danio, 03 medical and health sciences, Behavioral Neuroscience, 0302 clinical medicine, medicine, Animals, Nicotinic Agonists, Zebrafish, Recognition memory, Communication, biology, business.industry, fungi, Neophobia, Cognitive neuroscience of visual object recognition, Recognition, Psychology, Memory retention, medicine.disease, biology.organism_classification, Object (computer science), Preference, 030104 developmental biology, Pattern Recognition, Visual, Female, business, Psychology, Neuroscience, 030217 neurology & neurosurgery
Abstract: The novel object recognition, or novel-object preference (NOP) test is employed to assess recognition memory in a variety of organisms. The subject is exposed to two identical objects, then after a delay, it is placed back in the original environment containing one of the original objects and a novel object. If the subject spends more time exploring one object, this can be interpreted as memory retention. To date, this test has not been fully explored in zebrafish (Danio rerio). Zebrafish possess recognition memory for simple 2- and 3-dimensional geometrical shapes, yet it is unknown if this translates to complex 3-dimensional objects. In this study we evaluated recognition memory in zebrafish using complex objects of different sizes. Contrary to rodents, zebrafish preferentially explored familiar over novel objects. Familiarity preference disappeared after delays of 5 mins. Leopard danios, another strain of D. rerio, also preferred the familiar object after a 1 min delay. Object preference could be re-established in zebra danios by administration of nicotine tartrate salt (50mg/L) prior to stimuli presentation, suggesting a memory-enhancing effect of nicotine. Additionally, exploration biases were present only when the objects were of intermediate size (2 × 5 cm). Our results demonstrate zebra and leopard danios have recognition memory, and that low nicotine doses can improve this memory type in zebra danios. However, exploration biases, from which memory is inferred, depend on object size. These findings suggest zebrafish ecology might influence object preference, as zebrafish neophobia could reflect natural anti-predatory behaviour.
Published: 2016

41. A survey of algorithms for transforming molecular dynamics data into metadata for in situ analytics based on machine learning methods

Author: Trilce Estrada, Michela Taufer, and Travis Johnston
Subjects: 020203 distributed computing, business.industry, Test data generation, Computer science, General Mathematics, General Engineering, General Physics and Astronomy, 010103 numerical & computational mathematics, 02 engineering and technology, Machine learning, computer.software_genre, 01 natural sciences, Visualization, Metadata, Molecular dynamics, Protein–ligand docking, Analytics, Atomic resolution, In situ analysis, 0202 electrical engineering, electronic engineering, information engineering, Artificial intelligence, 0101 mathematics, business, Algorithm, computer
Abstract: This paper presents a survey of three algorithms to transform atomic-level molecular snapshots from molecular dynamics (MD) simulations into metadata representations that are suitable for in situ analytics based on machine learning methods. MD simulations studying the classical time evolution of a molecular system at atomic resolution are widely recognized in the fields of chemistry, material sciences, molecular biology and drug design; these simulations are one of the most common simulations on supercomputers. Next-generation supercomputers will have a dramatically higher performance than current systems, generating more data that needs to be analysed (e.g. in terms of number and length of MD trajectories). In the future, the coordination of data generation and analysis can no longer rely on manual, centralized analysis traditionally performed after the simulation is completed or on current data representations that have been defined for traditional visualization tools. Powerful data preparation phases (i.e. phases in which original raw data is transformed to concise and still meaningful representations) will need to precede data analysis phases. Here, we discuss three algorithms for transforming traditionally used molecular representations into concise and meaningful metadata representations. The transformations can be performed locally. The new metadata can be fed into machine learning methods for runtime in situ analysis of larger MD trajectories supported by high-performance computing. In this paper, we provide an overview of the three algorithms and their use for three different applications: protein–ligand docking in drug design; protein folding simulations; and protein engineering based on analytics of protein functions depending on proteins' three-dimensional structures. This article is part of a discussion meeting issue ‘Numerical algorithms for high-performance computational science’.
Published: 2020

42. 167-PFlops Deep Learning for Electron Microscopy: From Learning Physics to Atomic Manipulation

Author: Derek C. Rose, Thomas P. Karnowski, Seung-Hwan Lim, Robert M. Patton, Thomas E. Potok, Catherine D. Schuman, Don D. March, Maxim Ziatdinov, Steven R. Young, Travis Johnston, and Sergei V. Kalinin
Subjects: Artificial Intelligence System, Computer science, business.industry, Deep learning, 02 engineering and technology, 010402 general chemistry, 021001 nanoscience & nanotechnology, Supercomputer, Network topology, 01 natural sciences, Evolutionary computation, 0104 chemical sciences, Support vector machine, Computer engineering, Asynchronous communication, Artificial intelligence, 0210 nano-technology, business
Abstract: An artificial intelligence system called MENNDL, which used 25,200 NVIDIA Volta GPUs on Oak Ridge National Laboratory's Summit machine, automatically designed an optimal deep learning network in order to extract structural information from raw atomic-resolution microscopy data. In a few hours, MENNDL creates and evaluates millions of networks using a scalable, parallel, asynchronous genetic algorithm augmented with a support vector machine to automatically find a superior deep learning network topology and hyper-parameter set than a human expert can find in months. For the application of electron microscopy, the system furthers the goal of improving our understanding of the electron-beam-matter interactions and real-time image-based feedback, which enables a huge step beyond human capacity towards nanofabricating materials automatically. MENNDL has been scaled to the 4,200 available nodes of Summit achieving a measured 152.5 PFlops, with an estimated sustained performance of 167 PFlops when the entire machine is available.
Published: 2018

43. Cospectral mates for the union of some classes in the Johnson association scheme

Author: Sebastian M. Cioabă, Willem H. Haemers, Matt McGinnis, and Travis Johnston
Subjects: FOS: Computer and information sciences, Discrete Mathematics (cs.DM), 0102 computer and information sciences, 01 natural sciences, Graph, Combinatorics, Godsil–McKay switching, FOS: Mathematics, Mathematics - Combinatorics, Discrete Mathematics and Combinatorics, 0101 mathematics, Mathematics, Discrete mathematics, Numerical Analysis, Algebra and Number Theory, 010102 general mathematics, Kneser graph, Eigenvalues, Association scheme, 010201 computation theory & mathematics, Determined by spectrum, Combinatorics (math.CO), Geometry and Topology, Johnson association scheme, Computer Science - Discrete Mathematics
Abstract: Let $n\geq k\geq 2$ be two integers and $S$ a subset of $\{0,1,\dots,k-1\}$. The graph $J_{S}(n,k)$ has as vertices the $k$-subsets of the $n$-set $[n]=\{1,\dots,n\}$ and two $k$-subsets $A$ and $B$ are adjacent if $|A\cap B|\in S$. In this paper, we use Godsil-McKay switching to prove that for $m\geq 0$, $k\geq \max(m+2,3)$ and $S = \{0,1,\dots,m\}$, the graphs $J_S(3k-2m-1,k)$ are not determined by spectrum and for $m\geq 2$, $n\geq 4m+2$ and $S = \{0,1,\dots,m\}$ the graphs $J_{S}(n,2m+1)$ are not determined by spectrum. We also report some computational searches for Godsil-McKay switching sets in the union of classes in the Johnson scheme for $k\leq 5$. Comment: 9 pages, no figures, 3 tables; 2nd version contains improved results compared to the 1st version
Published: 2018
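
The definition of $J_S(n,k)$ in the abstract above can be illustrated with a minimal Python sketch. The function name `johnson_union_graph` and the edge-list representation are assumptions made for this illustration; they do not come from the paper. For $S=\{0\}$ the construction yields the Kneser graph, and $J_{\{0\}}(5,2)$ is the Petersen graph.

```python
from itertools import combinations

def johnson_union_graph(n, k, S):
    """Build J_S(n, k): vertices are the k-subsets of {1, ..., n};
    two subsets A, B are adjacent when |A & B| is in S.
    (Hypothetical helper, written only to illustrate the definition.)"""
    vertices = [frozenset(c) for c in combinations(range(1, n + 1), k)]
    edges = [(A, B) for A, B in combinations(vertices, 2) if len(A & B) in S]
    return vertices, edges

# S = {0} gives the Kneser graph; J_{0}(5, 2) is the Petersen graph.
vertices, edges = johnson_union_graph(5, 2, {0})
degrees = {v: 0 for v in vertices}
for A, B in edges:
    degrees[A] += 1
    degrees[B] += 1

print(len(vertices), len(edges), set(degrees.values()))  # 10 15 {3}
```

Brute-force enumeration like this is only practical for small $n$ and $k$, since the number of vertices is $\binom{n}{k}$.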

44. Boolean algebras and Lubell functions

Author: Kevin G. Milans, Linyuan Lu, and Travis Johnston
Subjects: Discrete mathematics, Boolean algebra (structure), Ramsey theory, Disjoint sets, Function (mathematics), Type (model theory), Power set, Theoretical Computer Science, 05D05, Combinatorics, symbols.namesake, Computational Theory and Mathematics, FOS: Mathematics, symbols, Mathematics - Combinatorics, Discrete Mathematics and Combinatorics, Combinatorics (math.CO), Family of sets, Ramsey's theorem, Mathematics
Abstract: Let $2^{[n]}$ denote the power set of $[n]:=\{1,2,\dots,n\}$. A collection $\mathcal{B}\subset 2^{[n]}$ forms a $d$-dimensional Boolean algebra if there exist pairwise disjoint sets $X_0, X_1,\dots, X_d \subseteq [n]$, all non-empty with perhaps the exception of $X_0$, so that $\mathcal{B}=\{X_0\cup \bigcup_{i\in I} X_i\colon I\subseteq [d]\}$. Let $b(n,d)$ be the maximum cardinality of a family $\mathcal{F}\subset 2^{[n]}$ that does not contain a $d$-dimensional Boolean algebra. Gunderson, Rödl, and Sidorenko proved that $b(n,d) \leq c_d n^{-1/2^d} \cdot 2^n$ where $c_d= 10^d 2^{-2^{1-d}}d^{d-2^{-d}}$. In this paper, we use the Lubell function as a new measurement for large families instead of cardinality. The Lubell value of a family of sets $\mathcal{F}$ with $\mathcal{F}\subseteq 2^{[n]}$ is defined by $h_n(\mathcal{F}):=\sum_{F\in \mathcal{F}}1/\binom{n}{|F|}$. We prove the following Turán-type theorem. If $\mathcal{F}\subseteq 2^{[n]}$ contains no $d$-dimensional Boolean algebra, then $h_n(\mathcal{F})\leq 2(n+1)^{1-2^{1-d}}$ for sufficiently large $n$. This result implies $b(n,d) \leq C n^{-1/2^d} \cdot 2^n$, where $C$ is an absolute constant independent of $n$ and $d$. As a consequence, we improve several Ramsey-type bounds on Boolean algebras. We also prove a canonical Ramsey theorem for Boolean algebras. Comment: 10 pages
Published: 2015
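
As a small worked illustration of the Lubell value defined in the abstract above (the sum of $1/\binom{n}{|F|}$ over the sets $F$ in a family), the following Python sketch evaluates it exactly with rational arithmetic. The helper names `lubell` and `power_set` are assumptions for this example and are not taken from the paper.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def lubell(family, n):
    """Lubell value h_n(F): sum of 1 / C(n, |F|) over the sets F in the family."""
    return sum(Fraction(1, comb(n, len(F))) for F in family)

def power_set(n):
    """All subsets of {1, ..., n}, as frozensets (hypothetical helper)."""
    ground = range(1, n + 1)
    return [frozenset(c) for size in range(n + 1) for c in combinations(ground, size)]

n = 4
full = power_set(n)
# A maximal chain: {} < {1} < {1,2} < {1,2,3} < {1,2,3,4}
chain = [frozenset(range(1, i + 1)) for i in range(n + 1)]

print(lubell(full, n))   # 5, i.e. n + 1: each level contributes exactly 1
print(lubell(chain, n))  # 8/3 = 1 + 1/4 + 1/6 + 1/4 + 1
```

The first result reflects the general fact that the full power set has Lubell value $n+1$, because the $\binom{n}{j}$ sets of size $j$ each contribute $1/\binom{n}{j}$.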

45. Optimizing Convolutional Neural Networks for Cloud Detection

Author: David Hughes, Steven R. Young, Devin A. White, Travis Johnston, and Robert M. Patton
Subjects: Sequence, 010504 meteorology & atmospheric sciences, Computer science, business.industry, Image quality, Deep learning, 0208 environmental biotechnology, 02 engineering and technology, Network topology, computer.software_genre, Machine learning, 01 natural sciences, Convolutional neural network, 020801 environmental engineering, Support vector machine, Random search, Data mining, Artificial intelligence, Layer (object-oriented design), business, computer, 0105 earth and related environmental sciences
Abstract: Deep convolutional neural networks (CNNs) have become extremely popular and successful at a number of machine learning tasks. One of the great challenges of successfully deploying a CNN is designing the network: specifying the network topology (sequence of layer types) and configuring the network (setting all the internal layer hyper-parameters). There are a number of techniques which are commonly used to design the network. One of the most successful is a simple (but lengthy) random search. In this paper we demonstrate how a random search can be dramatically improved by a two-phase search. The first phase is a traditional random search on n network configurations. The second phase exploits a support vector machine to guide a second random search on N network configurations. We apply this technique to a dataset containing satellite imagery and demonstrate that we can, with very high accuracy, identify regions containing clouds which obscure the landscape below.
Published: 2017

46. Evolving Deep Networks Using HPC

Author: Thomas E. Potok, Robert M. Patton, Thomas P. Karnowski, Jonathan Miller, Travis Johnston, Derek C. Rose, William T. Heller, Steven R. Young, and Gabriel Perdue
Subjects: 0301 basic medicine, Hyperparameter, Basis (linear algebra), Process (engineering), Computer science, business.industry, Scale (chemistry), Deep learning, Evolutionary algorithm, 02 engineering and technology, Supercomputer, Machine learning, computer.software_genre, 03 medical and health sciences, 030104 developmental biology, Hyperparameter optimization, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Artificial intelligence, Data mining, business, computer
Abstract: While a large number of deep learning networks have been studied and published that produce outstanding results on natural image datasets, these datasets only make up a fraction of those to which deep learning can be applied. These datasets include text data, audio data, and arrays of sensors that have very different characteristics than natural images. As these "best" networks for natural images have been largely discovered through experimentation and cannot be proven optimal on some theoretical basis, there is no reason to believe that they are the optimal network for these drastically different datasets. Hyperparameter search is thus often a very important process when applying deep learning to a new problem. In this work we present an evolutionary approach to searching the possible space of network hyperparameters and construction that can scale to 18,000 nodes. This approach is applied to datasets of varying types and characteristics where we demonstrate the ability to rapidly find the best hyperparameters in order to enable practitioners to quickly iterate between idea and result.
Published: 2017

47. Creating a portable, high-level graph analytics paradigm for compute and data-intensive applications

Author: Michela Taufer, Travis Johnston, Robert Searles, Stephen Herbein, and Sunita Chandrasekaran
Subjects: 0209 industrial biotechnology, Theoretical computer science, Computer science, business.industry, Computer Networks and Communications, Subroutine, Distributed computing, Big data, Cloud computing, 02 engineering and technology, Supercomputer, CUDA, 020901 industrial engineering & automation, Hardware and Architecture, Scalability, 0202 electrical engineering, electronic engineering, information engineering, Leverage (statistics), Graph (abstract data type), 020201 artificial intelligence & image processing, business, Software
Abstract: HPC offers tremendous potential to process large amounts of data, often termed big data. Distributing data efficiently and leveraging specialised hardware (e.g., accelerators) are critical in order to best utilise HPC platforms consisting of heterogeneous and distributed systems. In this paper, we develop a portable, high-level paradigm for such systems to run big data applications, more specifically, graph analytics applications popular in the big data and machine learning communities. Using our paradigm, we accelerate three real-world, compute and data intensive, graph analytics applications: a function call graph similarity application, a triangle enumeration subroutine, and a graph assaying application. Our paradigm utilises the MapReduce framework, Apache Spark, in conjunction with CUDA and simultaneously takes advantage of automatic data distribution and the accelerators on each node of the system. We demonstrate scalability and parameter space exploration and offer a portable solution to leverage almost any legacy, current, or next-generation HPC or cloud-based system.
Published: 2019

48. Development of a Scalable Method for Creating Food Groups Using the NHANES Dataset and MapReduce

Author: Michela Taufer, Travis Johnston, Mia A. Papas, and Michael R. Wyatt
Subjects: 2. Zero hunger, Engineering, Data processing, National Health and Nutrition Examination Survey, business.industry, digestive, oral, and skin physiology, 02 engineering and technology, computer.software_genre, Food group, 020204 information systems, Scalability, 0202 electrical engineering, electronic engineering, information engineering, Unsupervised learning, Preprocessor, 020201 artificial intelligence & image processing, Data mining, business, Cluster analysis, Raw data, computer, ComputingMilieux_MISCELLANEOUS
Abstract: In this paper we tackle the need for meaningful food group classifications in dietary datasets such as the National Health and Nutrition Examination Survey (NHANES) that are less subjective in nature by defining a new objective method of identifying food groups exclusively based on the food's micro- and macro-nutrient content. We first perform extensive preprocessing of the NHANES raw data to mitigate impacts of missing nutrient values, redundancies, and different food intake quantities and scales. We then utilize an unsupervised learning clustering algorithm to create food groups within the preprocessed NHANES data and identify food groups with similar nutrient content. Finally we parallelize our method to benefit from the scalable MapReduce paradigm. Our results show that our method identifies food groups with smaller diameter and larger cluster separation distances than the standard, expert-informed, method of grouping food items.
Published: 2016

49. HYPPO: A Hybrid, Piecewise Polynomial Modeling Technique for Non-Smooth Surfaces

Author: Michela Taufer, Travis Johnston, and Connor Zanin
Subjects: Polynomial, Mathematical optimization, business.industry, Computer science, Cloud computing, 02 engineering and technology, 021001 nanoscience & nanotechnology, Regression, k-nearest neighbors algorithm, Resource (project management), Polynomial and rational function modeling, Linear regression, 0202 electrical engineering, electronic engineering, information engineering, Piecewise, 020201 artificial intelligence & image processing, 0210 nano-technology, business
Abstract: The number and diversity of tunable parameters in applications makes predicting settings that achieve optimal performance challenging. Complicating matters is the fact that resources are increasingly shared among computational tasks (for example, in cloud environments). Choosing any setting that yields near-optimal performance runs the risk of overusing shared resources. Building accurate models that capture the complicated interplay of parameters is crucial in order to maximize performance with minimal resource impact. Traditional techniques tend to fall short when modeling performance. One reason is that performance surfaces are often irregular but most traditional techniques are designed to produce smooth models. In this paper we introduce a hybrid modeling technique that combines the strengths of surrogate-based modeling (SBM) and k nearest-neighbor regression (kNN) into a single method called HYPPO. The hybrid method is a piecewise polynomial model composed of many small, local models. We demonstrate that HYPPO significantly improves overall prediction accuracy compared with SBM and kNN.
Published: 2016

50. On the Need for Reproducible Numerical Accuracy through Intelligent Runtime Selection of Reduction Algorithms at the Extreme Scale

Author: Michela Taufer, Travis Johnston, and Dylan Chapp
Subjects: Reduction (complexity), Propagation of uncertainty, Tree (data structure), Operator (computer programming), Computer science, Extreme scale, Sensitivity (control systems), Focus (optics), Algorithm, Selection (genetic algorithm), Power (physics)
Abstract: The inherent nondeterminism present in reduction operations on an exascale system, coupled with the nonassociativity of floating-point arithmetic, makes achieving reproducible results difficult or impossible. Work investigating the irreproducibility phenomenon has generally proceeded along one of two veins: (1) development of algorithms that produce reproducible numerical results irrespective of nondeterminism in the reduction tree and (2) study of the system-level factors that induce nondeterminism. Our work builds on the latter and unveils the power of mathematical methods to mitigate error propagation at the exascale. We focus on floating-point error accumulation over global summations where enforcing any reduction order is expensive or impossible. We model parallel summations with reduction trees and identify those parameters that can be used to estimate the reduction's sensitivity to variability in the reduction tree. We assess the impact of these parameters on the ability of different reduction methods to successfully mitigate errors. Our results illustrate the pressing need for intelligent runtime selection of reduction operators that ensure a given degree of reproducible accuracy.
Published: 2015
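
The nonassociativity of floating-point addition mentioned in the abstract above can be seen in a minimal Python sketch that sums the same four values with three reduction strategies. The function names and the input values are assumptions chosen for illustration; they are not taken from the paper. `math.fsum` from the standard library serves as an accurately rounded reference.

```python
import math

def left_to_right(values):
    """Sequential reduction: ((v0 + v1) + v2) + ..."""
    total = 0.0
    for v in values:
        total += v
    return total

def pairwise(values):
    """Balanced binary reduction tree: sum each half, then combine."""
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return pairwise(values[:mid]) + pairwise(values[mid:])

# Illustrative input: large values that cancel, plus small values that can be lost.
data = [1e16, 1.0, -1e16, 1.0]

print(left_to_right(data))  # 1.0  (the first 1.0 is absorbed by 1e16)
print(pairwise(data))       # 0.0  (both 1.0 terms are absorbed)
print(math.fsum(data))      # 2.0  (accurately rounded reference sum)
```

The same inputs give three different answers depending only on the shape of the reduction tree, which is why a run whose tree shape varies from execution to execution cannot guarantee reproducible sums.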
