95 results for "Kim, Hyesoon"
Search Results
2. EHT-SR: An Entropy-Based Hybrid Approach for Faster Super-Resolution
3. LCP: A Low-Communication Parallelization Method for Fast Neural Network Inference for IoT
4. Spica: Exploring FPGA Optimizations to Enable an Efficient SpMV Implementation for Computations at Edge
5. Context-Aware Task Handling in Resource-Constrained Robots with Virtualization
6. Creating Robust Deep Neural Networks with Coded Distributed Computing for IoT
7. Reducing Inference Latency with Concurrent Architectures for Image Recognition at Edge
8. Traversing Large Compressed Graphs on GPUs
9. Accelerating Graphic Rendering on Programmable RISC-V GPUs
10. Maia: Matrix Inversion Acceleration Near Memory
11. RASA: Efficient Register-Aware Systolic Array Matrix Engine for CPU
12. Copernicus: Characterizing the Performance Implications of Compression Formats Used in Sparse Workloads
13. FAFNIR: Accelerating Sparse Gathering by Using Efficient Near-Memory Intelligent Reduction
14. Hardware-based Always-On Heap Memory Safety
15. MEISSA: Multiplying Matrices Efficiently in a Scalable Systolic Architecture
16. Understanding the Software and Hardware Stacks of a General-Purpose Cognitive Drone
17. Hot Chips 2020 Posters
18. RISC-V FPGA Platform Toward ROS-Based Robotics Application
19. PISCES: Power-Aware Implementation of SLAM by Customizing Efficient Sparse Algebra
20. Proposing a Fast and Scalable Systolic Array for Matrix Multiplication
21. ASCELLA: Accelerating Sparse Computation by Enabling Stream Accesses to Memory
22. Tango: An Optimizing Compiler for Just-In-Time RTL Simulation
23. ALRESCHA: A Lightweight Reconfigurable Sparse-Computation Accelerator
24. Characterizing the Deployment of Deep Neural Networks on Commercial Edge Devices
25. Capella: Customizing Perception for Edge Devices by Efficiently Allocating FPGAs to DNNs
26. POSTER: Tango: An Optimizing Compiler for Just-In-Time RTL Simulation
27. Empirical Investigation of Stale Value Tolerance on Parallel RNN Training
28. Efficiently Solving Partial Differential Equations in a Partially Reconfigurable Specialized Hardware
29. Translating CUDA to OpenCL for Hardware Generation using Neural Machine Translation
30. The 2019 Top Picks in Computer Architecture
31. CoolPIM: Thermal-Aware Source Throttling for Efficient PIM Instruction Offloading
32. Performance Characterisation and Simulation of Intel's Integrated GPU Architecture
33. Performance Implications of NoCs on 3D-Stacked Memories: Insights from the Hybrid Memory Cube
34. Demystifying the characteristics of 3D-stacked memories: A case study for Hybrid Memory Cube
35. SimProf: A Sampling Framework for Data Analytic Workloads
36. ERIDANUS: Efficiently Running Inference of DNNs Using Systolic Arrays
37. GraphPIM: Enabling Instruction-Level PIM Offloading in Graph Computing Frameworks
38. StaleLearn: Learning Acceleration with Asynchronous Synchronization Between Model Replicas on PIM
39. BSSync: Processing Near Memory for Machine Learning Workloads with Bounded Staleness Consistency Models
40. GPUMech: GPU Performance Modeling Technique Based on Interval Analysis
41. Transparent Hardware Management of Stacked DRAM as Part of Memory
42. Design Space Exploration of Memory Model for Heterogeneous Computing
43. Harmonica: An FPGA-Based Data Parallel Soft Core
44. TBPoint: Reducing Simulation Time for Large-Scale GPGPU Kernels
45. Spare register aware prefetching for graph algorithms on GPUs
46. CHiP: A Profiler to Measure the Effect of Cache Contention on Scalability
47. OpenCL Performance Evaluation on Modern Multi Core CPUs
48. A Mostly-Clean DRAM Cache for Effective Hit Speculation and Self-Balancing Dispatch
49. FLEXclusion: Balancing cache capacity and on-chip bandwidth via Flexible Exclusion
50. Predicting Potential Speedup of Serial Code via Lightweight Profiling and Emulations with Memory Performance Model
Discovery Service for Jio Institute Digital Library