652 results for "Wen-mei W. Hwu"
Search Results
2. HiCCL: A Hierarchical Collective Communication Library.
3. LSM-GNN: Large-scale Storage-based Multi-GPU GNN Training by Optimizing Data Transfer Scheme.
4. GPU-Initiated On-Demand High-Throughput Storage Access in the BaM System Architecture.
5. Parallelizing Maximal Clique Enumeration on GPUs.
6. FSSD: FPGA-Based Emulator for SSDs.
7. An efficient GPU implementation and scaling for higher-order 3D stencils.
8. Exploring HW/SW Co-Design for Video Analysis on CPU-FPGA Heterogeneous Systems.
9. MemXCT: Design, Optimization, Scaling, and Reproducibility of X-Ray Tomography Imaging.
10. RackBlox: A Software-Defined Rack-Scale Storage System with Network-Storage Co-Design.
11. IGB: Addressing The Gaps In Labeling, Features, Heterogeneity, and Size of Public Graph Datasets for Deep Learning Research.
12. CODAG: Characterizing and Optimizing Decompression Algorithms for GPUs.
13. PIGEON: Optimizing CUDA Code Generator for End-to-End Training and Inference of Relational Graph Neural Networks.
14. Accelerating Fourier and Number Theoretic Transforms using Tensor Cores and Warp Shuffles.
15. Mixed Precision Quantization for ReRAM-based DNN Inference Accelerators.
16. PhraseScope: An Effective and Unsupervised Framework for Mining High Quality Phrases.
17. Node-Aware Stencil Communication for Heterogeneous Supercomputers.
18. FReaC Cache: Folded-logic Reconfigurable Computing in the Last Level Cache.
19. Alleviating Semantic-level Shift: A Semi-supervised Domain Adaptation Method for Semantic Segmentation.
20. The Design and Implementation of a Scalable Deep Learning Benchmarking Platform.
21. EDD: Efficient Differentiable DNN Architecture and Implementation Co-search for Embedded AI Solutions.
22. Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture.
23. PUMA: A Programmable Ultra-efficient Memristor-based Accelerator for Machine Learning Inference.
24. FlatFlash: Exploiting the Byte-Accessibility of SSDs within a Unified Memory-Storage Hierarchy.
25. Reinforcement Learning Based Text Style Transfer without Parallel Training Corpus.
26. MemXCT: memory-centric X-ray CT reconstruction with massive parallelization.
27. Accelerating Sparse Deep Neural Networks on FPGAs.
28. Update on k-truss Decomposition on GPU.
29. Update on Triangle Counting on GPU.
30. Analysis and Modeling of Collaborative Execution Strategies for Heterogeneous CPU-FPGA Architectures.
31. Near-Memory and In-Storage FPGA Acceleration for Emerging Cognitive Computing Workloads.
32. Accelerating reduction and scan using tensor core units.
33. A Compiler Framework for Optimizing Dynamic Parallelism on GPUs.
34. BaM: A Case for Enabling Fine-grain High Throughput GPU-Orchestrated Access to Storage.
35. Parallelizing Maximal Clique Enumeration on GPUs.
36. DLSpec: A Deep Learning Task Exchange Specification.
37. A Fast and Massively-Parallel Inverse Solver for Multiple-Scattering Tomographic Image Reconstruction.
38. Application-Transparent Near-Memory Processing Architecture with Memory Channel Network.
39. PyTorch-Direct: Enabling GPU Centric Data Access for Very Large Graph Neural Network Training with Irregular Accesses.
40. MLHarness: A Scalable Benchmarking System for MLCommons.
41. Graph Neural Network Training with Data Tiering.
42. K-Clique Counting on GPUs.
43. Open Relation Modeling: Learning to Define Relations between Entities.
44. Measuring Fine-Grained Domain Relevance of Terms: A Hierarchical Core-Fringe Approach.
45. RAI: A Scalable Project Submission System for Parallel Programming Courses.
46. Revisiting Online Autotuning for Sparse-Matrix Vector Multiplication Kernels on Next-Generation Architectures.
47. Interpretable and Globally Optimal Prediction for Textual Grounding using Image Concepts.
48. Generalize or Die: Operating Systems Support for Memristor-Based Accelerators.
49. Rebooting the Data Access Hierarchy of Computing Systems.
50. Hardware Acceleration of the Pair-HMM Algorithm for DNA Variant Calling.