131 results for "Chunyuan Zhang"
Search Results
2. Automatic mapping and code optimization for OpenCL kernels on FT-matrix architecture (WIP paper).
3. SAI: Self-Adjusting Incremental Quantile Estimation for Sparse Training of Neural Networks on Hardware Accelerators.
4. Efficient Mini-batch Training for Echo State Networks.
5. Incremental Deployment of Programmable Switches for Sketch-based Network Measurement.
6. Towards High-Efficiency Data Centers via Job-Aware Network Scheduling.
7. HybridSketch: A Memory-centric Precise Approach for Flow Measurement.
8. Towards a Deep-Pipelined Architecture for Accelerating Deep GCN on a Multi-FPGA Platform.
9. Optimized HybridSketch: More Efficient with Analysis and Algorithm.
10. Towards Memory-Efficient Streaming Processing with Counter-Cascading Sketching on FPGA.
11. SACC: Configuring Application-Level Cache Intelligently for In-Memory Database Based on Long Short-Term Memory.
12. TBSW: Time-Based Sliding Window Algorithm for Network Traffic Measurement.
13. SWAP: a sliding window algorithm for in-network packet measurement.
14. KVSwitch: An In-network Load Balancer for Key-Value Stores.
15. Towards a Uniform Architecture for the Efficient Implementation of 2D and 3D Deconvolutional Neural Networks on FPGAs.
16. An Efficient Design Flow for Accelerating Complicated-connected CNNs on a Multi-FPGA Platform.
17. GENIE: QoS-guided Dynamic Scheduling for CNN-based Tasks on SME Clusters.
18. Multiple CNN-based Tasks Scheduling across Shared GPU Platform in Research and Development Scenarios.
19. High performance graph analytics with productivity on hybrid CPU-GPU platforms.
20. Towards a Multi-array Architecture for Accelerating Large-scale Matrix Multiplication on FPGAs.
21. Parallel programming course development based on parallel computational thinking.
22. Towards a Uniform Template-based Architecture for Accelerating 2D and 3D CNNs on FPGA.
23. Winograd Algorithm for 3D Convolution Neural Networks.
24. Optimizing OpenCL Implementation of Deep Convolutional Neural Network on FPGA.
25. RVNet: A fast and high energy efficiency network packet processing system on RISC-V.
26. DCC: Distributed Cache Consistency.
27. Multikernel Recursive Least-Squares Temporal Difference Learning.
28. Enabling Tissue-Scale Cardiac Simulations Using Heterogeneous Computing on Tianhe-2.
29. Improve security and availability for cloud storage.
30. Scalable FPGA-based Architecture for High-Performance Per-Flow Traffic Measurement.
31. Poster Abstract: A Template-based Framework for Generating Network Processor in FPGA.
32. Poster Abstract: Deep Learning Workloads Scheduling with Reinforcement Learning on GPU Clusters.
33. Enable Scale and Aspect Ratio Adaptability in Visual Tracking with Detection Proposals.
34. Fast tracking via context depth model learning.
35. Automated Transformation of GPU-Specific OpenCL Kernels Targeting Performance Portability on Multi-Core/Many-Core CPUs.
36. A fault detection mechanism in a Data-flow scheduled Multithreaded processor.
37. Rethread: A Low-Cost Transient Fault Recovery Scheme for Multithreaded Processors.
38. Utilizing Multiple Xeon Phi Coprocessors on One Compute Node.
39. Accelerating 3D CNN-based Lung Nodule Segmentation on a Multi-FPGA System.
40. Scale-out Acceleration for 3D CNN-based Lung Nodule Segmentation on a Multi-FPGA System.
41. On-demand thread-level fault detection in a concurrent programming environment.
42. Automatic Mapping Single-Device OpenCL Program to Heterogeneous Multi-device Platform.
43. Solving the Cardiac Model Using Multi-core CPU and Many Integrated Cores (MIC).
44. Device View Redundancy: An Adaptive Low-Overhead Fault Tolerance Mechanism for Many-Core System.
45. An Adaptive Low-Overhead Mechanism for Dependable General-Purpose Many-Core Processors.
46. On the GPU-CPU Performance Portability of OpenCL for 3D Stencil Computations.
47. Performance of Sediment Transport Simulations on NVIDIA's Kepler Architecture.
48. On the GPU Performance of 3D Stencil Computations Implemented in OpenCL.
49. ACF: Networks-on-Chip Deadlock Recovery with Accurate Detection and Elastic Credit.
50. Parallelization Design of Irregular Algorithms of Video Processing on GPUs.
Discovery Service for Jio Institute Digital Library