140 results for "Tu, Fengbin"
Search Results
1. Towards Efficient Control Flow Handling in Spatial Architecture via Architecting the Control Flow Plane
2. DyBit: Dynamic Bit-Precision Numbers for Efficient Quantized Neural Network Inference
3. SWG: An Architecture for Sparse Weight Gradient Computation
4. Alleviating Datapath Conflicts and Design Centralization in Graph Analytics Acceleration
5. H2Learn: High-Efficiency Learning Accelerator for High-Accuracy Spiking Neural Networks
6. Towards Efficient Generative AI and Beyond-AI Computing: New Trends on ISSCC 2024 Machine Learning Accelerators
7. 20.2 A 28nm 74.34TFLOPS/W BF16 Heterogeneous CIM-Based Accelerator Exploiting Denoising-Similarity for Diffusion Models
8. 15.1 A 0.795fJ/bit Physically-Unclonable Function-Protected TCAM for a Software-Defined Networking Switch
9. AdaP-CIM: Compute-in-Memory Based Neural Network Accelerator Using Adaptive Posit
10. PIM-HLS: An Automatic Hardware Generation Tool for Heterogeneous Processing-In-Memory-Based Neural Network Accelerators
11. AutoDCIM: An Automated Digital CIM Compiler
12. ECSSD: Hardware/Data Layout Co-Designed In-Storage-Computing Architecture for Extreme Classification
13. STAR: An STGCN ARchitecture for Skeleton-Based Human Action Recognition
14. SPG: Structure-Private Graph Database via SqueezePIR
15. Reconfigurability, Why It Matters in AI Tasks Processing: A Survey of Reconfigurable AI Chips
16. 16.1 MulTCIM: A 28nm 2.24µJ/Token Attention-Token-Bit Hybrid Sparse Digital CIM-Based Accelerator for Multimodal Transformers
17. 16.4 TensorCIM: A 28nm 3.7nJ/Gather and 8.3TFLOPS/W FP32 Digital-CIM Tensor Processor for MCM-CIM-Based Beyond-NN Acceleration
18. BIOS: A 40nm Bionic Sensor-Defined 0.47pJ/SOP, 268.7TSOPs/W Configurable Spiking Neuron-in-Memory Processor for Wearable Healthcare
19. MulTCIM: Digital Computing-in-Memory-Based Multimodal Transformer Accelerator With Attention-Token-Bit Hybrid Sparsity
20. SDP: Co-Designing Algorithm, Dataflow, and Architecture for In-SRAM Sparse NN Acceleration
21. ReDCIM: Reconfigurable Digital Computing-in-Memory Processor With Unified FP/INT Pipeline for Cloud AI Acceleration
22. SPCIM: Sparsity-Balanced Practical CIM Accelerator With Optimized Spatial-Temporal Multi-Macro Utilization
23. HDSuper: High-Quality and High Computational Utilization Edge Super-Resolution Accelerator With Hardware-Algorithm Co-Design Techniques
24. GQNA: Generic Quantized DNN Accelerator With Weight-Repetition-Aware Activation Aggregating
25. A 28nm 29.2TFLOPS/W BF16 and 36.5TOPS/W INT8 Reconfigurable Digital CIM Processor with Unified FP/INT Pipeline and Bitwise In-Memory Booth Multiplication for Cloud Deep Learning Acceleration
26. INSPIRE: IN-Storage Private Information REtrieval via Protocol and Architecture Co-design
27. Accelerating Spatiotemporal Supervised Training of Large-Scale Spiking Neural Networks on GPU
28. DOTA: Detect and Omit Weak Attentions for Scalable Transformer Acceleration
29. Dynamic Sparse Attention for Scalable Transformer Acceleration
30. A 28nm 15.59µJ/Token Full-Digital Bitline-Transpose CIM-Based Sparse Transformer Accelerator with Pipeline/Parallel Reconfigurable Modes
Discovery Service for Jio Institute Digital Library