Search

Your search for author "Venkataramani, Swagath" returned 169 results.


Search Results

1. SmartQuant: CXL-based AI Model Store in Support of Runtime Configurable Weight Quantization

2. Enhance DNN Adversarial Robustness and Efficiency via Injecting Noise to Non-Essential Neurons

3. Approximate Computing and the Efficient Machine Learning Expedition

4. Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization

5. 4-bit Quantization of LSTM-based Speech Recognition Models

6. ScaleCom: Scalable Sparsified Gradient Compression for Communication-Efficient Distributed Training

7. Workload-aware Automatic Parallelization for Multi-GPU DNN Training

8. Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)

9. PACT: Parameterized Clipping Activation for Quantized Neural Networks

10. SparCE: Sparsity aware General Purpose Core Extensions to Accelerate Deep Neural Networks

11. DyVEDeep: Dynamic Variable Effort Deep Neural Networks

13. Multiplier-less Artificial Neurons Exploiting Error Resiliency for Energy-Efficient Neural Computing

14. Energy-Efficient Object Detection using Semantic Decomposition

15. Exploring Spin-Transfer-Torque Devices for Logic Applications

18. 14.1 A Software-Assisted Peak Current Regulation Scheme to Improve Power-Limited Inference Performance in a 5nm AI SoC

25. Accelerating DNN Training Through Selective Localized Learning

26. A 7-nm Four-Core Mixed-Precision AI Chip With 26.2-TFLOPS Hybrid-FP8 Training, 104.9-TOPS INT4 Inference, and Workload-Aware Throttling

27. 4-Bit Quantization of LSTM-Based Speech Recognition Models

29. RaPiD: AI Accelerator for Ultra-low Precision Training and Inference

31. 9.1 A 7nm 4-Core AI Chip with 25.6TFLOPS Hybrid FP8 Training, 102.4TOPS INT4 Inference and Workload-Aware Throttling

33. Efficient AI System Design With Cross-Layer Approximate Computing

35. A 3.0 TFLOPS 0.62V Scalable Processor Core for High Compute Utilization AI Training and Inference

36. DyVEDeep

39. DeepTools: Compiler and Execution Runtime Extensions for RaPiD AI Accelerator

41. BiScaled-DNN

46. A Scalable Multi-TeraOPS Core for AI Training and Inference

47. Approximate computing: An integrated cross-layer framework

48. DyVEDeep: Dynamic Variable Effort Deep Neural Networks

49. Across the Stack Opportunities for Deep Learning Acceleration

50. Taming the beast
