Search

Your search for "Jia, Zhihao" returned 443 results

Search Constraints

You searched for: Author "Jia, Zhihao"

Search Results

1. MagicPIG: LSH Sampling for Efficient LLM Generation

2. TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention

3. Atlas: Hierarchical Partitioning for Quantum Circuit Simulation on GPUs (Extended Version)

4. GraphPipe: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism

5. SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices

6. Helix: Distributed Serving of Large Language Models via Max-Flow on Heterogeneous GPUs

7. A Multi-Level Superoptimizer for Tensor Programs

8. Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances

9. FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning

10. Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding

11. Accelerating Retrieval-Augmented Language Model Serving with Speculation

12. Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems

13. Drone-NeRF: Efficient NeRF Based 3D Scene Reconstruction for Large-Scale Drone Survey

14. Quarl: A Learning-Based Quantum Circuit Optimizer

15. SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification

17. Research on Control System of Three-Phase Isolated AC/DC Converter

18. Quark: A Gradient-Free Quantum Learning Framework for Classification Tasks

19. OLLIE: Derivation-based Tensor Program Optimizer

20. BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs

21. Optimizing Mixture of Experts using Dynamic Recompilations

22. Bamboo: Making Preemptible Instances Resilient for Affordable Training of Large DNNs

23. Quartz: Superoptimization of Quantum Circuits (Extended Version)

24. TopoOpt: Co-optimizing Network Topology and Parallelization Strategy for Distributed Training Jobs

26. Quanto: Optimizing Quantum Circuits with Automatic Generation of Circuit Identities

28. Collage: Seamless Integration of Deep Learning Backends with Automatic Placement

29. TOD: GPU-accelerated Outlier Detection via Tensor Operations

30. GradSign: Model Performance Inference with Theoretical Insights

32. Dorylus: Affordable, Scalable, and Accurate GNN Training with Distributed CPU Servers and Serverless Threads

35. Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models

36. IOS: Inter-Operator Scheduler for CNN Acceleration

43. Redundancy-Free Computation Graphs for Graph Neural Networks

46. Beyond Data and Model Parallelism for Deep Neural Networks

47. Exploring Hidden Dimensions in Parallelizing Convolutional Neural Networks
