Author: "Chen, Zhuofu" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Chen, Zhuofu"' showing total 3 results

1. TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention

Author: Yang, Lijie, Zhang, Zhihao, Chen, Zhuofu, Li, Zikun, and Jia, Zhihao
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Large language models (LLMs) have driven significant advancements across diverse NLP tasks, with long-context models gaining prominence for handling extended inputs. However, the expanding key-value (KV) cache size required by Transformer architectures intensifies the memory constraints, particularly during the decoding phase, creating a significant bottleneck. Existing sparse attention mechanisms designed to address this bottleneck have two limitations: (1) they often fail to reliably identify the most relevant tokens for attention, and (2) they overlook the spatial coherence of token selection across consecutive Transformer layers, which can lead to performance degradation and substantial overhead in token selection. This paper introduces TidalDecode, a simple yet effective algorithm and system for fast and accurate LLM decoding through position persistent sparse attention. TidalDecode leverages the spatial coherence of tokens selected by existing sparse attention methods and introduces a few token selection layers that perform full attention to identify the tokens with the highest attention scores, while all other layers perform sparse attention with the pre-selected tokens. This design enables TidalDecode to substantially reduce the overhead of token selection for sparse attention without sacrificing the quality of the generated results. Evaluation on a diverse set of LLMs and tasks shows that TidalDecode closely matches the generative performance of full attention methods while reducing the LLM decoding latency by up to 2.1x.
Published: 2024

2. Characterizing Network Requirements for GPU API Remoting in AI Applications

Author: Wang, Tianxia, Chen, Zhuofu, Wei, Xingda, Gu, Jinyu, Chen, Rong, and Chen, Haibo
Subjects: Computer Science - Operating Systems, Computer Science - Networking and Internet Architecture
Abstract: GPU remoting is a promising technique for supporting AI applications. Networking plays a key role in enabling remoting. However, for efficient remoting, the network requirements in terms of latency and bandwidth are unknown. In this paper, we take a GPU-centric approach to derive the minimum latency and bandwidth requirements for GPU remoting, while ensuring no (or little) performance degradation for AI applications. Our study, which includes a theoretical model, demonstrates that, with careful remoting design, unmodified AI applications can run on the remoting setup using commodity networking hardware with low network demands and without any overhead, or even with better performance.
Published: 2024

3. Characterization of Coal Combustion and Steam Temperature with Respect to Staged-Air Angle in a 600 MWe Down-Fired Boiler

Author: Kuang, Min, Li, Zhengqi, Ling, Zhongqian, Chen, Zhuofu, and Yuan, Danyan
Published: 2014
