chatHPC: Empowering HPC users with large language models.

Authors :
Yin, Junqi
Hines, Jesse
Herron, Emily
Ghosal, Tirthankar
Liu, Hong
Prentice, Suzanne
Lama, Vanessa
Wang, Feiyi
Source :
Journal of Supercomputing. Jan2025, Vol. 81 Issue 1, p1-27. 27p.
Publication Year :
2025

Abstract

The ever-growing number of pre-trained large language models (LLMs) across scientific domains presents a challenge for application developers. While these models offer vast potential, fine-tuning them with custom data, aligning them for specific tasks, and evaluating their performance remain crucial steps for effective utilization. However, applying these techniques to models with tens of billions of parameters can take days or even weeks on modern workstations, making the cumulative cost of model comparison and evaluation a significant barrier to LLM-based application development. To address this challenge, we introduce an end-to-end pipeline specifically designed for building conversational and programmable AI agents on high-performance computing (HPC) platforms. Our pipeline encompasses model pre-training, fine-tuning, and web and API service deployment, along with evaluations of lexical coherence, semantic accuracy, hallucination detection, and privacy considerations. We demonstrate our pipeline through the development of chatHPC, a chatbot for HPC question answering and script generation. Leveraging our scalable pipeline, we achieve end-to-end LLM alignment in under an hour on the Frontier supercomputer. We propose a novel self-improved, self-instruction method for instruction set generation, investigate scaling and fine-tuning strategies, and conduct a systematic evaluation of model performance. The established practices within chatHPC will serve as valuable guidance for future LLM-based application development on HPC platforms. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0920-8542
Volume :
81
Issue :
1
Database :
Academic Search Index
Journal :
Journal of Supercomputing
Publication Type :
Academic Journal
Accession number :
181104824
Full Text :
https://doi.org/10.1007/s11227-024-06637-1