1. WaterGPT: Training a Large Language Model to Become a Hydrology Expert
- Author
Yi Ren, Tianyi Zhang, Xurong Dong, Weibin Li, Zhiyang Wang, Jie He, Hanzhi Zhang, and Licheng Jiao
- Subjects
WaterGPT, large language model, agent, prompt words, Hydraulic engineering, TC1-978, Water supply for domestic and industrial purposes, TD201-500
- Abstract
This paper introduces WaterGPT, a large language model designed for complex multimodal tasks in hydrology. WaterGPT is applied in three main areas: (1) processing and analyzing water resources data such as images and text, (2) supporting intelligent decision-making for hydrological tasks, and (3) enabling interdisciplinary information integration and knowledge-based Q&A. One core aspect of WaterGPT is the careful partitioning of the training data for the supervised fine-tuning phase. These data are drawn from real-world sources, annotated both manually and with GPT-series models, and divided into four types: knowledge-based, task-oriented, negative samples, and multi-turn dialogues. A second key component is a multi-agent framework called Water_Agent, which enables WaterGPT to invoke various tools to solve complex tasks in the field of water resources. The framework handles multimodal data, including text and images, allowing for deep understanding and analysis of complex hydrological environments. Based on this framework, WaterGPT has achieved a success rate of over 90% in tasks such as object detection and waterbody extraction. For the waterbody extraction task, measured with the Dice and mIoU metrics, WaterGPT's performance on high-resolution images from 2013 to 2022 has remained stable, with accuracy exceeding 90%. Moreover, we have constructed a high-quality water resources evaluation dataset, EvalWater, which covers 21 categories and approximately 10,000 questions. On this dataset, WaterGPT achieved the highest accuracy to date in the field of water resources, 83.09%, which is about 17.83 percentage points higher than GPT-4.
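The abstract reports waterbody extraction quality with the Dice and mIoU metrics but does not define them. The sketch below shows the standard definitions for binary masks, under stated assumptions: the function names `dice_and_iou` and `mean_iou` are hypothetical, masks are plain nested lists of 0/1, and mIoU is taken as the average of the water and background IoU. The paper's own evaluation code and class averaging may differ.

```python
def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two binary masks given as nested lists of 0/1.

    Illustrative helper only; not taken from the WaterGPT paper.
    """
    p = [bool(v) for row in pred for v in row]
    t = [bool(v) for row in truth for v in row]
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of pixels")

    inter = sum(a and b for a, b in zip(p, t))  # pixels set in both masks
    p_sum, t_sum = sum(p), sum(t)
    union = p_sum + t_sum - inter

    # Convention assumed here: two empty masks count as a perfect match.
    dice = 2 * inter / (p_sum + t_sum) if (p_sum + t_sum) else 1.0
    iou = inter / union if union else 1.0
    return dice, iou


def mean_iou(pred, truth):
    """Average the IoU of the water class (1) and the background class (0)."""
    def invert(mask):
        return [[0 if v else 1 for v in row] for row in mask]

    _, iou_water = dice_and_iou(pred, truth)
    _, iou_background = dice_and_iou(invert(pred), invert(truth))
    return (iou_water + iou_background) / 2


# Tiny made-up example: 1 marks a water pixel.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]

dice, iou = dice_and_iou(pred, truth)
miou = mean_iou(pred, truth)
print(f"Dice={dice:.4f} IoU={iou:.4f} mIoU={miou:.4f}")
```

In this toy case two water pixels overlap out of three predicted and three true, so Dice is 2·2/(3+3) and the water IoU is 2/4. Dice is never lower than IoU for the same pair of masks, which is worth keeping in mind when comparing scores reported under the two metrics.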
- Published
- 2024