1. Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models
- Author
Fang, Yin, Liang, Xiaozhuan, Zhang, Ningyu, Liu, Kangwei, Huang, Rui, Chen, Zhuo, Fan, Xiaohui, and Chen, Huajun
- Subjects
FOS: Computer and information sciences; Computer Science - Machine Learning; Computer Science - Computation and Language; Computer Science - Artificial Intelligence; Quantitative Biology - Quantitative Methods; Computer Science - Information Retrieval; Machine Learning (cs.LG); Computational Engineering, Finance, and Science (cs.CE); Artificial Intelligence (cs.AI); FOS: Biological sciences; Computer Science - Computational Engineering, Finance, and Science; Computation and Language (cs.CL); Quantitative Methods (q-bio.QM); Information Retrieval (cs.IR)
- Abstract
Large Language Models (LLMs), with their remarkable task-handling capabilities and innovative outputs, have catalyzed significant advancements across a spectrum of fields. However, their proficiency within specialized domains such as biomolecular studies remains limited. To address this challenge, we introduce Mol-Instructions, a meticulously curated, comprehensive instruction dataset expressly designed for the biomolecular realm. Mol-Instructions is composed of three pivotal components: molecule-oriented instructions, protein-oriented instructions, and biomolecular text instructions, each curated to enhance the understanding and prediction capabilities of LLMs concerning biomolecular features and behaviors. Through extensive instruction tuning experiments on the representative LLM, we underscore the potency of Mol-Instructions to enhance the adaptability and cognitive acuity of large models within the complex sphere of biomolecular studies, thereby promoting advancements in the biomolecular research community. Mol-Instructions is made publicly accessible for future research endeavors and will be subjected to continual updates for enhanced applicability.
Comment: Project homepage: https://github.com/zjunlp/Mol-Instructions
- Published
2023