101. AceGPT, Localizing Large Language Models in Arabic
- Author
- Huang, Huang; Yu, Fei; Zhu, Jianqing; Sun, Xuening; Cheng, Hao; Song, Dingjie; Chen, Zhihong; Alharthi, Abdulmohsen; An, Bang; He, Juncai; Liu, Ziche; Zhang, Zhiyi; Chen, Junying; Li, Jianquan; Wang, Benyou; Zhang, Lian; Sun, Ruoyu; Wan, Xiang; Li, Haizhou; and Xu, Jinchao
- Subjects
Computer Science - Computation and Language
- Abstract
This paper is devoted to the development of a localized Large Language Model (LLM) specifically for Arabic, a language imbued with unique cultural characteristics inadequately addressed by current mainstream models. Significant concerns emerge when addressing cultural sensitivity and local values. To address these concerns, the paper proposes a comprehensive solution that includes further pre-training with Arabic texts, Supervised Fine-Tuning (SFT) using native Arabic instructions and GPT-4 responses in Arabic, and Reinforcement Learning with AI Feedback (RLAIF) employing a reward model attuned to local culture and values. The goal is to cultivate culturally cognizant and value-aligned Arabic LLMs capable of accommodating the diverse, application-specific needs of Arabic-speaking communities. Comprehensive evaluations reveal that the resulting model, dubbed 'AceGPT', sets the state-of-the-art standard for open Arabic LLMs across various benchmarks. Code, data, and models are available at https://github.com/FreedomIntelligence/AceGPT.
- Comment
Accepted to NAACL main conference.
- Published
- 2023