
Baichuan 2: Open Large-scale Language Models

Authors :
Yang, Aiyuan
Xiao, Bin
Wang, Bingning
Zhang, Borong
Bian, Ce
Yin, Chao
Lv, Chenxu
Pan, Da
Wang, Dian
Yan, Dong
Yang, Fan
Deng, Fei
Wang, Feng
Liu, Feng
Ai, Guangwei
Dong, Guosheng
Zhao, Haizhou
Xu, Hang
Sun, Haoze
Zhang, Hongda
Liu, Hui
Ji, Jiaming
Xie, Jian
Dai, JunTao
Fang, Kun
Su, Lei
Song, Liang
Liu, Lifeng
Ru, Liyun
Ma, Luyao
Wang, Mang
Liu, Mickel
Lin, MingAn
Nie, Nuolan
Guo, Peidong
Sun, Ruiyang
Zhang, Tao
Li, Tianpeng
Li, Tianyu
Cheng, Wei
Chen, Weipeng
Zeng, Xiangrong
Wang, Xiaochuan
Chen, Xiaoxi
Men, Xin
Yu, Xin
Pan, Xuehai
Shen, Yanjun
Wang, Yiding
Li, Yiyu
Jiang, Youxin
Gao, Yuchen
Zhang, Yupeng
Zhou, Zenan
Wu, Zhiying
Publication Year :
2023

Abstract

Large language models (LLMs) have demonstrated remarkable performance on a variety of natural language tasks based on just a few examples of natural language instructions, reducing the need for extensive feature engineering. However, most powerful LLMs are closed-source or limited in their capability for languages other than English. In this technical report, we present Baichuan 2, a series of large-scale multilingual language models containing 7 billion and 13 billion parameters, trained from scratch on 2.6 trillion tokens. Baichuan 2 matches or outperforms other open-source models of similar size on public benchmarks like MMLU, CMMLU, GSM8K, and HumanEval. Furthermore, Baichuan 2 excels in vertical domains such as medicine and law. We will release all pre-training model checkpoints to benefit the research community in better understanding the training dynamics of Baichuan 2.

Comment: Baichuan 2 technical report. GitHub: https://github.com/baichuan-inc/Baichuan2
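Since the abstract announces openly released checkpoints, a minimal sketch of how such a model might be loaded with the Hugging Face transformers library is given below. The repository id "baichuan-inc/Baichuan2-7B-Base" and the trust_remote_code requirement are assumptions based on typical open-source LLM releases, not details stated in this record.

```python
# Minimal sketch: loading a released Baichuan 2 checkpoint with Hugging Face
# transformers. The repository id and trust_remote_code=True are assumptions,
# not details confirmed by this record.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "baichuan-inc/Baichuan2-7B-Base"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",        # place weights on available GPUs/CPU
    trust_remote_code=True,   # assumes the release ships custom modeling code
)

# Zero-/few-shot prompting as described in the abstract:
prompt = "Translate to French: The weather is nice today."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```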

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2309.10305
Document Type :
Working Paper