AnchiBERT: A Pre-Trained Model for Ancient Chinese Language Understanding and Generation
- Author
Dayiheng Liu, Jiancheng Lv, Huishuang Tian, and Kexin Yang
- Subjects
Vocabulary, Poetry, Computer science, Chinese culture, Data modeling, Task analysis, Artificial intelligence, Couplet, Language model, Architecture, Natural language processing
- Abstract
Ancient Chinese is the essence of Chinese culture. Several natural language processing tasks exist in the ancient Chinese domain, such as ancient-modern Chinese translation, poem generation, and couplet generation. Previous studies usually use supervised models, which rely heavily on parallel data. However, it is difficult to obtain large-scale parallel data for ancient Chinese. To make full use of the more easily available monolingual ancient Chinese corpora, we release AnchiBERT, a pre-trained language model based on the architecture of BERT and trained on large-scale ancient Chinese corpora. We evaluate AnchiBERT on both language understanding and generation tasks, including poem classification, ancient-modern Chinese translation, poem generation, and couplet generation. The experimental results show that AnchiBERT outperforms BERT as well as the non-pretrained models and achieves state-of-the-art results in all cases.
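The pretraining described above follows BERT's masked-language-model objective: a fraction of tokens in the monolingual corpus is corrupted, and the model learns to recover the originals. The paper does not publish this routine; the sketch below is a minimal, self-contained illustration of the standard BERT masking scheme (15% of positions selected; of those, 80% replaced by `[MASK]`, 10% by a random vocabulary token, 10% left unchanged), with all function and variable names being illustrative assumptions.

```python
import random

def mask_tokens(tokens, vocab, mask_token="[MASK]", mask_prob=0.15, seed=0):
    """BERT-style masked-LM corruption (illustrative sketch, not the
    authors' code). Roughly `mask_prob` of the positions become
    prediction targets; of those, 80% are replaced with [MASK], 10%
    with a random vocabulary token, and 10% are kept unchanged."""
    rng = random.Random(seed)
    corrupted = list(tokens)
    labels = [None] * len(tokens)  # None = position not predicted
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            labels[i] = tok  # the model must recover the original token
            roll = rng.random()
            if roll < 0.8:
                corrupted[i] = mask_token          # 80%: mask it
            elif roll < 0.9:
                corrupted[i] = rng.choice(vocab)   # 10%: random token
            # else 10%: leave the token as-is but still predict it

    return corrupted, labels

# Example on a line of classical Chinese (character-level tokens):
poem = list("床前明月光疑是地上霜")
vocab = list("床前明月光疑是地上霜举头低思故乡")
corrupted, labels = mask_tokens(poem, vocab, seed=42)
```

For ancient Chinese, character-level tokenization as shown here is a natural fit, since classical texts are largely single-character morphemes; the actual AnchiBERT vocabulary and tokenizer may differ.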
- Published
- 2021