AnchiBERT: A Pre-Trained Model for Ancient Chinese Language Understanding and Generation
- Source: IJCNN
- Publication Year: 2021
- Publisher: IEEE, 2021.
Abstract
- Ancient Chinese is the essence of Chinese culture. Several natural language processing tasks exist in the ancient Chinese domain, such as ancient-modern Chinese translation, poem generation, and couplet generation. Previous studies usually use supervised models that rely heavily on parallel data, but large-scale parallel data for ancient Chinese are difficult to obtain. To make full use of the more easily available monolingual ancient Chinese corpora, we release AnchiBERT, a pre-trained language model based on the BERT architecture and trained on large-scale ancient Chinese corpora. We evaluate AnchiBERT on both language understanding and generation tasks, including poem classification, ancient-modern Chinese translation, poem generation, and couplet generation. The experimental results show that AnchiBERT outperforms BERT as well as non-pretrained models and achieves state-of-the-art results in all cases.
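- Since AnchiBERT follows the standard BERT architecture, a checkpoint of this kind could in principle be fine-tuned for an understanding task such as poem classification with the usual Hugging Face `transformers` workflow. The sketch below is illustrative only: the checkpoint name `path/to/anchibert` and the label count are placeholders, not the published model identifier or task setup.

```python
# Minimal sketch: fine-tuning-style forward pass for poem classification
# with a BERT-style checkpoint. "path/to/anchibert" is a hypothetical
# placeholder for a locally available AnchiBERT checkpoint.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("path/to/anchibert")
model = BertForSequenceClassification.from_pretrained(
    "path/to/anchibert",
    num_labels=4,  # assumed number of poem categories; adjust to the task
)

# Encode one ancient Chinese line and predict its class.
inputs = tokenizer("床前明月光", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```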
Details
- Database: OpenAIRE
- Journal: 2021 International Joint Conference on Neural Networks (IJCNN)
- Accession number: edsair.doi...........962b5b3423bff756bfff3e570cb5d85e
- Full Text: https://doi.org/10.1109/ijcnn52387.2021.9534342