Pinyin-to-Chinese conversion on sentence-level for domain-specific applications using self-attention model

Authors :
Bingkun Wang
Shufeng Xiong
Li Ma
Ming Cheng
Source :
Multimedia Systems. 28:375-386
Publication Year :
2021
Publisher :
Springer Science and Business Media LLC, 2021.

Abstract

The performance of a pinyin-based Chinese input method engine (IME) depends mainly on its Pinyin-to-Chinese (P2C) conversion module. Traditional P2C methods follow a pipeline procedure, which typically suffers from error propagation. In addition, whole-sentence input in pinyin-based Chinese IMEs for domain-specific applications needs to be improved. In this paper, we propose a neural self-attention model for Pinyin Sequence to Chinese Sequence (PS2CS) conversion, which directly infers the entire Chinese sequence from the unsegmented pinyin character sequence fed into it. Our experimental results show that the proposed method outperforms the baselines and a commercial IME on a domain-specific medical dataset, and achieves comparable performance on a general-domain dataset.
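
To illustrate the idea of feeding an unsegmented pinyin character sequence to a self-attention layer, the following is a minimal sketch and not the authors' architecture. It assumes toy random character embeddings and identity query/key/value projections; the helper names (`embed_pinyin`, `self_attention`), the example string, and the embedding size are hypothetical. A real model would presumably add learned projections, positional information, and an output layer that predicts Chinese characters.

```python
import math
import random

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(X):
    """Scaled dot-product self-attention with identity projections (Q = K = V = X)."""
    d = len(X[0])
    out, weights = [], []
    for q in X:
        # Similarity of this position to every position, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in X]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of all input vectors gives the contextual vector.
        out.append([sum(w[j] * X[j][c] for j in range(len(X))) for c in range(d)])
    return out, weights

def embed_pinyin(text, dim=8, seed=0):
    """Map each character of an unsegmented pinyin string to a toy random vector."""
    rng = random.Random(seed)
    table = {ch: [rng.uniform(-1, 1) for _ in range(dim)] for ch in sorted(set(text))}
    return [table[ch] for ch in text]

# Unsegmented pinyin characters: no syllable boundaries are supplied.
pinyin = "woaibeijing"
X = embed_pinyin(pinyin)
contextual, attn = self_attention(X)
```

Each position attends to every other character in the string, so no separate syllable-segmentation step is needed before the model sees the input. Because this sketch omits positional information, repeated characters (the three occurrences of `i` here) receive identical output vectors.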

Details

ISSN :
1432-1882 and 0942-4962
Volume :
28
Database :
OpenAIRE
Journal :
Multimedia Systems
Accession number :
edsair.doi...........c95046c36fdeafcd380499cde7c1947b