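A Chinese error correction algorithm based on the fusion of text sequence error probability and Chinese spelling error probability.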

Authors :: 孙哲
 禹可
 吴晓非
Source :: Application Research of Computers / Jisuanji Yingyong Yanjiu. Aug2023, Vol. 40 Issue 8, p2292-2297. 6p.
Publication Year :: 2023
Abstract: Chinese spelling error correction is the task of detecting and correcting spelling errors in text. Most Chinese spelling errors are misuses of semantically, phonetically or morphologically similar characters, so it is common to extract features that model these different modalities. However, fusing the features directly, or summing them with fixed weights, ignores both the relative importance of the information from each modality and the model's bias in identifying errors, which prevents the model from learning efficiently. This paper proposes a new model to address this problem, called the Chinese error correction algorithm based on the fusion of text sequence error probability and Chinese spelling error probability. The method uses the text sequence error probability as a dynamic weight and the common Chinese spelling error probability as a fixed weight to fuse semantic, phonetic and morphological information efficiently. The model can reasonably control how much information from each modality flows into the mixed-modality representation, and it learns more specifically where errors occur. Experiments on the SIGHAN benchmark show that the proposed model improves all evaluation scores on the different datasets, which validates the feasibility of the algorithm. [ABSTRACT FROM AUTHOR]
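
The abstract does not give the fusion formula, so the following is only a minimal sketch of one way a dynamic weight and a fixed weight could be combined. It assumes that the per-character error probability from a detector gates between the semantic vector and a phonetic/glyph mixture, and that the fixed weights are prior shares of phonetic and shape-related errors. The function names, the gating form and the parameters `phonetic_prior` and `glyph_prior` are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Vector = List[float]


def fuse_position(
    semantic: Sequence[float],
    phonetic: Sequence[float],
    glyph: Sequence[float],
    error_prob: float,
    phonetic_prior: float,
    glyph_prior: float,
) -> Vector:
    """Fuse three feature vectors for a single character position.

    Assumed form (not from the paper):
        fused = (1 - e) * semantic + e * (w_p * phonetic + w_g * glyph)
    where e is the dynamic weight (detector error probability) and
    w_p, w_g are the fixed weights, normalized to sum to 1.
    """
    if not 0.0 <= error_prob <= 1.0:
        raise ValueError("error_prob must lie in [0, 1]")
    if not (len(semantic) == len(phonetic) == len(glyph)):
        raise ValueError("all feature vectors must have the same length")
    total = phonetic_prior + glyph_prior
    if total <= 0.0:
        raise ValueError("fixed priors must sum to a positive value")

    # Fixed weights: normalized prior shares of the two error types.
    w_p = phonetic_prior / total
    w_g = glyph_prior / total

    # Dynamic weight: the more likely the position is wrong, the more the
    # phonetic and glyph features flow into the mixed representation.
    return [
        (1.0 - error_prob) * s + error_prob * (w_p * p + w_g * g)
        for s, p, g in zip(semantic, phonetic, glyph)
    ]


def fuse_sequence(
    semantic: Sequence[Sequence[float]],
    phonetic: Sequence[Sequence[float]],
    glyph: Sequence[Sequence[float]],
    error_probs: Sequence[float],
    phonetic_prior: float,
    glyph_prior: float,
) -> List[Vector]:
    """Apply fuse_position to every character position in a sentence."""
    if not (len(semantic) == len(phonetic) == len(glyph) == len(error_probs)):
        raise ValueError("all per-position inputs must have the same length")
    return [
        fuse_position(s, p, g, e, phonetic_prior, glyph_prior)
        for s, p, g, e in zip(semantic, phonetic, glyph, error_probs)
    ]
```

Under this assumed form, a position the detector considers correct (`error_prob` near 0) keeps its semantic vector almost unchanged, while a position flagged as likely wrong draws mostly on the phonetic and glyph features, split according to the fixed priors.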