Obtaining Parallel Sentences in Low-Resource Language Pairs with Minimal Supervision.

Authors :
Shi, Xiayang
Yue, Ping
Liu, Xinyi
Xu, Chun
Xu, Lin
Source :
Computational Intelligence & Neuroscience. 8/3/2022, p1-9. 9p.
Publication Year :
2022

Abstract

Machine translation relies on parallel sentences, and their quantity is an important factor in the performance of machine translation systems, especially for low-resource languages. Recent advances in learning cross-lingual word representations from nonparallel data open a new possibility for obtaining bilingual sentences with minimal supervision in low-resource languages. In this paper, we introduce a novel methodology for obtaining parallel sentences using only a small bilingual seed lexicon of only hundreds of entries. We first obtain bilingual semantic representations by establishing a cross-lingual mapping between monolingual word representations via the seed lexicon. Then, we construct a deep learning classifier to extract bilingual parallel sentences. We demonstrate the effectiveness of our methodology by harvesting Uyghur-Chinese parallel sentences and constructing a machine translation system. The experiments indicate that our method can obtain a large number of high-accuracy parallel sentences in low-resource language pairs. [ABSTRACT FROM AUTHOR]
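
The abstract does not say how the cross-lingual mapping is computed. One common way to learn such a mapping from a small seed lexicon is an orthogonal Procrustes fit between two monolingual embedding spaces, sketched below as an illustration only. The function names (`learn_mapping`, `nearest_target`), the synthetic embeddings, and the choice of Procrustes itself are assumptions, not details taken from the paper, and the paper's deep learning classifier is not reproduced here.

```python
import numpy as np


def learn_mapping(src_vecs, tgt_vecs):
    """Return the orthogonal W minimising ||src_vecs @ W - tgt_vecs||_F.

    Rows of src_vecs and tgt_vecs are embeddings of seed-lexicon pairs
    (hypothetical inputs; the paper's actual procedure may differ).
    """
    u, _, vt = np.linalg.svd(src_vecs.T @ tgt_vecs)
    return u @ vt


def nearest_target(mapped_src, tgt_vecs):
    """For each mapped source vector, index of the most cosine-similar target."""
    a = mapped_src / np.linalg.norm(mapped_src, axis=1, keepdims=True)
    b = tgt_vecs / np.linalg.norm(tgt_vecs, axis=1, keepdims=True)
    return np.argmax(a @ b.T, axis=1)


# Synthetic stand-in for two monolingual embedding spaces: the "target"
# space is an exact rotation of the "source" space (an idealised assumption).
rng = np.random.default_rng(0)
dim = 4
src = rng.normal(size=(20, dim))
true_rot, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
tgt = src @ true_rot

# Only the first 8 word pairs play the role of the seed lexicon.
seed = np.arange(8)
W = learn_mapping(src[seed], tgt[seed])

# Map every source word into the target space and look up its translation.
nearest = nearest_target(src @ W, tgt)
```

In this idealised setting the mapping learned from the 8 seed pairs also aligns the 12 words outside the lexicon. With real embeddings the two spaces are only approximately related by a rotation, so nearest-neighbour lookups would be noisy and a downstream classifier, as the abstract describes, would still be needed to decide which sentence pairs are parallel.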

Details

Language :
English
ISSN :
16875265
Database :
Academic Search Index
Journal :
Computational Intelligence & Neuroscience
Publication Type :
Academic Journal
Accession number :
158330732
Full Text :
https://doi.org/10.1155/2022/5296946