Obtaining Parallel Sentences in Low-Resource Language Pairs with Minimal Supervision.
- Source :
- Computational Intelligence & Neuroscience. 8/3/2022, p1-9. 9p.
- Publication Year :
- 2022
Abstract
- Machine translation relies on parallel sentences, and the number of available sentence pairs is an important factor in the performance of machine translation systems, especially for low-resource languages. Recent advances in learning cross-lingual word representations from nonparallel data open a new possibility for obtaining bilingual sentences with minimal supervision in low-resource languages. In this paper, we introduce a novel methodology for obtaining parallel sentences using only a small bilingual seed lexicon of a few hundred entries. We first obtain bilingual semantic representations by using the seed lexicon to establish a cross-lingual mapping between the monolingual word representations of the two languages. Then, we construct a deep learning classifier to extract bilingual parallel sentences. We demonstrate the effectiveness of our methodology by harvesting Uyghur-Chinese parallel sentences and constructing a machine translation system. The experiments indicate that our method can obtain a large number of high-accuracy parallel sentences in low-resource language pairs. [ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISSN :
- 16875265
- Database :
- Academic Search Index
- Journal :
- Computational Intelligence & Neuroscience
- Publication Type :
- Academic Journal
- Accession number :
- 158330732
- Full Text :
- https://doi.org/10.1155/2022/5296946