Finding Sami Cognates with a Character-Based NMT Approach
- Source :
- Proceedings of the Workshop on Computational Methods for Endangered Languages.
- Publication Year :
- 2019
- Publisher :
- University of Colorado at Boulder, 2019.
Abstract
- We approach the problem of expanding the set of cognate relations with a sequence-to-sequence NMT model. The language pair of interest, Skolt Sami and North Sami, has too limited a set of parallel data for an NMT model as such. We solve this problem, on the one hand, by training the model on cognates between North Sami and other Uralic languages and, on the other, by generating additional synthetic training data with an SMT model. The cognates found using our method are made publicly available in the Online Dictionary of Uralic Languages.
- Subjects :
- Training set
Computer science
business.industry
education
02 engineering and technology
010501 environmental sciences
computer.software_genre
01 natural sciences
Set (abstract data type)
Character (mathematics)
Online dictionary
0202 electrical engineering, electronic engineering, information engineering
6121 Languages
020201 artificial intelligence & image processing
Cognate
Artificial intelligence
business
computer
Natural language processing
0105 earth and related environmental sciences
Details
- Database :
- OpenAIRE
- Journal :
- Proceedings of the Workshop on Computational Methods for Endangered Languages
- Accession number :
- edsair.doi.dedup.....38624283e4f56576c363e028ef7d05c1
- Full Text :
- https://doi.org/10.33011/computel.v1i.395