Improving Statistical Word Alignments with Morpho-syntactic Transformations.

Authors :
Salakoski, Tapio
Ginter, Filip
Pyysalo, Sampo
Pahikkala, Tapio
Gispert, Adrià
Gupta, Deepa
Popović, Maja
Lambert, Patrik
Mariño, Jose B.
Federico, Marcello
Ney, Hermann
Banchs, Rafael
Source :
Advances in Natural Language Processing; 2006, p368-379, 12p
Publication Year :
2006

Abstract

This paper presents a wide range of statistical word alignment experiments incorporating morphosyntactic information. By transforming the parallel corpus using information from POS tagging, lemmatization, or stemming, we explore which linguistic information helps reduce alignment error rates. To this end, alignments are evaluated against a human word alignment reference, with the aim of an improved machine translation training scheme that eventually leads to better statistical machine translation (SMT) performance. Experiments are carried out on a Spanish-English European Parliament Proceedings parallel corpus, in both a large and a small data track. As expected, improvements from introducing morphosyntactic information are larger when data are scarce, but a significant improvement is also achieved in the large data task, meaning that certain linguistic knowledge is relevant even when large amounts of data are available. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISBNs :
9783540373346
Database :
Complementary Index
Journal :
Advances in Natural Language Processing
Publication Type :
Book
Accession number :
32883578
Full Text :
https://doi.org/10.1007/11816508_38