Cross-Lingual Passage Re-Ranking With Alignment Augmented Multilingual BERT
- Source :
- IEEE Access, Vol 8, Pp 213232-213243 (2020)
- Publication Year :
- 2020
- Publisher :
- IEEE, 2020.
Abstract
- Cross-lingual Passage Re-ranking (XPR) aims to rank a list of candidate passages in multiple languages given a query. The task faces two main challenges: (1) the query and the passages to be ranked are often in different languages, which requires strong cross-lingual alignment, and (2) there is a lack of annotated data for model training and evaluation. In this article, we propose a two-stage approach to address these issues. In the first stage, we introduce Cross-lingual Paraphrase Identification (XPI) as an additional pre-training task that augments the alignment by leveraging a large unsupervised parallel corpus. This task aims to identify whether two sentences, which may be in different languages, have the same meaning. In the second stage, we introduce and compare three effective strategies for cross-lingual training. To verify the effectiveness of our method, we construct an XPR dataset by assembling and modifying two monolingual datasets. Experimental results show that our augmented pre-training contributes significantly to the XPR task. In addition, we directly transfer the trained model to out-of-domain test data constructed by modifying three multilingual Question Answering (QA) datasets. The results demonstrate the cross-domain robustness of the proposed approach.
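As a rough illustration of how XPI training examples might be derived from a parallel corpus, the sketch below treats each aligned sentence pair as a positive example and pairs each source sentence with a randomly chosen non-aligned target as a negative example. The abstract does not say how negatives are sampled, so the random mismatch strategy, the function name `build_xpi_pairs`, and the toy corpus are all assumptions made for illustration, not the authors' procedure.

```python
import random


def build_xpi_pairs(parallel, seed=0):
    """Build (sentence_a, sentence_b, label) examples from aligned pairs.

    parallel: list of (source_sentence, target_sentence) translations.
    label 1 = same meaning (aligned pair), label 0 = different meaning.
    Negative sampling by random mismatch is an assumption of this sketch.
    """
    rng = random.Random(seed)
    n = len(parallel)
    examples = []
    for i, (src, tgt) in enumerate(parallel):
        # Positive example: the aligned translation pair.
        examples.append((src, tgt, 1))
        if n > 1:
            # Negative example: pick a target from any other index than i.
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1
            examples.append((src, parallel[j][1], 0))
    rng.shuffle(examples)
    return examples


# Hypothetical toy corpus (English-French), for illustration only.
corpus = [
    ("hello", "bonjour"),
    ("thank you", "merci"),
    ("good night", "bonne nuit"),
]
pairs = build_xpi_pairs(corpus, seed=1)
```

Each resulting triple could then be fed to a sentence-pair classifier; a real pipeline would also need to handle duplicate sentences in the corpus, which this sketch ignores.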
- Subjects :
- Cross lingual
General Computer Science
Computer science
InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL
02 engineering and technology
010501 environmental sciences
computer.software_genre
01 natural sciences
cross-lingual learning
Paraphrase
Data modeling
pre-training tasks
0202 electrical engineering, electronic engineering, information engineering
Question answering
General Materials Science
Passage re-ranking
0105 earth and related environmental sciences
business.industry
General Engineering
Ranking
Re ranking
Task analysis
020201 artificial intelligence & image processing
Artificial intelligence
lcsh:Electrical engineering. Electronics. Nuclear engineering
business
computer
lcsh:TK1-9971
Natural language processing
Details
- Language :
- English
- ISSN :
- 21693536
- Volume :
- 8
- Database :
- OpenAIRE
- Journal :
- IEEE Access
- Accession number :
- edsair.doi.dedup.....19085ff263be10fd6fc3fd68e7bb22b9