Structured queries, language modeling, and relevance modeling in cross-language information retrieval
- Source :
- Information Processing & Management. 41:457-473
- Publication Year :
- 2005
- Publisher :
- Elsevier BV, 2005.
- Abstract :
- Two probabilistic approaches to cross-lingual retrieval are in wide use today: those based on probabilistic models of relevance, as exemplified by INQUERY, and those based on language modeling. INQUERY, as a query net model, allows the easy incorporation of query operators, including a synonym operator, which has proven to be extremely useful in cross-language information retrieval (CLIR), in an approach often called structured query translation. In contrast, language models incorporate translation probabilities into a unified framework. We compare the two approaches on Arabic and Spanish data sets, using two kinds of bilingual dictionaries: one derived from a conventional dictionary, and one derived from a parallel corpus. We find that structured query processing gives slightly better results when queries are not expanded. On the other hand, when queries are expanded, language modeling gives better results, but only when using a probabilistic dictionary derived from a parallel corpus. We pursue two additional issues inherent in the comparison of structured query processing with language modeling. The first concerns query expansion, and the second is the role of translation probabilities. We compare conventional expansion techniques (pseudo-relevance feedback) with relevance modeling, a new IR approach which fits into the formal framework of language modeling. We find that relevance modeling and pseudo-relevance feedback achieve comparable levels of retrieval and that good translation probabilities confer a small but significant advantage.
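The contrast the abstract draws between the synonym operator and translation probabilities can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the interpolation weight `lam`, the background probability `collection_prob`, and the toy Spanish-to-English data are all assumptions made for illustration, and the language-model formula shown is one common cross-lingual formulation rather than the one confirmed by this record.

```python
from collections import Counter


def synonym_tf(doc_tokens, translations):
    """Structured query translation (synonym operator), sketched:
    every translation of a source-language query term is treated as
    the same pseudo-term, so their document counts are simply summed.
    """
    counts = Counter(doc_tokens)
    return sum(counts[t] for t in translations)


def translation_lm_prob(doc_tokens, trans_probs, collection_prob, lam=0.5):
    """Language-modeling view, sketched (assumed formulation):
    P(s | D) = lam * sum_t P(s | t) * P(t | D) + (1 - lam) * P(s | C)
    where s is the source-language query term, t ranges over its
    translations, and P(s | C) is a background (collection) estimate.
    """
    counts = Counter(doc_tokens)
    n = len(doc_tokens)
    if n == 0:
        p_doc = 0.0
    else:
        # Each translation contributes in proportion to its probability.
        p_doc = sum(p * counts[t] / n for t, p in trans_probs.items())
    return lam * p_doc + (1 - lam) * collection_prob


# Hypothetical toy data: the Spanish query term "banco" and an English document.
doc = ["bank", "river", "bank", "money"]
tf = synonym_tf(doc, {"bank", "bench"})
prob = translation_lm_prob(doc, {"bank": 0.8, "bench": 0.2}, collection_prob=0.01)
```

In this sketch the synonym operator weights every translation equally, whereas the language-model score lets a dictionary derived from a parallel corpus down-weight unlikely translations, which is the role of translation probabilities that the abstract examines.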
- Subjects :
- Information retrieval
Modeling language
Computer science
Information Systems: Information Storage and Retrieval
Library and Information Sciences
Management Science and Operations Research
Query language
Computer Science Applications
Query expansion
Media Technology
Data control language
Relevance (information retrieval)
Language model
Artificial intelligence
Natural language processing
Cross-language information retrieval
Information Systems
RDF query language
Details
- ISSN :
- 0306-4573
- Volume :
- 41
- Database :
- OpenAIRE
- Journal :
- Information Processing & Management
- Accession number :
- edsair.doi...........6b817f50b634ce9186b722c1eb60047e
- Full Text :
- https://doi.org/10.1016/j.ipm.2004.06.008