TEII: Topic enhanced inverted index for top-k document retrieval.
- Source :
- Knowledge-Based Systems. Nov2015, Vol. 89, p346-358. 13p.
- Publication Year :
- 2015
Abstract
- In recent years, topic modeling has been gaining significant momentum in information retrieval (IR). Researchers have found that combining the topic information generated through topic modeling with traditional TF-IDF information produces superior results in document retrieval. However, to apply this idea to real-life IR systems, two critical problems need to be solved: how to store the topic information, and how to use it together with the TF-IDF information for efficient document retrieval. In this paper, we propose the Topic Enhanced Inverted Index (TEII) to incorporate the topic information into the inverted index for efficient top-k document retrieval. Specifically, we explore two different types of TEIIs. We first propose the incremental TEII, which incorporates the topic information into the traditional inverted index by adding topic-based inverted lists. The incremental TEII is beneficial for legacy IR systems, since it does not change the existing TF-IDF-based inverted lists. As a more flexible alternative, we propose the hybrid TEII, which incorporates the topic information into each posting of the inverted index. In the hybrid TEII, two relaxation methods are proposed to support dynamic estimation of the upper-bound impact of each posting. The hybrid TEII is highly extensible for incorporating different ranking factors, and we show an extension of the hybrid TEII that considers the static quality of the documents in the corpus. Based on the incremental and hybrid TEIIs, we develop several query processing algorithms to support efficient top-k document retrieval on TEIIs. Empirical evaluation on the TREC dataset verifies the effectiveness and efficiency of the proposed index structures and query processing algorithms. [ABSTRACT FROM AUTHOR]
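- Illustrative sketch (not part of the author abstract): the hybrid idea of storing topic information in each posting can be pictured with a few lines of Python. Everything named here is an assumption made for illustration: the functions `build_hybrid_index` and `top_k`, the posting layout `(doc_id, tfidf_weight, topic_weight)`, and the linear mix controlled by `alpha`. The abstract does not state the scoring formula, and the exhaustive scan below stands in for, and is much simpler than, the upper-bound-based query processing the paper describes.

```python
import heapq
from collections import defaultdict


def build_hybrid_index(tfidf, topic_score):
    """Build term -> list of postings (doc_id, tfidf_weight, topic_weight).

    Both inputs map (term, doc_id) -> float. The posting layout is a
    hypothetical one chosen for this sketch, not the paper's format.
    """
    index = defaultdict(list)
    for (term, doc_id), weight in tfidf.items():
        # A missing topic weight is treated as 0.0 (assumption).
        topic_weight = topic_score.get((term, doc_id), 0.0)
        index[term].append((doc_id, weight, topic_weight))
    return index


def top_k(index, query_terms, k, alpha=0.5):
    """Return the k highest-scoring (doc_id, score) pairs.

    Each posting contributes alpha * tfidf + (1 - alpha) * topic.
    This linear combination is an assumed scoring rule, and the scan
    visits every posting instead of pruning with upper bounds.
    """
    scores = defaultdict(float)
    for term in query_terms:
        for doc_id, tfidf_weight, topic_weight in index.get(term, []):
            scores[doc_id] += alpha * tfidf_weight + (1 - alpha) * topic_weight
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])
```

  Under these assumptions, setting `alpha=1.0` reduces the ranking to plain TF-IDF accumulation, which mirrors the abstract's point that topic information is layered on top of an existing inverted index. An incremental variant would instead leave the TF-IDF postings untouched and keep the topic weights in separate topic-keyed lists.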
Details
- Language :
- English
- ISSN :
- 09507051
- Volume :
- 89
- Database :
- Academic Search Index
- Journal :
- Knowledge-Based Systems
- Publication Type :
- Academic Journal
- Accession number :
- 110347820
- Full Text :
- https://doi.org/10.1016/j.knosys.2015.07.014