Research Paper Search Using a Topic-Based Boolean Query Search and a General Query-Based Ranking Model

Authors :
Yoichi Tomiura
Emi Ishita
Satoshi Fukuda
Source :
Lecture Notes in Computer Science ISBN: 9783030276171, DEXA (2)
Publication Year :
2019
Publisher :
Springer Nature, 2019.

Abstract

When conducting a search for research papers, the search should return comprehensive results related to the user’s query. In general, a user inputs a Boolean query that reflects the information need, and the search engine ranks the research papers based on the query. However, it is difficult to anticipate all possible terms that authors of relevant papers might have used. Moreover, general query-based ranking methods emphasize ranking the relevant documents at the top of the results, but they still require some means of guaranteeing the comprehensiveness of the results. Therefore, two ranking methods that consider the comprehensiveness of relevant papers are proposed. The first uses a topic-based Boolean query search. This search converts every word in the abstract set and the query into a topic via topic analysis by Latent Dirichlet Allocation (LDA) and conducts the search at the topic level. The topic assigned to synonyms of a search term is expected to be the same as that assigned to the search term itself. Each paper is ranked by the number of times it is matched across the topic-based Boolean query searches executed under various LDA parameter settings. The second is a hybrid method that emphasizes the better results from our topic-based ranking result and a general query-based ranking result. This method is based on the observation that the paper sets retrieved by our method and by a general ranking method will be different. Through experiments using the NTCIR-1 and -2 datasets, the effectiveness of our topic-based and hybrid methods is demonstrated.
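
The topic-based ranking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each LDA run has already been reduced to a hypothetical word-to-topic dictionary (a real LDA run assigns topics per word occurrence and would be produced by a topic-modeling library), and it assumes the Boolean query is an AND of OR-groups. The function names, the toy papers, and the two hand-written "runs" are all illustrative assumptions.

```python
from collections import Counter


def topic_boolean_match(doc_words, query_groups, word_topic):
    """Evaluate an AND-of-ORs Boolean query at the topic level.

    word_topic is a hypothetical word -> topic id mapping standing in for
    the output of one LDA run (an assumption made for this sketch).
    """
    doc_topics = {word_topic[w] for w in doc_words if w in word_topic}
    for group in query_groups:
        group_topics = {word_topic[t] for t in group if t in word_topic}
        # Every OR-group must share at least one topic with the paper.
        if not (group_topics & doc_topics):
            return False
    return True


def rank_by_match_count(docs, query_groups, word_topic_maps):
    """Rank papers by how many topic-level searches matched them."""
    counts = Counter({doc_id: 0 for doc_id in docs})
    for word_topic in word_topic_maps:  # one mapping per LDA setting
        for doc_id, words in docs.items():
            if topic_boolean_match(words, query_groups, word_topic):
                counts[doc_id] += 1
    # Highest match count first; ties broken by paper id for determinism.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy abstracts, already tokenized (illustrative data only).
docs = {
    "d1": ["automobile", "engine"],
    "d2": ["car", "engine"],
    "d3": ["banana", "recipe"],
}
# Boolean query: car AND engine.
query_groups = [["car"], ["engine"]]
# Two hand-written stand-ins for LDA runs with different parameters.
run_a = {"car": 0, "automobile": 0, "engine": 1, "banana": 2, "recipe": 2}
run_b = {"car": 0, "automobile": 1, "engine": 1, "banana": 2, "recipe": 2}

ranking = rank_by_match_count(docs, query_groups, [run_a, run_b])
print(ranking)  # [('d2', 2), ('d1', 1), ('d3', 0)]
```

In this toy data, paper d1 never contains the literal term "car", yet it is matched in the run where "automobile" shares a topic with "car", which is the synonym effect the abstract anticipates. Counting matches over several runs lowers d1 below d2 without excluding it.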

Details

Language :
English
ISBN :
978-3-030-27617-1
ISBNs :
9783030276171
Volume :
11707
Database :
OpenAIRE
Journal :
Lecture Notes in Computer Science
Accession number :
edsair.doi.dedup.....978a47747e880d62f9fac40788a074bc