THUIR at TREC 2009 Web Track: Finding Relevant and Diverse Results for Large Scale Web Search

Authors :
TSINGHUA UNIV BEIJING (CHINA) NATIONAL LAB FOR INFORMATION SCIENCE AND TECHNOLOGY
Li, Z. C.
Chen, F.
Xing, Q. L.
Miao, J. W.
Xue, Y. F.
Zhu, T.
Zhou, B.
Chen, R. W.
Liu, Y. Q.
Zhang, M.
Jin, Y.
Ma, S. P.
Source :
DTIC
Publication Year :
2009

Abstract

This is the 8th year that the IR group of Tsinghua University (THUIR) has participated in TREC. This year we focused on the Web Track, which contains two tasks: ad hoc and diversity. For the ad hoc task, we improved the efficiency of our distributed retrieval system, TMiner, so that it can handle terabytes of Web data. We then carried out three studies: page quality estimation, ranking feature analysis, and model comparison. For the diversity task, we proposed several new approaches to search strategy, user intention detection, and duplicate elimination. To mine the user's intention, we proposed and compared two strategies: 'searching + content-based diversity', which is a form of result clustering, and 'user based diverse intention prediction + searching', which is a form of query expansion.

Published in Proceedings of the Text REtrieval Conference (18th), TREC 2009, held in Gaithersburg, MD, 17-20 Nov 2009. The conference was co-sponsored by the National Institute of Standards and Technology (NIST), the Defense Advanced Research Projects Agency (DARPA), and the Advanced Research and Development Activity (ARDA). The original document contains color images.

Details

Database :
OAIster
Journal :
DTIC
Notes :
text/html, English
Publication Type :
Electronic Resource
Accession number :
edsoai.ocn832074104
Document Type :
Electronic Resource