Where to stop reading a ranked list? Threshold optimization using truncated score distributions

Authors :
Arampatzis, A.
Kamps, J.
Robertson, S.
Sanderson, M.
Zhai, C.
Zobel, J.
Allan, J.
Aslam, J.A.
Language and Computation (ILLC, FNWI/FGw)
Source :
Proceedings: 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval: SIGIR 2009, Boston, Massachusetts, July 19-23, 2009, pp. 524-531
Publication Year :
2009
Publisher :
ACM Press, 2009.

Abstract

Ranked retrieval has a particular disadvantage in comparison with traditional Boolean retrieval: there is no clear cut-off point at which to stop consulting results. This is a serious problem in some setups. We investigate and further develop methods to select the rank cut-off value which optimizes a given effectiveness measure. Assuming no other input than a system's output for a query (document scores and their distribution), the task is essentially a score-distributional threshold optimization problem. The recent trend in modeling score distributions is to use a normal-exponential mixture: normal for relevant, and exponential for non-relevant document scores. We discuss the two main theoretical problems with the current model, support incompatibility and non-convexity, and develop new models that address them. The main contributions of the paper are two truncated normal-exponential models, varying in the way the out-truncated score ranges are handled. We conduct a range of experiments using the TREC 2007 and 2008 Legal Track data, and show that the truncated models lead to significantly better results.
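
To make the threshold optimization idea concrete, here is a minimal sketch of picking a score threshold under the plain (untruncated) normal-exponential mixture that the abstract mentions. It is not the paper's truncated models, whose details are not given in this record. The choice of F1 as the effectiveness measure, the grid search, the function names, and all parameter values (`n_docs`, `frac_rel`, `mu`, `sigma`, `lam`) are assumptions made for illustration only.

```python
import math

def normal_sf(s, mu, sigma):
    """P(score > s) under a normal(mu, sigma) model for relevant documents."""
    return 0.5 * math.erfc((s - mu) / (sigma * math.sqrt(2.0)))

def exponential_sf(s, lam):
    """P(score > s) under an exponential(lam) model for non-relevant documents."""
    return math.exp(-lam * s) if s > 0 else 1.0

def expected_counts(s, n_docs, frac_rel, mu, sigma, lam):
    """Expected relevant / non-relevant documents scoring above threshold s."""
    rel_total = n_docs * frac_rel
    rel_above = rel_total * normal_sf(s, mu, sigma)
    nonrel_above = n_docs * (1.0 - frac_rel) * exponential_sf(s, lam)
    return rel_total, rel_above, nonrel_above

def expected_f1(s, **params):
    """F1 = 2*TP / (2*TP + FP + FN), with expected counts in place of observed ones."""
    rel_total, rel_above, nonrel_above = expected_counts(s, **params)
    denom = rel_above + nonrel_above + rel_total
    return 2.0 * rel_above / denom if denom > 0 else 0.0

def best_threshold(candidates, **params):
    """Grid search: return the candidate threshold with the highest expected F1."""
    return max(candidates, key=lambda s: expected_f1(s, **params))

# Hypothetical mixture parameters, not taken from the paper.
params = dict(n_docs=10_000, frac_rel=0.05, mu=3.0, sigma=1.0, lam=2.0)
grid = [i / 100 for i in range(0, 801)]  # candidate thresholds 0.00 .. 8.00

s_star = best_threshold(grid, **params)
_, rel_above, nonrel_above = expected_counts(s_star, **params)
rank_cutoff = round(rel_above + nonrel_above)  # expected number of documents above s_star
```

Under these assumed parameters the optimum falls strictly inside the grid: a very low threshold admits too many non-relevant documents, while a very high one discards most relevant ones. The expected number of documents above `s_star` gives a corresponding rank cut-off.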

Details

Database :
OpenAIRE
Journal :
Proceedings: 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval: SIGIR 2009, Boston, Massachusetts, July 19-23, 2009, pp. 524-531
Accession number :
edsair.narcis........c20f4365081f8a13ec7e846cc816a1c6