Where to stop reading a ranked list? Threshold optimization using truncated score distributions
- Source :
- Proceedings: 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval: SIGIR 2009, Boston, Massachusetts, July 19-23, 2009, 524-531
- Publication Year :
- 2009
- Publisher :
- ACM Press, 2009.
Abstract
- Ranked retrieval has a particular disadvantage in comparison with traditional Boolean retrieval: there is no clear cut-off point at which to stop consulting results. This is a serious problem in some setups. We investigate and further develop methods to select the rank cut-off value which optimizes a given effectiveness measure. Assuming no other input than a system's output for a query (document scores and their distribution), the task is essentially a score-distributional threshold optimization problem. The recent trend in modeling score distributions is to use a normal-exponential mixture: normal for relevant, and exponential for non-relevant document scores. We discuss the two main theoretical problems with the current model, support incompatibility and non-convexity, and develop new models that address them. The main contributions of the paper are two truncated normal-exponential models, varying in the way the out-truncated score ranges are handled. We conduct a range of experiments using the TREC 2007 and 2008 Legal Track data, and show that the truncated models lead to significantly better results.
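The threshold optimization task described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it assumes the mixture parameters have already been fitted by some other means, truncates both components to the observed score range using the textbook definition of a truncated distribution, and picks the rank that maximizes expected F1. The function names, the parameter names (`g` for the relevant-document mixing weight, `mu` and `sigma` for the normal, `lam` for the exponential rate), and the choice of F1 as the effectiveness measure are all assumptions made for illustration.

```python
import math


def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution, computed via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def rel_tail(t, mu, sigma, lo, hi):
    """P(score > t) for a normal truncated to [lo, hi] (requires lo < hi)."""
    t = min(max(t, lo), hi)
    mass = normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma)
    return (normal_cdf(hi, mu, sigma) - normal_cdf(t, mu, sigma)) / mass


def nonrel_tail(t, lam, lo, hi):
    """P(score > t) for an exponential truncated to [lo, hi] (requires lo < hi)."""
    t = min(max(t, lo), hi)
    mass = math.exp(-lam * lo) - math.exp(-lam * hi)
    return (math.exp(-lam * t) - math.exp(-lam * hi)) / mass


def expected_f1(t, n_docs, g, mu, sigma, lam, lo, hi):
    """Expected F1 when retrieving every document scoring above t."""
    rel_total = n_docs * g  # expected number of relevant documents
    tp = rel_total * rel_tail(t, mu, sigma, lo, hi)
    fp = n_docs * (1.0 - g) * nonrel_tail(t, lam, lo, hi)
    denom = rel_total + tp + fp  # equals 2*TP + FP + FN
    return 2.0 * tp / denom if denom > 0 else 0.0


def best_cutoff(scores, g, mu, sigma, lam):
    """Return (rank, expected F1) of the best cut-off in the ranked list."""
    ranked = sorted(scores, reverse=True)
    lo, hi = ranked[-1], ranked[0]  # hypothetical choice: truncate at observed range
    n_docs = len(ranked)
    best_rank, best_f1 = 0, 0.0
    for rank, score in enumerate(ranked, start=1):
        # Use the score at each rank as the candidate threshold.
        f1 = expected_f1(score, n_docs, g, mu, sigma, lam, lo, hi)
        if f1 > best_f1:
            best_rank, best_f1 = rank, f1
    return best_rank, best_f1
```

Calling `best_cutoff(scores, g=0.2, mu=4.0, sigma=0.5, lam=1.0)` on a list of retrieval scores returns the rank at which the assumed model expects F1 to peak. Any other effectiveness measure expressible in terms of expected true and false positives could be substituted for `expected_f1`.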
Details
- Database :
- OpenAIRE
- Journal :
- Proceedings: 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval: SIGIR 2009, Boston, Massachusetts, July 19-23, 2009, 524-531
- Accession number :
- edsair.narcis........c20f4365081f8a13ec7e846cc816a1c6