151. A Concise Integer Linear Programming Formulation for Implicit Search Result Diversification
- Author
-
Roi Blanco, Adam Jatowt, Fajie Yuan, Hideo Joho, Long Chen, Haitao Yu, and Joemon M. Jose
- Subjects
Computer science ,business.industry ,Process (engineering) ,Diversification (finance) ,02 engineering and technology ,Machine learning ,computer.software_genre ,Ranking (information retrieval) ,Development (topology) ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,Benchmark (computing) ,Key (cryptography) ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,Integer linear programming formulation ,computer ,Integer programming - Abstract
To cope with ambiguous and/or underspecified queries, search result diversification (SRD) is a key technique that has attracted a lot of attention. This paper focuses on implicit SRD, where the possible subtopics underlying a query are unknown beforehand. We formulate implicit SRD as a process of selecting and ranking k exemplar documents that utilizes integer linear programming (ILP). Unlike the common practice of relying on approximate methods, this formulation enables us to obtain the optimal solution of the objective function. Based on four benchmark collections, our extensive empirical experiments reveal that: (1) The factors, such as different initial runs, the number of input documents, query types and the ways of computing document similarity significantly affect the performance of diversification models. Careful examinations of these factors are highly recommended in the development of implicit SRD methods. (2) The proposed method can achieve substantially improved performance over the state-of-the-art unsupervised methods for implicit SRD.
- Published
- 2017
- Full Text
- View/download PDF