Result Diversification on Spatial, Multidimensional, Opinion, and Bibliographic Data
- Publication Year :
- 2013
Abstract
- Similarity search methods in the literature produce results ranked by their degree of similarity to the query. However, the results are often unsatisfactory, especially if the query is ambiguous or the search space includes redundant, near-duplicate documents. Many applications prefer diversity in query results, since diverse results can give a more complete view of the queried topic. In this study, we investigate the result diversification task in various application areas, such as opinion retrieval and paper recommendation, with different types of data, such as spatial and high-dimensional data, opinions, citation graphs, and other networks. Although the definition of diversity differs from field to field, we propose techniques that address the general objective of result diversification, which is to maximize the similarity of search results to the query while minimizing the pairwise similarity between the results, without neglecting efficiency.
For diversity on spatial and high-dimensional data, we draw an analogy with the concept of natural neighbors and propose geometric methods. We also introduce a diverse browsing method based on the popular distance browsing feature of R-tree index structures.
Next, we focus on search and retrieval of opinion data on certain entities, and start our analysis by looking at direct correlations between the sentiments of opinions and the demographics (e.g., gender, age, education level) of the people who generate those opinions. Based on the analysis, we argue that opinion diversity can be achieved by diversifying the sources of opinions.
Recommendation tasks on academic networks also suffer from the ambiguity and redundancy issues mentioned above. To observe those effects, we present a paper recommendation framework called theadvisor (http://theadvisor.osu.edu), which recommends new papers to researchers using only the reference-citation relationships between academic papers. We introduce a direction-awareness property to the recommendation process, which allows users to reach either old, foundational (possibly well-cited and well-known) research papers or recent (most likely less-known) ones. We also present different implementations and ordering techniques for reducing the query processing time. Finally, we enhance various result diversification techniques with the direction-awareness property for paper recommendation, propose new algorithms based on vertex selection and query refinement, and compare them in this domain.
Based on our findings on diversifying citation recommendations, we further extend the diversification of graph-based recommendation algorithms to other types of graphs, such as social and collaboration networks and web and product co-purchasing graphs. Although the diversification problem is understandably addressed as a bi-criteria optimization problem over relevance and diversity, the sufficiency of the evaluations of such methods is questionable, since a query-oblivious algorithm that returns most of its recommendations without considering the query may still perform the best on these commonly used measures. We show the deficiencies of commonly preferred evaluation techniques for diversification methods and propose a new measure called expanded relevance, which combines relevance and diversity. Finally, we present a novel algorithm that optimizes the expanded relevance of the diversified results.
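The general objective described in the abstract (high similarity to the query, low pairwise similarity among the results) can be illustrated with a generic greedy heuristic in the style of maximal marginal relevance. This is only a minimal sketch and not one of the dissertation's algorithms: the function names, the use of cosine similarity on plain vectors, and the `trade_off` parameter are all assumptions made for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_diversify(query, items, k, trade_off=0.5):
    """Greedily pick up to k item indices (hypothetical helper).

    Each step picks the candidate with the best weighted balance of
    relevance to the query and redundancy with already chosen items.
    trade_off = 1.0 ranks by relevance only; lower values favor diversity.
    """
    selected = []
    candidates = list(range(len(items)))
    while candidates and len(selected) < k:

        def score(i):
            relevance = cosine(query, items[i])
            # Redundancy: highest similarity to anything already selected.
            redundancy = max(
                (cosine(items[i], items[j]) for j in selected), default=0.0
            )
            return trade_off * relevance - (1 - trade_off) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    query = (1.0, 0.0)
    items = [(1.0, 0.0), (1.0, 0.0), (0.6, 0.8)]  # items 0 and 1 are duplicates
    print(greedy_diversify(query, items, k=2, trade_off=0.3))
    print(greedy_diversify(query, items, k=2, trade_off=1.0))
```

In this toy input, a low `trade_off` should skip the duplicate of the first pick in favor of the less similar third item, while `trade_off=1.0` reduces to plain relevance ranking. Such a greedy pass costs on the order of k passes over the candidates, so it says nothing about the efficiency concerns the abstract raises for indexed spatial or graph data.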
Details
- Language :
- English
- Database :
- OpenDissertations
- Publication Type :
- Dissertation/ Thesis
- Accession number :
- ddu.oai.etd.ohiolink.edu.osu1374148621