A Transformation-Based Framework for KNN Set Similarity Search.
- Source :
- IEEE Transactions on Knowledge & Data Engineering. Mar 2020, Vol. 32 Issue 3, p409-423. 15p.
- Publication Year :
- 2020
Abstract
- Set similarity search is a fundamental operation in a variety of applications. While many previous studies focus on threshold based set similarity search and join, few efforts have been paid for KNN set similarity search. In this paper, we propose a transformation based framework to solve the problem of KNN set similarity search, which given a collection of set records and a query set, returns $k$ results with the largest similarity to the query. We devise an effective transformation mechanism to transform sets with various lengths to fixed length vectors which can map similar sets closer to each other. Then, we index such vectors with a tiny tree structure. Next, we propose efficient search algorithms and pruning strategies to perform exact KNN set similarity search. We also design an estimation technique by leveraging the data distribution to support approximate KNN search, which can speed up the search while retaining high recall. Experimental results on real world datasets show that our framework significantly outperforms state-of-the-art methods in both memory and disk based settings. [ABSTRACT FROM AUTHOR]
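To make the problem statement in the abstract concrete, here is a minimal brute-force sketch of KNN set similarity search. It is not the transformation-based framework the paper proposes, only the naive baseline that scans every record. The abstract does not name a similarity function, so Jaccard similarity is an assumption here, and the names `jaccard` and `knn_bruteforce` are hypothetical.

```python
import heapq


def jaccard(a, b):
    """Jaccard similarity |a ∩ b| / |a ∪ b| (assumed metric; 1.0 if both sets are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def knn_bruteforce(records, query, k):
    """Return the k (similarity, record_index) pairs most similar to `query`.

    Scans every record, so each query costs one similarity computation per
    record; no index or pruning is used.
    """
    scored = ((jaccard(rec, query), idx) for idx, rec in enumerate(records))
    return heapq.nlargest(k, scored)


# Illustrative data, not taken from the paper.
records = [{1, 2, 3}, {2, 3, 4}, {7, 8}, {1, 2, 3, 4}]
top2 = knn_bruteforce(records, {1, 2, 3}, k=2)
print(top2)  # [(1.0, 0), (0.75, 3)]
```

A full scan like this touches every record for each query, which is the cost that indexing and pruning strategies such as those described in the abstract aim to avoid.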
Details
- Language :
- English
- ISSN :
- 10414347
- Volume :
- 32
- Issue :
- 3
- Database :
- Academic Search Index
- Journal :
- IEEE Transactions on Knowledge & Data Engineering
- Publication Type :
- Academic Journal
- Accession number :
- 141599657
- Full Text :
- https://doi.org/10.1109/TKDE.2018.2886189