A novel approach for high-dimensional vector similarity join query.

Authors :
Ma, Youzhong
Jia, Shijie
Zhang, Yongxin
Source :
Concurrency & Computation: Practice & Experience; 3/10/2017, Vol. 29 Issue 5, pn/a-N.PAG, 12p
Publication Year :
2017

Abstract

SUMMARY In this paper, we focus on similarity joins over massive sets of high-dimensional vectors, a costly operation because of the curse of dimensionality. The main idea of our proposed solution is to design a novel filtering method that filters out as many vector pairs as possible at relatively low cost. Exploiting the dimension-reduction ability of the piecewise aggregate approximation (PAA) and symbolic aggregate approximation (SAX) techniques, we propose three novel approaches to the high-dimensional vector similarity join query: single-PAA-based vector similarity join query, multi-PAA-based vector similarity join query, and SAX-based vector similarity join query. We conducted comprehensive experiments to test the performance of these approaches and measured their speedup ratio relative to the naive method, block nested loop join. The experimental results show that our approaches perform much better than block nested loop join and also scale well. To the best of our knowledge, this is the first work to address high-dimensional vector similarity joins using piecewise aggregate approximation and symbolic aggregate approximation techniques, and the approaches proposed in this paper provide a new way to process massive high-dimensional vector data sets. [ABSTRACT FROM AUTHOR]
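
The filter-then-verify idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes a Euclidean distance threshold `eps`, vectors whose length is divisible by the number of PAA segments, and it relies on the standard PAA lower bound: sqrt(n/w) times the distance between two w-segment PAA sketches never exceeds the true distance between the original n-dimensional vectors. The function names (`paa`, `paa_lower_bound`, `similarity_join`, `brute_force_join`) are hypothetical, chosen for illustration only.

```python
import math
import random


def paa(vec, segments):
    """Piecewise aggregate approximation: the mean of each equal-length segment.

    Assumption: len(vec) is divisible by `segments`.
    """
    n = len(vec)
    if n % segments != 0:
        raise ValueError("vector length must be divisible by the segment count")
    size = n // segments
    return [sum(vec[k * size:(k + 1) * size]) / size for k in range(segments)]


def paa_lower_bound(sketch_a, sketch_b, n):
    """Lower bound on the Euclidean distance of the original n-dim vectors."""
    w = len(sketch_a)
    return math.sqrt(n / w) * math.dist(sketch_a, sketch_b)


def similarity_join(vectors, eps, segments):
    """Self-join: return index pairs (i, j), i < j, with distance <= eps.

    Pairs whose PAA lower bound already exceeds eps are discarded without
    computing the full distance. Also returns how many pairs were verified.
    """
    n = len(vectors[0])
    sketches = [paa(v, segments) for v in vectors]
    result, verified = [], 0
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if paa_lower_bound(sketches[i], sketches[j], n) > eps:
                continue  # filtered out cheaply in the reduced space
            verified += 1
            if math.dist(vectors[i], vectors[j]) <= eps:
                result.append((i, j))
    return result, verified


def brute_force_join(vectors, eps):
    """Naive baseline: full distance for every pair (a nested-loop join)."""
    return [(i, j)
            for i in range(len(vectors))
            for j in range(i + 1, len(vectors))
            if math.dist(vectors[i], vectors[j]) <= eps]


if __name__ == "__main__":
    random.seed(0)
    data = [[random.random() for _ in range(16)] for _ in range(40)]
    pairs, verified = similarity_join(data, eps=1.0, segments=4)
    print(len(pairs), "result pairs;", verified, "pairs needed full verification")
```

Because the bound never overestimates the true distance, the filter discards only pairs that cannot qualify, so the filtered join returns the same pairs as the naive baseline while computing fewer full-dimensional distances. A SAX-based variant would replace the real-valued segment means with discrete symbols; that is not shown here.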

Details

Language :
English
ISSN :
15320626
Volume :
29
Issue :
5
Database :
Complementary Index
Journal :
Concurrency & Computation: Practice & Experience
Publication Type :
Academic Journal
Accession number :
121236080
Full Text :
https://doi.org/10.1002/cpe.3952