Efficient Approximate Maximum Inner Product Search over Sparse Vectors
- Publication Year :
- 2024
Abstract
- The maximum inner product search (MIPS) problem in high-dimensional vector spaces has many applications, driven largely by the success of deep neural network-based embedding models. MIPS methods designed for dense vectors using approximate techniques such as locality-sensitive hashing (LSH) have been well studied, but they are neither efficient nor effective for searching sparse vectors, owing to the near-orthogonality among sparse vectors. Existing solutions to MIPS over sparse vectors rely heavily on inverted lists, resulting in poor query efficiency, particularly on large-scale sparse datasets. In this paper, we introduce SOSIA, a novel framework specifically tailored to address these limitations. To handle sparsity, we propose the SOS transformation, which converts sparse vectors into a binary space while providing an unbiased estimator of the inner product between any two vectors. Additionally, we develop a minHash-based index to enhance query efficiency. We provide a theoretical analysis of the query quality of SOSIA and present extensive experiments on real-world sparse datasets to validate its effectiveness. The experimental results demonstrate its superior performance in query efficiency and accuracy compared to existing methods.
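- The abstract does not detail the SOS transformation or the index structure. As background on the minHash technique it builds on, the sketch below (with hypothetical helper names; not the paper's method) estimates the Jaccard similarity of two binary vectors represented by their sets of nonzero indices. For binary vectors, the inner product equals the intersection size, so a Jaccard estimate directly yields an inner-product estimate via |A ∩ B| = J(A, B) · |A ∪ B|.

```python
import random

PRIME = (1 << 61) - 1  # large prime modulus for the affine hash family

def make_hash_funcs(k, seed=0):
    """Create k random affine hash functions h(x) = (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    params = [(rng.randrange(1, PRIME), rng.randrange(PRIME)) for _ in range(k)]
    return [lambda x, a=a, b=b: (a * x + b) % PRIME for a, b in params]

def minhash_signature(elements, hash_funcs):
    """Signature slot i = minimum hash value of the set under hash i."""
    return [min(h(x) for x in elements) for h in hash_funcs]

def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching slots is an unbiased Jaccard estimate."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)

# Example: two binary vectors given by their nonzero index sets.
A = set(range(0, 100))    # nonzeros of vector a
B = set(range(50, 150))   # nonzeros of vector b
hs = make_hash_funcs(512, seed=42)
est = estimate_jaccard(minhash_signature(A, hs), minhash_signature(B, hs))
# True Jaccard = |A ∩ B| / |A ∪ B| = 50 / 150 ≈ 0.333; est concentrates near it.
```

  With k signature slots the estimator's standard deviation is roughly sqrt(J(1 − J)/k), so larger signatures trade memory for accuracy.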
Details
- Database :
- OAIster
- Notes :
- English
- Publication Type :
- Electronic Resource
- Accession number :
- edsoai.on1452723109
- Document Type :
- Electronic Resource