AnalyticDB-V

Authors :
Feifei Li
Cai Yuanzhe
Lou Renjie
Sheng Wang
Chuangxian Wei
Chaoqun Zhan
Bin Wu
Source :
Proceedings of the VLDB Endowment. 13:3152-3165
Publication Year :
2020
Publisher :
Association for Computing Machinery (ACM), 2020.

Abstract

With the explosive growth of unstructured data (such as images, video, and audio), unstructured data analytics has become widespread across real-world applications. Many database systems have started to incorporate unstructured data analysis to meet such demands. However, most systems treat queries over unstructured and structured data as disjoint tasks, and hybrid queries (i.e., those involving both data types) are not yet fully supported. In this paper, we present a hybrid analytic engine developed at Alibaba, named AnalyticDB-V (ADBV), to fulfill such emerging demands. ADBV offers an interface that enables users to express hybrid queries using SQL semantics by converting unstructured data to high-dimensional vectors. ADBV adopts the lambda framework and leverages the merits of approximate nearest neighbor search (ANNS) techniques to support hybrid data analytics. Moreover, a novel ANNS algorithm is proposed to improve the accuracy on large-scale vectors representing massive unstructured data. All ANNS algorithms are implemented as physical operators in ADBV, and accuracy-aware cost-based optimization techniques are proposed to identify effective execution plans. Experimental results on both public and in-house datasets demonstrate the superior performance and effectiveness of ADBV. ADBV has been successfully deployed on Alibaba Cloud to provide hybrid query processing services for various real-world applications.
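
For illustration only, the idea of a hybrid query described in the abstract (a structured predicate combined with a vector similarity search) can be sketched as follows. This is a minimal toy, not ADBV's implementation: it uses an exact brute-force nearest-neighbor scan as a stand-in for ANNS, and the table contents, column names, and the `hybrid_query` function are all hypothetical.

```python
import heapq
import math

# Hypothetical toy table: structured columns plus an embedding vector per row.
rows = [
    {"id": 1, "category": "shoes", "price": 40.0, "vec": [0.0, 0.0]},
    {"id": 2, "category": "shoes", "price": 90.0, "vec": [0.1, 0.0]},
    {"id": 3, "category": "bags",  "price": 30.0, "vec": [0.0, 0.1]},
    {"id": 4, "category": "shoes", "price": 35.0, "vec": [1.0, 1.0]},
    {"id": 5, "category": "shoes", "price": 20.0, "vec": [0.2, 0.2]},
]

def hybrid_query(rows, predicate, query_vec, k):
    """Return the ids of the k rows closest to query_vec among rows passing predicate.

    Exact brute-force scan (assumption: a stand-in for an ANNS operator).
    """
    # Structured part: keep only rows that satisfy the predicate.
    candidates = (r for r in rows if predicate(r))
    # Vector part: rank surviving rows by Euclidean distance to the query vector.
    nearest = heapq.nsmallest(
        k, candidates, key=lambda r: math.dist(r["vec"], query_vec)
    )
    return [r["id"] for r in nearest]

# Roughly: SELECT id FROM rows WHERE category = 'shoes' AND price < 50
#          ORDER BY distance(vec, [0, 0]) LIMIT 2
result = hybrid_query(
    rows,
    lambda r: r["category"] == "shoes" and r["price"] < 50.0,
    [0.0, 0.0],
    2,
)
print(result)
```

The sketch filters first and ranks afterwards; whether to filter before or after the vector search is exactly the kind of plan choice that the abstract attributes to cost-based optimization.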

Details

ISSN :
2150-8097
Volume :
13
Database :
OpenAIRE
Journal :
Proceedings of the VLDB Endowment
Accession number :
edsair.doi...........ed5186fa7fc2783f0e9bea34b5aaa405