Back to Search Start Over

A Learned Sketch for Subgraph Counting

Authors :
Yu Rong
Jeffrey Xu Yu
Kangfei Zhao
Hao Zhang
Qiyan Li
Source :
SIGMOD Conference
Publication Year :
2021
Publisher :
ACM, 2021.

Abstract

Subgraph counting, as a fundamental problem in network analysis, is to count the number of subgraphs in a data graph that match a given query graph by either homomorphism or subgraph isomorphism. The importance of subgraph counting derives from the fact that it provides insights of a large graph, in particular a labeled graph, when a collection of query graphs with different sizes and labels are issued. The problem of counting is challenging. On one hand, exact counting by enumerating subgraphs is NP-hard. % On the other hand, approximate counting by subgraph isomorphism can only support 3/5-node query graphs over unlabeled graphs. % Another way for subgraph counting is to specify it as an \SQL query and estimate the cardinality of the query in \rdbm. Existing approaches for cardinality estimation can only support subgraph counting by homomorphism up to some extent, as it is difficult to deal with sampling failure when a query graph becomes large. A question that arises is if subgraph counting can be supported by machine learning (ML) and deep learning (DL). The existing DL approach for subgraph isomorphism can only support small data graphs. The ML/DL approaches proposed in \rdbm context for approximate query processing and cardinality estimation cannot be used, as subgraph counting is to do complex self-joins over one relation, whereas existing approaches focus on multiple relations. In this paper, we propose an Active Learned Sketch for Subgraph Counting (\ALSS) with two main components: a sketch learned (ŁSS) and an active learner (\AL). The sketch is learned by a neural network regression model, and the active learner is to perform model updates based on new arrival test query graphs. % We conduct extensive experimental studies to confirm the effectiveness and efficiency of \ALSS using large real labeled graphs. Moreover, we show that \ALSS can assist query optimizers to find a better query plan for complex multi-way self-joins.

Details

Database :
OpenAIRE
Journal :
Proceedings of the 2021 International Conference on Management of Data
Accession number :
edsair.doi...........739e13299d9da5e8f93cc7346520a640
Full Text :
https://doi.org/10.1145/3448016.3457289