Back to Search
Start Over
A Learned Sketch for Subgraph Counting
- Source :
- SIGMOD Conference
- Publication Year :
- 2021
- Publisher :
- ACM, 2021.
-
Abstract
- Subgraph counting, as a fundamental problem in network analysis, is to count the number of subgraphs in a data graph that match a given query graph by either homomorphism or subgraph isomorphism. The importance of subgraph counting derives from the fact that it provides insights of a large graph, in particular a labeled graph, when a collection of query graphs with different sizes and labels are issued. The problem of counting is challenging. On one hand, exact counting by enumerating subgraphs is NP-hard. % On the other hand, approximate counting by subgraph isomorphism can only support 3/5-node query graphs over unlabeled graphs. % Another way for subgraph counting is to specify it as an \SQL query and estimate the cardinality of the query in \rdbm. Existing approaches for cardinality estimation can only support subgraph counting by homomorphism up to some extent, as it is difficult to deal with sampling failure when a query graph becomes large. A question that arises is if subgraph counting can be supported by machine learning (ML) and deep learning (DL). The existing DL approach for subgraph isomorphism can only support small data graphs. The ML/DL approaches proposed in \rdbm context for approximate query processing and cardinality estimation cannot be used, as subgraph counting is to do complex self-joins over one relation, whereas existing approaches focus on multiple relations. In this paper, we propose an Active Learned Sketch for Subgraph Counting (\ALSS) with two main components: a sketch learned (ŁSS) and an active learner (\AL). The sketch is learned by a neural network regression model, and the active learner is to perform model updates based on new arrival test query graphs. % We conduct extensive experimental studies to confirm the effectiveness and efficiency of \ALSS using large real labeled graphs. Moreover, we show that \ALSS can assist query optimizers to find a better query plan for complex multi-way self-joins.
- Subjects :
- Small data
Theoretical computer science
Relation (database)
Computer science
business.industry
Deep learning
Subgraph isomorphism problem
Context (language use)
Query plan
Homomorphism
Cardinality (SQL statements)
Artificial intelligence
business
Computer Science::Databases
MathematicsofComputing_DISCRETEMATHEMATICS
Subjects
Details
- Database :
- OpenAIRE
- Journal :
- Proceedings of the 2021 International Conference on Management of Data
- Accession number :
- edsair.doi...........739e13299d9da5e8f93cc7346520a640
- Full Text :
- https://doi.org/10.1145/3448016.3457289