Back to Search Start Over

When Subgraph Isomorphism is Really Hard, and Why This Matters for Graph Databases

Authors :
Patrick Prosser
Christine Solnon
Ciaran McCreesh
James Trimble
University of Glasgow
Geometry Processing and Constrained Optimization (M2DisCo)
Laboratoire d'InfoRmatique en Image et Systèmes d'information (LIRIS)
Institut National des Sciences Appliquées de Lyon (INSA Lyon)
Institut National des Sciences Appliquées (INSA)-Université de Lyon-Institut National des Sciences Appliquées (INSA)-Université de Lyon-Centre National de la Recherche Scientifique (CNRS)-Université Claude Bernard Lyon 1 (UCBL)
Université de Lyon-École Centrale de Lyon (ECL)
Université de Lyon-Université Lumière - Lyon 2 (UL2)-Institut National des Sciences Appliquées de Lyon (INSA Lyon)
Université de Lyon-Université Lumière - Lyon 2 (UL2)
ANR-13-BS02-0002,SoLStiCe,Similarités entre données localement structurées pour la vision par ordinateur(2013)
Source :
Journal of Artificial Intelligence Research, Journal of Artificial Intelligence Research, Association for the Advancement of Artificial Intelligence, 2018, 61, pp.723-759. ⟨10.1613/jair.5768⟩
Publication Year :
2018
Publisher :
HAL CCSD, 2018.

Abstract

International audience; The subgraph isomorphism problem involves deciding whether a copy of a pattern graph occurs inside a larger target graph. The non-induced version allows extra edges in the target, whilst the induced version does not. Although both variants are NP-complete, algorithms inspired by constraint programming can operate comfortably on many real-world problem instances with thousands of vertices. However, they cannot handle arbitrary instances of this size. We show how to generate " really hard " random instances for subgraph isomorphism problems, which are computationally challenging with a couple of hundred vertices in the target, and only twenty pattern vertices. For the non-induced version of the problem, these instances lie on a satisfiable / unsatisfiable phase transition, whose location we can predict; for the induced variant, much richer behaviour is observed, and constrained-ness gives a better measure of difficulty than does proximity to a phase transition. These results have practical consequences: we explain why the widely researched " filter / verify " indexing technique used in graph databases is founded upon a misunderstanding of the empirical hardness of NP-complete problems, and cannot be beneficial when paired with any reasonable subgraph isomorphism algorithm.

Details

Language :
English
ISSN :
10769757
Database :
OpenAIRE
Journal :
Journal of Artificial Intelligence Research, Journal of Artificial Intelligence Research, Association for the Advancement of Artificial Intelligence, 2018, 61, pp.723-759. ⟨10.1613/jair.5768⟩
Accession number :
edsair.doi.dedup.....9e7fdd0e4681add4d59310c43b8a4316
Full Text :
https://doi.org/10.1613/jair.5768⟩