Efficient access methods for very large distributed graph databases
- Source :
- Minerva: Repositorio Institucional de la Universidad de Santiago de Compostela, Universidad de Santiago de Compostela (USC)
- Publication Year :
- 2021
- Publisher :
- Elsevier BV, 2021.
Abstract
- Subgraph searching is an essential problem in graph databases, but it is challenging because it involves subgraph isomorphism, an NP-complete problem. Filter-Then-Verify (FTV) methods mitigate this overhead by using an index in a filtering stage to prune graphs that cannot match the query, which reduces the number of subgraph isomorphism evaluations in the subsequent verification stage. In real applications such as molecular substructure searching, subgraph searching has to be applied to very large databases (tens of millions of graphs). Previous surveys have identified the FTV solutions GraphGrepSX (GGSX) and CT-Index as the best ones for large databases (thousands of graphs); however, they cannot reach reasonable performance on very large ones (tens of millions of graphs). This paper proposes a generic approach for the distributed implementation of FTV solutions. In addition, three previous methods that improve the performance of GGSX and CT-Index are adapted for execution in clusters. The evaluation shows that the resulting solutions reduce filtering time by 70% to 90% in a centralized configuration and that they can be used to achieve efficient subgraph searching over very large databases in cluster configurations.
- This work has been co-funded by the Ministerio de Economía y Competitividad of the Spanish government and by Mestrelab Research S.L. through the project NEXTCHROM (RTC-2015-3812-2) of the Retos-Colaboración call of the Programa Estatal de Investigación, Desarrollo e Innovación Orientada a los Retos de la Sociedad. The authors also acknowledge the financial support provided by Xunta de Galicia under the Project ED431B 2018/28 SI
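The Filter-Then-Verify scheme summarized in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, nor the actual GGSX or CT-Index code: the graph representation, the path-count features, the `max_len` parameter, and every function name here are assumptions chosen for illustration. The filter keeps only graphs whose label-path counts dominate those of the query (a necessary condition for containment), and the verifier runs a plain backtracking subgraph isomorphism test on the survivors.

```python
from collections import Counter

def make_graph(labels, edges):
    """Build an undirected labeled graph as (labels, adjacency sets)."""
    adj = {v: set() for v in labels}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return labels, adj

def path_features(graph, max_len):
    """Count label sequences of simple paths with up to max_len vertices."""
    labels, adj = graph
    feats = Counter()

    def walk(path):
        feats[tuple(labels[v] for v in path)] += 1
        if len(path) < max_len:
            for n in adj[path[-1]]:
                if n not in path:
                    walk(path + [n])

    for v in labels:
        walk([v])
    return feats

def build_index(database, max_len):
    """Index stage: precompute the features of every database graph."""
    return {gid: path_features(g, max_len) for gid, g in database.items()}

def is_subgraph(query, target):
    """Verification: backtracking search for a label-preserving injective
    mapping of query vertices that preserves every query edge."""
    q_labels, q_adj = query
    t_labels, t_adj = target
    order = list(q_labels)

    def extend(mapping):
        if len(mapping) == len(order):
            return True
        u = order[len(mapping)]
        for v in t_labels:
            if v in mapping.values() or t_labels[v] != q_labels[u]:
                continue
            # Every already-mapped neighbour of u must be adjacent to v.
            if all(mapping[w] in t_adj[v] for w in q_adj[u] if w in mapping):
                mapping[u] = v
                if extend(mapping):
                    return True
                del mapping[u]
        return False

    return extend({})

def search(query, database, index, max_len):
    """Filter with the index, then verify only the surviving candidates."""
    q_feats = path_features(query, max_len)
    candidates = [gid for gid, feats in index.items()
                  if all(feats[f] >= c for f, c in q_feats.items())]
    answers = [gid for gid in candidates if is_subgraph(query, database[gid])]
    return candidates, answers

# Toy database of carbon-only graphs (hypothetical data).
C3 = {0: "C", 1: "C", 2: "C"}
C4 = {0: "C", 1: "C", 2: "C", 3: "C"}
database = {
    "chain3": make_graph(C3, [(0, 1), (1, 2)]),
    "chain4": make_graph(C4, [(0, 1), (1, 2), (2, 3)]),
    "ring3": make_graph(C3, [(0, 1), (1, 2), (0, 2)]),
}
query = make_graph(C3, [(0, 1), (1, 2), (0, 2)])  # a triangle

index = build_index(database, max_len=2)
candidates, answers = search(query, database, index, max_len=2)
print(candidates)  # ['chain4', 'ring3']
print(answers)     # ['ring3']
```

In this toy run the filter discards `chain3` without any isomorphism test, while `chain4` passes the filter as a false positive and is rejected only in verification. Longer paths (a larger `max_len`) would make the filter more selective at the cost of a larger index.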
- Subjects :
- Graph databases
Information Systems and Management
Theoretical computer science
Graph query processing
Computer science
Subgraph isomorphism problem
Access method
02 engineering and technology
computer.software_genre
Theoretical Computer Science
Reduction (complexity)
Artificial Intelligence
Subgraph isomorphism
0202 electrical engineering, electronic engineering, information engineering
Subgraph search
Graph database
05 social sciences
050301 education
Graph indexing
Computer Science Applications
Index (publishing)
Control and Systems Engineering
Large scale processing
020201 artificial intelligence & image processing
Performance improvement
0503 education
computer
Software
MathematicsofComputing_DISCRETEMATHEMATICS
Details
- ISSN :
- 0020-0255
- Volume :
- 573
- Database :
- OpenAIRE
- Journal :
- Information Sciences
- Accession number :
- edsair.doi.dedup.....3d6d6c90af69b792c42f7c1b55f8b40b
- Full Text :
- https://doi.org/10.1016/j.ins.2021.05.047