Enumerating common molecular substructures

Authors :
Daan P. Geerke
Mohammed El-Kebir
Alan E. Mark
Jelmer Mulder
Gunnar W. Klau
Martin S. Engler
Publication Year :
2017
Publisher :
PeerJ, 2017.

Abstract

Finding and enumerating common molecular substructures is an important task in cheminformatics, where small molecules are often modeled as molecular graphs. We introduce the problem of enumerating all maximal k-common molecular fragments of a pair of molecular graphs. A k-common fragment is a common connected induced subgraph that consists of a common core and a common k-neighborhood. It is thus a generalization of the NP-hard task of enumerating all maximal common connected induced subgraphs (MCCIS) of two graphs, which corresponds to the k = 0 case. We extend the MCCIS enumeration algorithm by Ina Koch and apply algorithm engineering techniques to quickly solve practical instances of the general k > 0 case, which is relevant, for example, for automatically generating force field topologies for molecular dynamics (MD) simulations. We find that our methods achieve good performance on a real-world benchmark of all-against-all comparisons of 255 molecules. Our software is available under the LGPL open source license at https://github.com/enitram/mogli.
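
To make the k = 0 case concrete, the following is a minimal brute-force sketch of MCCIS enumeration for two tiny atom-labeled graphs. It is an illustration only: it is not the authors' algorithm and does not use the mogli API, and it does not cover k-neighborhoods. The function name, the input format (a label list plus an edge list per molecule), and the example molecules are all assumptions made for this sketch. It pairs atoms with equal labels, keeps sets of pairs that preserve both bonds and non-bonds (the induced condition), requires the matched atoms to be connected, and then discards every set contained in a larger one. The subset search is exponential, so it is only usable on very small inputs.

```python
from itertools import combinations


def maximal_common_fragments(labels1, edges1, labels2, edges2):
    """Brute-force MCCIS enumeration (hypothetical helper, tiny graphs only).

    Each result is a frozenset of (atom_in_graph1, atom_in_graph2) pairs.
    """
    bonds1 = {frozenset(e) for e in edges1}
    bonds2 = {frozenset(e) for e in edges2}

    # Candidate atom pairs: same element label in both molecules.
    pairs = [(u, v)
             for u in range(len(labels1))
             for v in range(len(labels2))
             if labels1[u] == labels2[v]]

    def compatible(p, q):
        # Two pairs can coexist if they use distinct atoms on both sides and
        # agree on bonded vs. not bonded (induced subgraph condition).
        (u1, v1), (u2, v2) = p, q
        if u1 == u2 or v1 == v2:
            return False
        return (frozenset((u1, u2)) in bonds1) == (frozenset((v1, v2)) in bonds2)

    def connected(subset):
        # Depth-first search over bonds of graph 1; for pairwise compatible
        # pairs, graph 2 has the same bonds between the matched atoms.
        subset = list(subset)
        seen, stack = {subset[0]}, [subset[0]]
        while stack:
            p = stack.pop()
            for q in subset:
                if q not in seen and frozenset((p[0], q[0])) in bonds1:
                    seen.add(q)
                    stack.append(q)
        return len(seen) == len(subset)

    found = []
    for size in range(1, len(pairs) + 1):
        for subset in combinations(pairs, size):
            if all(compatible(p, q) for p, q in combinations(subset, 2)) \
                    and connected(subset):
                found.append(frozenset(subset))

    # Keep only fragments that are not strictly contained in another one.
    return [f for f in found if not any(f < g for g in found)]


# Example (assumed toy input): C-C-O compared against C-O.
result = maximal_common_fragments(["C", "C", "O"], [(0, 1), (1, 2)],
                                  ["C", "O"], [(0, 1)])
```

In this toy example the C-O bond of the first molecule maps onto the second molecule, and the remaining carbon forms a separate single-atom fragment because matching it would violate the induced condition.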

Details

Language :
English
Database :
OpenAIRE
Accession number :
edsair.doi.dedup.....c371c795d2ca2c4f52e465a84c1ff36a
Full Text :
https://doi.org/10.7287/peerj.preprints.3250