Enumerating common molecular substructures

Martin S. Engler, Mohammed El-Kebir, Jelmer Mulder, Alan E. Mark, Daan P. Geerke, Gunnar W. Klau

Research output: Contribution to conferencePaperpeer-review

Abstract

Finding and enumerating common molecular substructures is an important task in cheminformatics, where small molecules are often modeled as molecular graphs. We introduce the problem of enumerating all maximal k-common molecular fragments of a pair of molecular graphs. A k-common fragment is a common connected induced subgraph that consists of a common core and a common k-neighborhood. It is thus a generalization of the NP-hard task to enumerate all maximal common connected induced subgraphs (MCCIS) of two graphs, which corresponds to the k = 0 case. We extend the MCCIS enumeration algorithm by Ina Koch and apply algorithm engineering techniques to solve practical instances fast for the general k > 0 case, which is relevant, for example, for automatically generating force field topologies for molecular dynamics (MD) simulations. We find that our methods achieve good performance on a real-world benchmark of all-against-all comparisons of 255 molecules. Our software is available under the LGPL open source license at https://github.com/enitram/mogli.

Original languageEnglish (US)
DOIs
StatePublished - 2017
Externally publishedYes
Event2017 German Conference on Bioinformatics, GCB 2017 - Tubingen, Germany
Duration: Sep 18 2017Sep 21 2017

Conference

Conference2017 German Conference on Bioinformatics, GCB 2017
Country/TerritoryGermany
CityTubingen
Period9/18/179/21/17

Keywords

  • molecular dynamics simulations
  • molecular graphs
  • subgraph enumeration

ASJC Scopus subject areas

  • Information Systems
  • Biomedical Engineering

Fingerprint

Dive into the research topics of 'Enumerating common molecular substructures'. Together they form a unique fingerprint.

Cite this