Abstract
Finding and enumerating common molecular substructures is an important task in cheminformatics, where small molecules are often modeled as molecular graphs. We introduce the problem of enumerating all maximal k-common molecular fragments of a pair of molecular graphs. A k-common fragment is a common connected induced subgraph that consists of a common core and a common k-neighborhood. It is thus a generalization of the NP-hard task to enumerate all maximal common connected induced subgraphs (MCCIS) of two graphs, which corresponds to the k = 0 case. We extend the MCCIS enumeration algorithm by Ina Koch and apply algorithm engineering techniques to solve practical instances fast for the general k > 0 case, which is relevant, for example, for automatically generating force field topologies for molecular dynamics (MD) simulations. We find that our methods achieve good performance on a real-world benchmark of all-against-all comparisons of 255 molecules. Our software is available under the LGPL open source license at https://github.com/enitram/mogli.
Original language | English (US) |
---|---|
DOIs | |
State | Published - 2017 |
Externally published | Yes |
Event | 2017 German Conference on Bioinformatics, GCB 2017 - Tubingen, Germany Duration: Sep 18 2017 → Sep 21 2017 |
Conference
Conference | 2017 German Conference on Bioinformatics, GCB 2017 |
---|---|
Country/Territory | Germany |
City | Tubingen |
Period | 9/18/17 → 9/21/17 |
Keywords
- molecular dynamics simulations
- molecular graphs
- subgraph enumeration
ASJC Scopus subject areas
- Information Systems
- Biomedical Engineering