TY - GEN
T1 - CORE
T2 - 34th AAAI Conference on Artificial Intelligence, AAAI 2020
AU - Fu, Tianfan
AU - Xiao, Cao
AU - Sun, Jimeng
N1 - Publisher Copyright: Copyright © 2020, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2020
Y1 - 2020
N2 - Molecule optimization aims to generate a molecule Y with more desirable properties from an input molecule X. State-of-the-art approaches partition molecules into a large set of substructures S and grow the new molecule by iteratively predicting which substructure from S to add. However, because S is large, this iterative prediction is often inaccurate, especially for substructures that are infrequent in the training data. To address this challenge, we propose a new generation strategy called “Copy&Refine” (CORE): at each step, the generator first decides whether to copy an existing substructure from the input X or to generate a new substructure, and then adds the most promising substructure to the new molecule. Combined with scaffolding tree generation and adversarial training, CORE significantly improves several recent molecule optimization methods on various measures, including drug-likeness (QED), dopamine receptor (DRD2), and penalized LogP. We tested CORE and baselines on the ZINC database; CORE obtained up to 11% and 21% relative improvement in success rate over the baselines on the complete test set and on the subset with infrequent substructures, respectively.
AB - Molecule optimization aims to generate a molecule Y with more desirable properties from an input molecule X. State-of-the-art approaches partition molecules into a large set of substructures S and grow the new molecule by iteratively predicting which substructure from S to add. However, because S is large, this iterative prediction is often inaccurate, especially for substructures that are infrequent in the training data. To address this challenge, we propose a new generation strategy called “Copy&Refine” (CORE): at each step, the generator first decides whether to copy an existing substructure from the input X or to generate a new substructure, and then adds the most promising substructure to the new molecule. Combined with scaffolding tree generation and adversarial training, CORE significantly improves several recent molecule optimization methods on various measures, including drug-likeness (QED), dopamine receptor (DRD2), and penalized LogP. We tested CORE and baselines on the ZINC database; CORE obtained up to 11% and 21% relative improvement in success rate over the baselines on the complete test set and on the subset with infrequent substructures, respectively.
UR - http://www.scopus.com/inward/record.url?scp=85095533896&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85095533896&partnerID=8YFLogxK
U2 - 10.1609/aaai.v34i01.5404
DO - 10.1609/aaai.v34i01.5404
M3 - Conference contribution
AN - SCOPUS:85095533896
T3 - AAAI 2020 - 34th AAAI Conference on Artificial Intelligence
SP - 638
EP - 645
BT - AAAI 2020 - 34th AAAI Conference on Artificial Intelligence
PB - Association for the Advancement of Artificial Intelligence (AAAI) Press
Y2 - 7 February 2020 through 12 February 2020
ER -