
Transfer learning graph representations of molecules for pKa, 13C-NMR, and solubility.

Authors :
El-Samman, A.M.
De Castro, S.
Morton, B.
De Baerdemacker, S.
Source :
Canadian Journal of Chemistry; 2024, Vol. 102 Issue 4, p275-288, 14p
Publication Year :
2024

Abstract

We explore transfer learning models from a pre-trained graph convolutional neural network representation of molecules, obtained from SchNet, to predict 13C-NMR, pKa, and log S solubility. SchNet learns a graph representation of a molecule by associating each atom with an "embedding vector" and lets the atom-embeddings interact with each other through graph convolutional filters on their interatomic distances. We pre-trained SchNet on molecular energy and demonstrate that the pre-trained atomistic embeddings can then be used as a transferable representation for a wide array of properties. On the one hand, for atomic properties such as micro-pKa and 13C-NMR, we investigate two models, one linear and one neural net, that take the pre-trained atom-embeddings of a particular atom (e.g., carbon) as input and predict a local property (e.g., 13C-NMR). On the other hand, for molecular properties such as solubility, a size-extensive graph model is built using the embeddings of all atoms in the molecule as input. For all cases, qualitatively correct predictions are made with relatively little training data (<1000 training points), showcasing the ease with which pre-trained embeddings pick up on important chemical patterns. The proposed models successfully capture well-understood trends of pKa and solubility. This study advances our understanding of current neural net graph representations and their capacity for transfer learning applications in chemistry. [ABSTRACT FROM AUTHOR]
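
The two readout styles named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 3-dimensional toy embeddings, and the weights below are hypothetical stand-ins, and the per-atom model is assumed to be a plain linear map for simplicity. An atomic property is read from a single atom's embedding, while a molecular property is the sum of per-atom contributions, which makes the prediction scale with the number of atoms (size-extensive).

```python
from typing import List, Sequence

Vector = Sequence[float]


def linear_readout(embedding: Vector, weights: Vector, bias: float = 0.0) -> float:
    """Predict a local (atomic) property from one atom's embedding.

    Assumption: the embedding would come from a pre-trained network;
    here it is just a list of floats.
    """
    if len(embedding) != len(weights):
        raise ValueError("embedding and weights must have the same length")
    return sum(w * x for w, x in zip(weights, embedding)) + bias


def size_extensive_readout(embeddings: List[Vector], weights: Vector,
                           bias: float = 0.0) -> float:
    """Predict a molecular property as a sum of per-atom contributions."""
    return sum(linear_readout(e, weights, bias) for e in embeddings)


# Hypothetical toy data: two atoms with 3-dimensional embeddings.
toy_molecule = [[1.0, 0.0, 2.0], [0.5, 1.0, 0.0]]
toy_weights = [0.5, -1.0, 0.25]

atomic_value = linear_readout(toy_molecule[0], toy_weights)
molecular_value = size_extensive_readout(toy_molecule, toy_weights)

# Duplicating every atom doubles the summed prediction (size-extensivity).
doubled_value = size_extensive_readout(toy_molecule * 2, toy_weights)
```

In this sketch, summing rather than averaging the per-atom terms is what gives the size-extensive behaviour; the record itself does not specify the pooling used in the paper.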

Details

Language :
English
ISSN :
00084042
Volume :
102
Issue :
4
Database :
Complementary Index
Journal :
Canadian Journal of Chemistry
Publication Type :
Academic Journal
Accession number :
176331035
Full Text :
https://doi.org/10.1139/cjc-2023-0152