Unsupervised manifold embedding to encode molecular quantum information for supervised learning of chemical data.

Authors :
Li, Tonglei
Huls, Nicholas J.
Lu, Shan
Hou, Peng
Source :
Communications Chemistry; 6/11/2024, Vol. 7 Issue 1, p1-16, 16p
Publication Year :
2024

Abstract

Molecular representation is critical in chemical machine learning. It governs the complexity of model development and the amount of training data required to avoid over- or under-fitting. As electronic structures and associated attributes are the root cause of molecular interactions and their manifested properties, we have sought to examine the local electronic information on a molecular manifold to understand and predict molecular interactions. Our efforts led to the development of a lower-dimensional representation of a molecular manifold, Manifold Embedding of Molecular Surface (MEMS), to embody surface electronic quantities. By treating a molecular surface as a manifold and computing its embeddings, the embedded electronic attributes retain the chemical intuition of molecular interactions. MEMS can be further featurized as input for chemical learning. Our solubility prediction with MEMS demonstrated the feasibility of both shallow and deep learning by neural networks, suggesting that MEMS is expressive and robust against dimensionality reduction.

In machine learning of molecular properties, adequate molecular representation is crucial, as minor structural changes can result in significant differences in the activity of interest. Here, the authors use a lower-dimensional representation of a molecular manifold to embed electronic attributes and retain the chemical intuition of molecular interactions. The representation is invariant to the translational and rotational degrees of freedom of the molecular surface, which eases model complexity and facilitates generalization even with relatively small datasets. [ABSTRACT FROM AUTHOR]
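The invariance claim in the abstract can be illustrated with a minimal sketch. This is not the MEMS algorithm, which the abstract does not specify. It only shows why a representation built from distances between surface points is unaffected by rotating or translating the molecule. The point coordinates, the helper names `pairwise_distances` and `rigid_motion`, and the use of Euclidean distance as a stand-in are all assumptions made for illustration.

```python
import math

def pairwise_distances(points):
    """Return the matrix of Euclidean distances between 3D points."""
    return [[math.dist(p, q) for q in points] for p in points]

def rigid_motion(points, angle, shift):
    """Rotate points about the z-axis by `angle` (radians), then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in points
    ]

# Hypothetical surface sample (placeholder coordinates, not real molecular data)
surface_points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (-1.0, 0.0, 0.0)]

# Apply an arbitrary rotation and translation to the same points
moved_points = rigid_motion(surface_points, angle=0.7, shift=(2.0, -1.0, 3.5))

d_before = pairwise_distances(surface_points)
d_after = pairwise_distances(moved_points)

# Largest change in any pairwise distance; only floating-point round-off remains
max_diff = max(
    abs(a - b)
    for row_before, row_after in zip(d_before, d_after)
    for a, b in zip(row_before, row_after)
)
invariant = max_diff < 1e-12
```

Because the distance matrix is unchanged by the rigid motion, any embedding computed from it alone would also be unchanged, which is the property the abstract describes.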

Details

Language :
English
ISSN :
2399-3669
Volume :
7
Issue :
1
Database :
Complementary Index
Journal :
Communications Chemistry
Publication Type :
Academic Journal
Accession number :
177816477
Full Text :
https://doi.org/10.1038/s42004-024-01217-z