Single-step retrosynthesis prediction by leveraging commonly preserved substructures

Authors :
Lei Fang
Junren Li
Ming Zhao
Li Tan
Jian-Guang Lou
Source :
Nature Communications. 14
Publication Year :
2023
Publisher :
Springer Science and Business Media LLC, 2023.

Abstract

Retrosynthesis analysis is an important task in organic chemistry with numerous industrial applications. Previously, machine learning approaches employing natural language processing techniques achieved promising results in this task by first representing product molecules as strings and subsequently predicting reactant molecules using text generation or machine translation models. Chemists cannot readily derive useful insights from traditional approaches that rely largely on atom-level decoding in the string representations, because human experts tend to interpret reactions by analyzing substructures that comprise a molecule. It is well established that some substructures are stable and remain unchanged in reactions. In this paper, we developed a substructure-level decoding model, where commonly preserved portions of product molecules were automatically extracted with a fully data-driven approach. Our model achieves improvement over previously reported models, and we demonstrate that its performance can be boosted further by enhancing the accuracy of these substructures. Analyzing substructures extracted from our machine learning model can provide human experts with additional insights to assist decision-making in retrosynthesis analysis.

Details

ISSN :
20411723
Volume :
14
Database :
OpenAIRE
Journal :
Nature Communications
Accession number :
edsair.doi...........0cd59fd17a6263eaaaf06eb7e5bf0bfd
Full Text :
https://doi.org/10.1038/s41467-023-37969-w