Optimization of binding affinities in chemical space with generative pre-trained transformer and deep reinforcement learning [version 2; peer review: 3 approved, 2 approved with reservations]
- Source :
- F1000Research. 12:757
- Publication Year :
- 2024
- Publisher :
- London, UK: F1000 Research Limited, 2024.
Abstract
- Background: A key challenge in drug discovery is finding novel compounds with desirable properties. Among these properties, binding affinity to a target is a prerequisite and is usually evaluated by molecular docking or quantitative structure-activity relationship (QSAR) models.
- Methods: In this study, we developed SGPT-RL, which uses a generative pre-trained transformer (GPT) as the policy network of a reinforcement learning (RL) agent to optimize binding affinity to a target. SGPT-RL was evaluated on the Moses distribution learning benchmark and on two goal-directed generation tasks, with Dopamine Receptor D2 (DRD2) and Angiotensin-Converting Enzyme 2 (ACE2) as the targets. Both a QSAR model and molecular docking were implemented as optimization goals in these tasks. The popular Reinvent method was used as the baseline for comparison.
- Results: The results on the Moses benchmark showed that SGPT-RL learned good property distributions and generated molecules with high validity and novelty. On the two goal-directed generation tasks, both SGPT-RL and Reinvent were able to generate valid molecules with improved target scores. SGPT-RL achieved better results than Reinvent on the ACE2 task, where molecular docking was used as the optimization goal. Further analysis showed that SGPT-RL learned conserved scaffold patterns during exploration.
- Conclusions: The superior performance of SGPT-RL on the ACE2 task indicates that it can be applied to virtual screening, where molecular docking is widely used as the criterion. In addition, the scaffold patterns learned by SGPT-RL during exploration can help chemists design and discover novel lead candidates.
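The abstract does not spell out the RL objective, so the following is only an illustrative sketch of how a score such as a docking or QSAR value can steer a pre-trained generator. It uses the augmented-likelihood loss published for the Reinvent baseline; whether SGPT-RL uses exactly this form is an assumption here. The function names, the `sigma` values, and the example numbers are all hypothetical.

```python
from typing import Iterable, Tuple


def augmented_likelihood_loss(
    prior_log_likelihood: float,
    agent_log_likelihood: float,
    score: float,
    sigma: float,
) -> float:
    """Reinvent-style loss for one generated molecule (illustrative only).

    prior_log_likelihood: log-probability of the sequence under the frozen
        pre-trained model (the "prior").
    agent_log_likelihood: log-probability under the policy being trained.
    score: task score for the molecule, e.g. a scaled docking or QSAR value
        (assumed here to be in [0, 1], higher is better).
    sigma: weight of the score relative to the prior (hypothetical value).
    """
    # Target likelihood: the prior's opinion, shifted upward for good scores.
    augmented = prior_log_likelihood + sigma * score
    # Squared difference pulls the agent toward the augmented target.
    return (augmented - agent_log_likelihood) ** 2


def batch_loss(batch: Iterable[Tuple[float, float, float]], sigma: float) -> float:
    """Mean loss over (prior_ll, agent_ll, score) triples."""
    losses = [augmented_likelihood_loss(p, a, s, sigma) for p, a, s in batch]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Two made-up molecules with the same likelihoods but different scores.
    example = [(-30.0, -25.0, 0.5), (-30.0, -25.0, 1.0)]
    print(batch_loss(example, sigma=10.0))
```

In this sketch the first molecule incurs no loss because the agent already assigns it the augmented likelihood, while the higher-scoring second molecule pushes the agent to raise its probability. In a real setup the log-likelihoods would come from the transformer policy and the gradient of this loss would update its weights.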
Details
- ISSN :
- 20461402
- Volume :
- 12
- Database :
- F1000Research
- Journal :
- F1000Research
- Notes :
- Revised. Amendments from Version 1. Changes made from version 1 to version 2:
  1. Removed the repetitive explanations of abbreviations in the abstract, figure legends, and table legends, as requested by the reviewers.
  2. Incorporated the property distribution changes from the Supplementary Figures into Figures 3-4 to make the presentation clearer, as requested by the reviewers.
  3. Updated the Supplementary Figures to support the changes in 2.
  4. Updated the source data reference to follow the update in 3.
  5. Corrected several typos and removed unnecessary sentences to improve readability, as requested by the reviewers.
  6. Added descriptions to clarify the QSAR processing, as requested by a reviewer.
  7. Added a citation, as suggested by a reviewer.
  8. Added a description of how the optimization is formulated as an RL problem.
  9. Added explanations of abbreviations in the figure and table captions to make them easier to read.
  10. Changed subsection references to use names instead of numbers.
- Publication Type :
- Academic Journal
- Accession number :
- edsfor.10.12688.f1000research.130936.2
- Document Type :
- research-article
- Full Text :
- https://doi.org/10.12688/f1000research.130936.2