
Ligand-based approaches to activity prediction for the early stage of structure–activity–relationship progression.

Authors :
Maeda, Itsuki
Sato, Akinori
Tamura, Shunsuke
Miyao, Tomoyuki
Source :
Journal of Computer-Aided Molecular Design. Mar2022, Vol. 36 Issue 3, p237-252. 16p.
Publication Year :
2022

Abstract

The retrospective evaluation of virtual screening approaches and activity prediction models is important for methodological development. However, for fair comparison, evaluation data sets must be carefully prepared. In this research, we compiled structure–activity–relationship matrix-based data sets for 15 biological targets along with many diverse inactive compounds, assuming the early stage of structure–activity–relationship progression. To use a large number of diverse inactive compounds and a limited number of active compounds, similarity profiles (SPs) are proposed as a set of molecular descriptors. Using these highly imbalanced data sets, we evaluated various approaches including SPs, under-sampling, support vector machine (SVM), and message passing neural networks. We found that for the under-sampling approaches, cluster-based sampling is better than random sampling. For virtual screening, SPs with inactive reference compounds and the under-sampling SVM also perform well. For classification, SPs with many inactive references performed as well as the under-sampling SVM trained on a balanced data set. Although the performance of SPs and the under-sampling SVM was comparable, SPs with many inactive references were preferable for selecting structurally distinct compounds from the active training compounds. [ABSTRACT FROM AUTHOR]
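
The abstract names similarity profiles (SPs) as descriptors but does not say how they are computed. As a minimal sketch of one plausible reading, assume an SP is the vector of Tanimoto similarities between a compound and a fixed list of reference compounds (a few actives plus many inactives). The fingerprint representation (sets of "on" bit indices), the function names, and the toy data below are illustrative assumptions, not the authors' implementation.

```python
from typing import FrozenSet, List, Sequence

# Assumption: a fingerprint is modelled as the set of its "on" bit indices.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similarity_profile(query: Fingerprint,
                       references: Sequence[Fingerprint]) -> List[float]:
    """Hypothetical SP: one similarity value per reference compound."""
    return [tanimoto(query, ref) for ref in references]


# Toy data (invented for illustration): two active and two inactive references.
actives = [frozenset({1, 2, 3, 4}), frozenset({2, 3, 4, 5})]
inactives = [frozenset({10, 11, 12}), frozenset({1, 10, 20, 30})]
references = actives + inactives

query = frozenset({1, 2, 3, 9})
profile = similarity_profile(query, references)
print(profile)
```

Under this reading, the descriptor length equals the number of reference compounds, so adding more inactive references lengthens the profile without requiring more actives. The resulting vectors could then be passed to any standard classifier.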

Details

Language :
English
ISSN :
0920654X
Volume :
36
Issue :
3
Database :
Academic Search Index
Journal :
Journal of Computer-Aided Molecular Design
Publication Type :
Academic Journal
Accession number :
156222009
Full Text :
https://doi.org/10.1007/s10822-022-00449-2