Performance of machine-learning scoring functions in structure-based virtual screening

Authors :
Pedro J. Ballester
Maciej Wójcikowski
Pawel Siedlecki
Institute of Biochemistry and Biophysics PAS, Ul. Pawinskiego 5A, 02-106 Warsaw
Institute of Biochemistry and Biophysics [Warsaw] (IBB)
Centre de Recherche en Cancérologie de Marseille (CRCM)
Aix Marseille Université (AMU)-Institut Paoli-Calmettes
Fédération nationale des Centres de lutte contre le Cancer (FNCLCC)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS)
MITOYAN, Louciné
Institute of Biochemistry and Biophysics
Centre National de la Recherche Scientifique (CNRS)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Institut Paoli-Calmettes
Fédération nationale des Centres de lutte contre le Cancer (FNCLCC)-Aix Marseille Université (AMU)
Source :
Scientific Reports, Nature Publishing Group, 2017, 7, ⟨10.1038/srep46710⟩
Publication Year :
2017
Publisher :
HAL CCSD, 2017.

Abstract

Classical scoring functions have reached a plateau in their performance in virtual screening and binding affinity prediction. Recently, machine-learning scoring functions trained on protein-ligand complexes have shown great promise in small tailored studies. They have also raised controversy, specifically concerning model overfitting and applicability to novel targets. Here we provide a new ready-to-use scoring function (RF-Score-VS) trained on 15 426 active and 893 897 inactive molecules docked to a set of 102 targets. We use the full DUD-E data sets along with three docking tools, five classical and three machine-learning scoring functions for model building and performance assessment. Our results show that RF-Score-VS can substantially improve virtual screening performance: the top 1% of molecules ranked by RF-Score-VS gives a 55.6% hit rate, whereas the top 1% ranked by Vina gives only 16.2% (for smaller percentages the difference is even more encouraging: the RF-Score-VS top 0.1% achieves an 88.6% hit rate versus 27.5% using Vina). In addition, RF-Score-VS provides much better prediction of measured binding affinity than Vina (Pearson correlation of 0.56 and −0.18, respectively). Lastly, we test RF-Score-VS on an independent test set from the DEKOIS benchmark and observe comparable results. We provide full data sets to facilitate further research in this area (http://github.com/oddt/rfscorevs) as well as ready-to-use RF-Score-VS (http://github.com/oddt/rfscorevs_binary).
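
The "top x% hit rate" quoted in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `hit_rate_at_top`, the `higher_is_better` flag, and the toy scores and labels below are all assumptions made for illustration, and the sketch ignores details such as per-target averaging and tie handling.

```python
from typing import Sequence


def hit_rate_at_top(
    scores: Sequence[float],
    is_active: Sequence[bool],
    fraction: float,
    higher_is_better: bool = True,
) -> float:
    """Share of actives among the best-scoring `fraction` of molecules.

    Hypothetical helper for illustration; not taken from RF-Score-VS.
    """
    if len(scores) != len(is_active):
        raise ValueError("scores and is_active must have the same length")
    if not scores:
        raise ValueError("at least one molecule is required")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Keep at least one molecule so tiny fractions stay well defined.
    n_top = max(1, int(len(scores) * fraction))

    # Rank molecules from best to worst predicted score.
    ranked = sorted(
        zip(scores, is_active),
        key=lambda pair: pair[0],
        reverse=higher_is_better,
    )
    hits = sum(1 for _, active in ranked[:n_top] if active)
    return hits / n_top


# Toy data (invented): ten docked molecules, four of them active.
toy_scores = [9.1, 8.7, 8.2, 7.9, 7.5, 6.8, 6.1, 5.4, 4.9, 4.2]
toy_labels = [True, True, False, True, False, False, False, False, True, False]

top20 = hit_rate_at_top(toy_scores, toy_labels, 0.2)  # best 2 molecules
top50 = hit_rate_at_top(toy_scores, toy_labels, 0.5)  # best 5 molecules
baseline = sum(toy_labels) / len(toy_labels)          # hit rate of a random pick
print(top20, top50, baseline)
```

Comparing the hit rate in the top fraction with the baseline share of actives shows how much a ranking enriches actives over random selection. The `higher_is_better` flag is included because some scoring functions report energies, where a lower value means a better predicted binder.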

Details

Language :
English
ISSN :
20452322
Database :
OpenAIRE
Journal :
Scientific Reports, Nature Publishing Group, 2017, 7, ⟨10.1038/srep46710⟩
Accession number :
edsair.doi.dedup.....fcc7f816863c3f2a6ad22875f24a08c5
Full Text :
https://doi.org/10.1038/srep46710