Selection of optimal validation methods for quantitative structure–activity relationships and applicability domain.

Authors :
Héberger, K.
Source :
SAR & QSAR in Environmental Research; May 2023, Vol. 34, Issue 5, pp. 415-434 (20 pages)
Publication Year :
2023

Abstract

This brief literature survey groups the (numerical) validation methods and emphasizes the contradictions and confusion concerning bias, variance and predictive performance. A multicriteria decision-making analysis was carried out using the sum of absolute ranking differences (SRD), illustrated with five case studies (seven examples). SRD was applied to compare external and cross-validation techniques and indicators of predictive performance, and to select optimal methods for determining the applicability domain (AD). The ordering of model validation methods was in accordance with the statements of the original authors, but these statements contradict one another, suggesting that any variant of cross-validation can be superior or inferior to other variants depending on the algorithm, data structure and circumstances applied. A simple fivefold cross-validation proved to be superior to the Bayesian Information Criterion in the vast majority of situations. It is simply not sufficient to test a numerical validation method in one situation only, even if it is a well-defined one. SRD, as a preferable multicriteria decision-making algorithm, is suitable for tailoring the techniques for validation, and for the optimal determination of the applicability domain according to the dataset in question. [ABSTRACT FROM AUTHOR]
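
The abstract does not spell out how SRD is computed. As a rough illustration only, the core idea can be sketched as follows: each method (column) and a reference column are ranked over the same objects (rows), and the absolute rank differences are summed, so a smaller SRD means a method orders the objects more like the reference. The function names, the use of the row mean as the reference, and the toy numbers are all assumptions made for this sketch; the normalization and randomization-test steps that usually accompany SRD are omitted, and none of this is taken from the paper itself.

```python
def average_ranks(values):
    """Rank values ascending (1 = smallest); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def row_mean_reference(columns):
    """Assumed reference: the mean of each row across all methods."""
    return [sum(row) / len(row) for row in zip(*columns.values())]


def srd(columns, reference):
    """Sum of absolute ranking differences of each column versus the reference."""
    ref_ranks = average_ranks(reference)
    return {
        name: sum(abs(r - q) for r, q in zip(average_ranks(col), ref_ranks))
        for name, col in columns.items()
    }


# Hypothetical scores of three methods (columns) on four objects (rows).
methods = {
    "A": [1, 2, 3, 4],
    "B": [4, 3, 2, 1],
    "C": [1, 3, 2, 4],
}
scores = srd(methods, row_mean_reference(methods))
for name in sorted(scores, key=scores.get):
    print(name, scores[name])
```

In this toy input, method C ranks the objects exactly as the row-mean reference does, so it receives the smallest SRD; a different choice of reference (for example a known gold standard, or the row minimum or maximum) could change the ordering.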

Details

Language :
English
ISSN :
1062-936X
Volume :
34
Issue :
5
Database :
Complementary Index
Journal :
SAR & QSAR in Environmental Research
Publication Type :
Academic Journal
Accession number :
164084800
Full Text :
https://doi.org/10.1080/1062936X.2023.2214871