Reliably assessing prediction reliability for high dimensional QSAR data.

Authors :
Huang, Jianping
Fan, Xiaohui
Source :
Molecular Diversity; Feb 2013, Vol. 17 Issue 1, p63-73, 11p
Publication Year :
2013

Abstract

Predictability and prediction reliability are of utmost importance in characterizing a good quantitative structure-activity relationship (QSAR) model. However, validation methods are insufficient to guarantee the prediction reliability of QSAR models. Moreover, high-dimensional samples pose a great challenge to traditional methods in terms of predictive power. Therefore, this study presents a predictive classifier (i.e., TreeEC) that can assess prediction reliability with high confidence, especially for high-dimensional QSAR data. Two approaches for assessing prediction reliability are provided, i.e., applicability domain and prediction confidence. We demonstrate that the applicability domain has difficulty guaranteeing the models' prediction reliability: samples very close to the domain center are often predicted more poorly than those outside the domain. Instead, prediction confidence is more promising for assessing prediction reliability. Based on a large data set assessed by prediction confidence, external samples assessed with confidence greater than 95 % can be reliably predicted with an accuracy of 94 %, in contrast to the average accuracy of 84 %. We also illustrate that TreeEC is less affected by high dimensionality than other popular methods according to 11 public data sets. A free version of TreeEC with a user-friendly interface can also be downloaded from the website. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
1381-1991
Volume :
17
Issue :
1
Database :
Complementary Index
Journal :
Molecular Diversity
Publication Type :
Academic Journal
Accession number :
85300669
Full Text :
https://doi.org/10.1007/s11030-012-9415-9