Sample size for the evaluation of presence-absence models.

Authors :
Jiménez-Valverde, Alberto
Source :
Ecological Indicators. Jul2020, Vol. 114, pN.PAG-N.PAG. 1p.
Publication Year :
2020

Abstract

• In species distribution modelling, the sample size of the testing dataset is crucial.
• Thirty is the minimum sample size recommended.
• Sensitivity of the point of equivalency and AUC show similar bias and precision.

The sample size of the training dataset has been shown to have profound effects on the performance of species distribution models (SDMs). However, the effects that the sample size of the testing dataset can have on the assessment of a model's predictive capacity have received little attention. In this study, I used simulations to study how accurately two discrimination statistics are estimated as a function of sample size: the AUC (the area under the receiver operating characteristic, or ROC, curve) and Se* (the probability of correctly classifying any case, calculated at the threshold that minimizes the difference between sensitivity and specificity). ROC curves with known discrimination ability were simulated, samples were randomly taken, the two discrimination statistics were estimated, and the differences between the two estimators and their respective true values were computed to understand how bias and precision were affected by sample size. In general, as sample size increased, the difference between reported and true discrimination capacity decreased. There were no important differences between the estimated AUC and Se* statistics in terms of bias and precision. Under realistic scenarios where the ROC points are not necessarily part of the true underlying ROC curve, the two discrimination statistics are both unbiased and equally precise, and the higher the true discrimination capacity is, the more accurately they are estimated. A sample size between 20 and 30 is the lower limit, since below this interval the accuracy of the estimates decreases considerably. Altogether, these results are very important since many interesting SDM applications involve rare and poorly known species for which sample sizes are unavoidably small. [ABSTRACT FROM AUTHOR]
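The simulation logic described in the abstract can be illustrated with a minimal sketch. It is not the author's procedure: it assumes an equal-variance binormal model (presence scores drawn from N(d, 1), absence scores from N(0, 1)) and equal numbers of presences and absences, neither of which is stated in the abstract. The function names and the parameters `separation`, `n_per_class`, and `reps` are hypothetical. Under these assumptions the true AUC is Φ(d/√2) and the true Se* is Φ(d/2), so bias and precision can be checked against known values.

```python
import random
import statistics
from statistics import NormalDist


def auc(pos, neg):
    """Mann-Whitney estimate of the AUC: P(presence score > absence score).

    Ties count as 0.5.
    """
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def se_star(pos, neg):
    """Sensitivity at the threshold that minimizes |sensitivity - specificity|.

    Candidate thresholds are the observed scores; a case is classified as a
    presence when its score is >= the threshold.
    """
    best_gap, best_se = None, None
    for t in sorted(set(pos) | set(neg)):
        se = sum(p >= t for p in pos) / len(pos)
        sp = sum(n < t for n in neg) / len(neg)
        gap = abs(se - sp)
        if best_gap is None or gap < best_gap:
            best_gap, best_se = gap, se
    return best_se


def true_values(separation):
    """True AUC and Se* under the assumed equal-variance binormal model."""
    phi = NormalDist().cdf
    return phi(separation / 2 ** 0.5), phi(separation / 2)


def simulate(n_per_class, separation=1.5, reps=200, seed=0):
    """Draw repeated test samples; return bias and SD of both estimators."""
    rng = random.Random(seed)
    t_auc, t_se = true_values(separation)
    aucs, ses = [], []
    for _ in range(reps):
        # Assumption: balanced classes with unit-variance normal scores.
        pos = [rng.gauss(separation, 1.0) for _ in range(n_per_class)]
        neg = [rng.gauss(0.0, 1.0) for _ in range(n_per_class)]
        aucs.append(auc(pos, neg))
        ses.append(se_star(pos, neg))
    return {
        "auc_bias": statistics.mean(aucs) - t_auc,
        "auc_sd": statistics.stdev(aucs),
        "se_bias": statistics.mean(ses) - t_se,
        "se_sd": statistics.stdev(ses),
    }


if __name__ == "__main__":
    # Compare estimator bias and spread across testing-sample sizes.
    for n in (5, 10, 20, 30, 50):
        print(n, simulate(n, reps=100))
```

Running the script prints, for each sample size, the bias and standard deviation of the two estimators, which lets the spread of the estimates be compared across testing-sample sizes under the stated assumptions.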

Details

Language :
English
ISSN :
1470160X
Volume :
114
Database :
Academic Search Index
Journal :
Ecological Indicators
Publication Type :
Academic Journal
Accession number :
142869798
Full Text :
https://doi.org/10.1016/j.ecolind.2020.106289