1. Features and performance of some outlier detection methods
- Author
- Gianfranco Genta, Emanuele Modesto Barini, Raffaello Levi, and Giulio Barbato
- Subjects
Statistics and Probability; Order statistic; Robust statistics; Experimental data; computer.software_genre; Sample size determination; Outlier; Anomaly detection; Data mining; Statistics, Probability and Uncertainty; Statistical theory; computer; Mathematics; Statistical hypothesis testing
- Abstract
Several statistical methods currently in use for outlier identification are reviewed, and their performance is compared theoretically for typical statistical distributions of experimental data, taking values derived from the distribution of extreme order statistics as reference terms. A simple modification of a popular, widely used method based on the box plot is introduced to overcome a major limitation concerning sample size. The methods considered are applied to two data sets: a historical one, concerning the evaluation of an astronomical constant by a number of leading observatories, and a substantial database from an ongoing investigation on the absolute measurement of gravitational acceleration, which exhibits peculiar aspects concerning outliers. Some problems related to outlier treatment are examined, and the need for both statistical analysis and expert opinion in proper outlier management is underlined.
- Published
- 2011