1. Benchmarking homogenization algorithms for monthly data
- Author
- Elke Rustemeier, T. Brandsma, J. Viarre, G. Müller-Westermeier, Peter Domonkos, Claude N. Williams, G. Vertacnik, D. Rasol, T. Likso, O. Mestre, Petr Stepanek, C. Gruber, Matthew J. Menne, M. Prohom Duran, Ralf Lindau, Sorin Cheval, Michele Brunetti, Simona Fratianni, Ingeborg Auer, Victor Venema, L. Andresen, M. Klancar, K. Kolokythas, Fiorella Acquaotta, Mónika Lakatos, Enric Aguilar, Tania Marinova, Tamás Szentimrey, Pavel Zahradníček, José Antonio Guijarro, and Pere Esteban
- Subjects
Validation study, Climate Time Series, Computer science, lcsh:Environmental protection, Stratigraphy, Homogenization (climate), COST Action, Precipitation, lcsh:Environmental pollution, monthly homogenization algorithms, real inhomogeneities, inserted inhomogeneities, blind testing, Temperature records, lcsh:TD169-171.8, Surface climate network, Precipitation records, lcsh:Environmental sciences, lcsh:GE1-350, Homogenization, Global and Planetary Change, Multiplicative function, Temperature, Paleontology, Benchmarking, Integrated approach, Missing data, Nonlinear system, lcsh:TD172-193.5, Outlier, Instrumental climate records, Algorithm
- Abstract
The COST (European Cooperation in Science and Technology) Action ES0601, "Advances in homogenization methods of climate series: an integrated approach" (HOME), has executed a blind intercomparison and validation study for monthly homogenization algorithms. Time series of monthly temperature and precipitation were evaluated because of their importance for climate studies and because they represent two important types of statistics (additive and multiplicative). The algorithms were validated against a realistic benchmark dataset. The benchmark contains real inhomogeneous data as well as simulated data with inserted inhomogeneities. Random independent break-type inhomogeneities with normally distributed breakpoint sizes were added to the simulated datasets. To approximate real-world conditions, breaks were introduced that occur simultaneously in multiple station series within a simulated network of station data. The simulated time series also contained outliers, missing data periods and local station trends. Further, a stochastic nonlinear global (network-wide) trend was added. Participants provided 25 separate homogenized contributions as part of the blind study. After the deadline at which details of the imposed inhomogeneities were revealed, 22 additional solutions were submitted. These homogenized datasets were assessed by a number of performance metrics including (i) the centered root mean square error relative to the true homogeneous value at various averaging scales, (ii) the error in linear trend estimates and (iii) traditional contingency skill scores. The metrics were computed using both the individual station series and the network-average regional series. The performance of the contributions depends significantly on the error metric considered. Contingency scores by themselves are not very informative. Although relative homogenization algorithms typically improve the homogeneity of temperature data, only the best ones improve precipitation data.
Training the users on homogenization software was found to be very important. Moreover, state-of-the-art relative homogenization algorithms developed to work with an inhomogeneous reference are shown to perform best. The study showed that automatic algorithms can perform as well as manual ones.
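
As a rough illustration of two of the metrics named in the abstract, the sketch below inserts break-type inhomogeneities with normally distributed sizes into a synthetic monthly series, then computes a centered root mean square error and the error in the linear trend. It is not the HOME benchmark code: the function names, the series length, the number of breaks, the break-size standard deviation, and the choice to shift all values after each break are assumptions made only for this example.

```python
import math
import random


def insert_breaks(series, n_breaks, break_sd, rng):
    """Add step changes with normally distributed sizes at random positions.

    Assumption: each break shifts every value from its position onward.
    """
    out = list(series)
    positions = sorted(rng.sample(range(1, len(series)), n_breaks))
    for pos in positions:
        size = rng.gauss(0.0, break_sd)
        for i in range(pos, len(out)):
            out[i] += size
    return out, positions


def centered_rmse(candidate, truth):
    """RMSE of the differences after removing their mean (constant offsets are ignored)."""
    diffs = [c - t for c, t in zip(candidate, truth)]
    mean_diff = sum(diffs) / len(diffs)
    return math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / len(diffs))


def linear_trend(series):
    """Ordinary least-squares slope per time step."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


rng = random.Random(0)  # fixed seed so the example is repeatable
truth = [rng.gauss(0.0, 1.0) for _ in range(600)]  # hypothetical 50 years of monthly anomalies
broken, breaks = insert_breaks(truth, n_breaks=5, break_sd=0.8, rng=rng)  # illustrative values

crmse = centered_rmse(broken, truth)
trend_error = linear_trend(broken) - linear_trend(truth)
```

In a benchmark setting, `broken` would be replaced by a homogenized contribution, so that `crmse` and `trend_error` measure how much of the inserted inhomogeneity remains after homogenization.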
- Published
- 2012