1. Highly comparative time-series analysis
- Author
Fulcher, Benjamin D., Jones, Nick S., and Little, Max
- Subjects
519.56, Bioinformatics (life sciences), Pattern recognition (statistics), time-series analysis, classification, high throughput methods
- Abstract
In this thesis, a highly comparative framework for time-series analysis is developed. The approach draws on large, interdisciplinary collections of over 9000 time-series analysis methods, or operations, and over 30 000 time series, which we have assembled. Statistical learning methods were used to analyze structure in the set of operations applied to the time series, allowing us to relate different types of scientific methods to one another, and to investigate redundancy across them. An analogous process applied to the data allowed different types of time series to be linked based on their properties, and in particular to connect time series generated by theoretical models with those measured from relevant real-world systems. In the remainder of the thesis, methods for addressing specific problems in time-series analysis are presented that use our diverse collection of operations to represent time series in terms of their measured properties. The broad utility of this highly comparative approach is demonstrated using various case studies, including the discrimination of pathological heart beat series, classification of Parkinsonian phonemes, estimation of the scaling exponent of self-affine time series, prediction of cord pH from fetal heart rates recorded during labor, and the assignment of emotional content to speech recordings. Our methods are also applied to labeled datasets of short time-series patterns studied in temporal data mining, where our feature-based approach exhibits benefits over conventional time-domain classifiers. Lastly, a feature-based dimensionality reduction framework is developed that links dependencies measured between operations to the number of free parameters in a time-series model that could be used to generate a time-series dataset.
- Published
- 2012