1. MIR_MAD: An Efficient and On-line Approach for Anomaly Detection in Dynamic Data Stream
- Author
Chang How Tan, Vincent C. S. Lee, and Mahsa Salehi
- Subjects
Data stream; Mahalanobis distance; Concept drift; Computer science; Dynamic data; Feature extraction; 02 engineering and technology; Data modeling; Reduction (complexity); 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Anomaly detection; Algorithm
- Abstract
Anomaly detection in a dynamic data stream is a challenging task. The unbounded length and high arrival rate of the data prevent anomaly detection models from storing all observations in memory for processing. In addition, the dynamically changing properties of the data stream give rise to concept drift. While recent studies focus on feature extraction for anomaly detection, the majority of them assume that data streams are static, ignoring the possibility of concept drift. Anomaly detection models must operate efficiently to deal with high-volume, high-velocity data; that is, they must have low complexity and learn incrementally from each arriving observation. Incremental learning allows the model to adapt to concept drift. When the drift rate is higher than the adaptation rate, the ability to detect concept drift and retrain a new model is preferable, as it minimizes performance loss. In this paper, we propose MIR_MAD, an approach based on multiple incremental robust Mahalanobis estimators that is efficient, learns incrementally, and can detect concept drift. MIR_MAD is fast, can be initialized with a small amount of data, and can estimate the location of the drift in the data stream. Our empirical results show that MIR_MAD achieves state-of-the-art performance and is significantly faster. We also present a case study showing that detecting concept drift is critical to minimizing the loss in model performance.
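To make the idea of incremental Mahalanobis scoring concrete, the following is a minimal sketch and not the MIR_MAD algorithm itself. It assumes a plain (non-robust) running mean and covariance updated with a Welford-style rule, uses a single estimator rather than multiple ones, and performs no drift detection. The class name `IncrementalMahalanobis`, the helper `solve`, and the `ridge` parameter are hypothetical names chosen for illustration.

```python
import math


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


class IncrementalMahalanobis:
    """Running mean and covariance with Mahalanobis scoring.

    Assumption: plain sample estimates, not the robust estimators
    described in the abstract.
    """

    def __init__(self, dim, ridge=1e-6):
        self.dim = dim
        self.n = 0
        self.mean = [0.0] * dim
        # Accumulated outer products of deviations (Welford's M2 term).
        self.m2 = [[0.0] * dim for _ in range(dim)]
        self.ridge = ridge  # small diagonal term to keep the covariance invertible

    def update(self, x):
        """Fold one observation into the estimates in O(dim^2) time."""
        self.n += 1
        before = [xi - mi for xi, mi in zip(x, self.mean)]
        self.mean = [mi + di / self.n for mi, di in zip(self.mean, before)]
        after = [xi - mi for xi, mi in zip(x, self.mean)]
        for i in range(self.dim):
            for j in range(self.dim):
                self.m2[i][j] += before[i] * after[j]

    def covariance(self):
        if self.n < 2:
            raise ValueError("need at least two observations")
        return [
            [self.m2[i][j] / (self.n - 1) + (self.ridge if i == j else 0.0)
             for j in range(self.dim)]
            for i in range(self.dim)
        ]

    def score(self, x):
        """Mahalanobis distance of x from the running mean."""
        d = [xi - mi for xi, mi in zip(x, self.mean)]
        z = solve(self.covariance(), d)
        return math.sqrt(max(0.0, sum(di * zi for di, zi in zip(d, z))))


est = IncrementalMahalanobis(dim=2)
for point in [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]:
    est.update(point)
print(est.score((3.0, 3.0)), est.score((10.0, -10.0)))
```

Each observation is processed once and then discarded, so memory use stays constant regardless of stream length. A point far from the running mean, relative to the estimated covariance, receives a larger score and can be flagged against a threshold chosen by the user.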
- Published
2020