Back to Search Start Over

ESA-Stream: Efficient Self-Adaptive Online Data Stream Clustering.

Authors :
Li, Yanni
Li, Hui
Wang, Zhi
Liu, Bing
Cui, Jiangtao
Fei, Hang
Source :
IEEE Transactions on Knowledge & Data Engineering; Feb2022, Vol. 34 Issue 2, p617-630, 14p
Publication Year :
2022

Abstract

Many big data applications produce a massive amount of high-dimensional, real-time, and evolving streaming data. Clustering such data streams with both effectiveness and efficiency are critical for these applications. Although there are well-known data stream clustering algorithms that are based on the popular online-offline framework, these algorithms still face some major challenges. Several critical questions are still not answer satisfactorily: How to perform dimensionality reduction effectively and efficiently in the online dynamic environment? How to enable the clustering algorithm to achieve complete real-time online processing? How to make algorithm parameters learn in a self-supervised or self-adaptive manner to cope with high-speed evolving streams? In this paper, we focus on tackling these challenges by proposing a fully online data stream clustering algorithm (called ESA-Stream) that can learn parameters online dynamically in a self-adaptive manner, speedup dimensionality reduction, and cluster data streams effectively and efficiently in an online and dynamic environment. Experiments on a wide range of synthetic and real-world data streams show that ESA-Stream outperforms state-of-the-art baselines considerably in both effectiveness and efficiency. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
10414347
Volume :
34
Issue :
2
Database :
Complementary Index
Journal :
IEEE Transactions on Knowledge & Data Engineering
Publication Type :
Academic Journal
Accession number :
154764001
Full Text :
https://doi.org/10.1109/TKDE.2020.2990196