A data stream classification method based on distance and a sampling mechanism.

Authors :
胡学钢
何俊宏
李培培
Source :
Application Research of Computers / Jisuanji Yingyong Yanjiu. Apr2018, Vol. 35 Issue 4, p992-1000. 5p.
Publication Year :
2018

Abstract

Data stream classification is widely used in sensor networks, network monitoring, and other real-world applications. However, class imbalance and missing labels in data streams make classification considerably harder. This paper therefore proposes an ensemble classification method based on distance evaluation and sampling for incompletely labeled data streams with imbalanced class distributions. The method first computes the distance between each unlabeled instance and the center point of the labeled data chunks to partition the instances into positive and negative classes. Second, to balance the class distribution of the current data chunk, it reconstructs the chunk by over-sampling positive instances and under-sampling negative instances, and then uses the reconstructed chunk to build an ensemble classification model. Experiments on simulated incompletely labeled data streams with class imbalance show that, compared with a classical algorithm of the same kind, the proposed method improves classification accuracy while reducing the influence of imbalanced class distribution. [ABSTRACT FROM AUTHOR]
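
The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the distance measure, the partition rule, or the sampling ratio, so the sketch assumes Euclidean distance, a nearest-center rule, random sampling, and a target class size equal to half of the chunk. The function names `centroid`, `assign_by_distance`, and `rebalance` are hypothetical, and the ensemble-building step is omitted.

```python
import math
import random


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def assign_by_distance(unlabeled, pos_labeled, neg_labeled):
    """Split unlabeled instances by the nearer class center.

    Assumption: Euclidean distance and a nearest-center rule; the abstract
    only says that distances to the labeled center points are used.
    """
    pos_center = centroid(pos_labeled)
    neg_center = centroid(neg_labeled)
    pos, neg = [], []
    for x in unlabeled:
        if math.dist(x, pos_center) <= math.dist(x, neg_center):
            pos.append(x)
        else:
            neg.append(x)
    return pos, neg


def rebalance(pos, neg, rng):
    """Over-sample positives and under-sample negatives to a common size.

    Assumption: positives are the minority class, and the target size is
    half of the chunk; the paper's actual ratio is not stated in the abstract.
    """
    if not pos or len(pos) > len(neg):
        raise ValueError("expected a non-empty minority positive class")
    target = (len(pos) + len(neg)) // 2
    # Over-sampling: draw extra positives with replacement.
    extra = [rng.choice(pos) for _ in range(target - len(pos))]
    # Under-sampling: keep a random subset of negatives without replacement.
    return pos + extra, rng.sample(neg, target)


if __name__ == "__main__":
    pos_labeled = [[0.0, 0.0], [1.0, 1.0]]
    neg_labeled = [[10.0, 10.0], [12.0, 12.0]]
    unlabeled = [[0.2, 0.1], [9.0, 9.5], [11.0, 10.0], [10.5, 12.0], [12.0, 11.0]]

    pos_new, neg_new = assign_by_distance(unlabeled, pos_labeled, neg_labeled)
    pos_bal, neg_bal = rebalance(
        pos_labeled + pos_new, neg_labeled + neg_new, random.Random(0)
    )
    print(len(pos_bal), len(neg_bal))
```

The rebalanced chunk would then be passed to whatever base learner trains the ensemble members. Passing an explicit `random.Random` instance keeps the sampling reproducible across chunks.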

Details

Language :
Chinese
ISSN :
10013695
Volume :
35
Issue :
4
Database :
Academic Search Index
Journal :
Application Research of Computers / Jisuanji Yingyong Yanjiu
Publication Type :
Academic Journal
Accession number :
175657823
Full Text :
https://doi.org/10.3969/j.issn.1001-3695.2018.04.007