
A novel multi-core algorithm for frequent itemsets mining in data streams.

Authors :
Bustio-Martínez, Lázaro
Muñoz-Briseño, Alfredo
Cumplido, René
Hernández-León, Raudel
Feregrino-Uribe, Claudia
Source :
Pattern Recognition Letters. Jul2019, Vol. 125, p241-248. 8p.
Publication Year :
2019

Abstract

• An algorithm for multi-core frequent itemsets mining in data streams is presented.
• A processing scheme is proposed that can process data streams transmitted at different rates.
• The Gearman framework is used to implement the proposed multi-core processing algorithm.
• Hashing and lexicographic ordering of received items are used for frequent itemsets mining in data streams.

Data streams are modern data sources that are gaining attention because of their many practical applications (they can be found in data transmission, eCommerce, and intrusion detection systems, among others). Nevertheless, efforts to obtain insights from data streams are limited by their massive information volume and the time needed to process them. This paper proposes a new prefix-tree-based approach to Frequent Itemsets Mining on data streams that takes advantage of multi-core systems. The approach uses the Gearman framework as the interface for multi-core processing, which allows the scalability of such systems to be exploited efficiently. Experimental results show that the proposed method obtains the same patterns as similar state-of-the-art approaches and outperforms them in processing time. It is also shown that the proposed method is insensitive to variations in the support threshold value, and that its efficiency depends on the size of the transactions and not on the size of the alphabet; dependence on alphabet size is a significant issue in other Frequent Itemsets Mining algorithms. [ABSTRACT FROM AUTHOR]
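
The abstract names its building blocks (prefix trees, hashing, lexicographic ordering of items) without giving implementation details. As a rough illustration only, and not the authors' algorithm, the sketch below counts itemsets from a sequence of transactions in a prefix tree whose children are stored in a hash map, with each transaction sorted lexicographically before insertion. The names `Node`, `add_transaction`, `frequent_itemsets`, and the `max_len` cap are hypothetical, and the Gearman-based multi-core distribution is omitted: this is a single-process sketch.

```python
from itertools import combinations


class Node:
    """One prefix-tree node: a support count plus hashed child links."""

    def __init__(self):
        self.count = 0
        self.children = {}  # item -> Node (a dict gives hashed lookup)


def add_transaction(root, transaction, max_len=3):
    """Count every itemset of up to max_len items found in one transaction.

    max_len is an assumption of this sketch, used to bound the work done
    per transaction; it is not taken from the paper.
    """
    items = sorted(set(transaction))  # lexicographic order, duplicates dropped
    for size in range(1, min(max_len, len(items)) + 1):
        for itemset in combinations(items, size):
            node = root
            for item in itemset:
                node = node.children.setdefault(item, Node())
            node.count += 1  # only the node that ends the itemset is counted


def frequent_itemsets(root, min_count):
    """Return {itemset: count} for itemsets seen at least min_count times."""
    found = {}
    stack = [((), root)]
    while stack:
        prefix, node = stack.pop()
        for item, child in node.children.items():
            if child.count >= min_count:
                itemset = prefix + (item,)
                found[itemset] = child.count
                # An extension can never be more frequent than its prefix,
                # so infrequent branches are not explored further.
                stack.append((itemset, child))
    return found


if __name__ == "__main__":
    root = Node()
    for transaction in (["b", "a", "c"], ["a", "b"], ["a", "c"], ["b"]):
        add_transaction(root, transaction)
    print(frequent_itemsets(root, min_count=2))
```

In this sketch the cost of inserting a transaction grows with the number of items in that transaction rather than with the number of distinct items seen overall, which is in the spirit of the abstract's remark about transaction size versus alphabet size; whether the paper achieves this in the same way is not stated in this record.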

Details

Language :
English
ISSN :
01678655
Volume :
125
Database :
Academic Search Index
Journal :
Pattern Recognition Letters
Publication Type :
Academic Journal
Accession number :
138228511
Full Text :
https://doi.org/10.1016/j.patrec.2019.05.003