1. A Distributed Frequent Itemset Mining Algorithm for Uncertain Data.
- Author
Jiaman Ding, Haibin Li, Yang Yang, Lianyin Ji, and Jinguo You
- Subjects
BIG data, ALGORITHMS, DATA mining
- Abstract
With the rapid expansion of big data in all domains, improving the performance of frequent pattern mining in massive uncertain datasets has become a major research topic in recent years. Most conventional frequent pattern mining approaches take expectation, probability, or weight as the single factor of item support, and algorithms that consider both probability and weight struggle to remain efficient at big-data scale. Therefore, we propose a distributed frequent itemset mining algorithm for uncertain data: Dfimud. First, Dfimud calculates the maximum probability weight value of 1-items and prunes the items whose value is less than the given threshold. Second, to reduce the number of dataset scans, a distributed Dfimud-tree structure inspired by FP-Tree is designed to mine frequent patterns. Finally, experiments on publicly available UCI datasets demonstrate that Dfimud achieves better results than other related approaches across various metrics. In addition, the empirical study shows that Dfimud has good scalability. [ABSTRACT FROM AUTHOR]
- Published
- 2019
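The first step named in the abstract, pruning 1-items by their maximum probability weight value, could look roughly like the sketch below. The abstract does not define that value, so this sketch assumes it is the item's weight multiplied by the highest existence probability the item has in any transaction. The function name `prune_one_items`, the dict-based transaction layout, and the default weight of 1.0 are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def prune_one_items(transactions, weights, threshold):
    """Return the 1-items that survive pruning.

    ASSUMPTION: the "maximum probability weight value" of an item is
    (highest probability of the item in any transaction) * (item weight).
    The paper may define it differently.
    """
    # Track the largest existence probability seen for each item.
    max_prob = defaultdict(float)
    for transaction in transactions:
        for item, prob in transaction.items():
            max_prob[item] = max(max_prob[item], prob)

    # Prune items whose value is less than the threshold (keep >= threshold).
    # Items without an explicit weight default to 1.0 (an assumption).
    return {
        item
        for item, prob in max_prob.items()
        if prob * weights.get(item, 1.0) >= threshold
    }


# Each transaction maps an item to its existence probability.
transactions = [
    {"a": 0.9, "b": 0.2},
    {"a": 0.5, "c": 0.8},
]
weights = {"a": 0.5, "b": 1.0, "c": 0.5}
survivors = prune_one_items(transactions, weights, threshold=0.3)
```

With this toy data, item `b` is pruned because its value (0.2) is below the threshold, while `a` (0.45) and `c` (0.4) are kept. In a distributed setting, the per-item maximum would presumably be computed per partition and then merged, but the abstract gives no detail on that.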