
Efficient Uncertain Sequence Pattern Mining Based on Hadoop Platform.

Authors :
Wu, Jimmy Ming-Tai
Liu, Shuo
Lin, Jerry Chun-Wei
Source :
Journal of Circuits, Systems & Computers; Oct2022, Vol. 31 Issue 15, p1-15, 15p
Publication Year :
2022

Abstract

In the Internet of Things (IoT) era, information is collected by sensor devices, which can result in data loss, uncertain data, and other consequences. To extract information useful for production and application from a huge warehouse of indeterminate data, the uncertain data collected must be represented using probabilities. Because the data in the database has a particular order in time or space, High-Utility Probability Sequential Pattern Mining (HUPSPM) has become a new topic of investigation and analysis in data processing. Over time, many efficient algorithms for sequential mining have been developed. However, these algorithms have a limitation: they can only be executed in a stand-alone environment and are only suitable for small datasets. Introducing an advanced graph framework for processing large datasets therefore addresses the shortcomings of the existing methods. The proposed algorithm can avoid repeated database searching, split the database, and improve parallel computing capability. The initial database is pruned according to the existing pruning strategy to effectively reduce the number of candidate sets. Experiments show that the algorithm presented in this paper has significant advantages in mining high-utility probability sequences in large datasets. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
02181266
Volume :
31
Issue :
15
Database :
Complementary Index
Journal :
Journal of Circuits, Systems & Computers
Publication Type :
Academic Journal
Accession number :
159174783
Full Text :
https://doi.org/10.1142/S0218126622502619