HUSP-SP: Faster Utility Mining on Sequence Data.

Authors :
CHUNKAI ZHANG
YUTING YANG
ZILIN DU
WENSHENG GAN
PHILIP S. YU
Source :
ACM Transactions on Knowledge Discovery from Data; Jan 2024, Vol. 18 Issue 1, p1-21, 21p
Publication Year :
2024

Abstract

High-utility sequential pattern mining (HUSPM) has emerged as an important topic due to its wide application and considerable popularity. However, due to the combinatorial explosion of the search space when the HUSPM problem encounters a low-utility threshold or large-scale data, it may be time-consuming and memory-costly to address the HUSPM problem. Several algorithms have been proposed for addressing this problem, but they still cost a lot in terms of running time and memory usage. In this article, to further solve this problem efficiently, we design a compact structure called sequence projection (seqPro) and propose an efficient algorithm, namely, discovering high-utility sequential patterns with the seqPro structure (HUSP-SP). HUSP-SP utilizes the compact seq-array to store the necessary information in a sequence database. The seqPro structure is designed to efficiently calculate candidate patterns' utilities and upper-bound values. Furthermore, a new upper bound on utility, namely, tighter reduced sequence utility, and two pruning strategies in search space are utilized to improve the mining performance of HUSP-SP. Experimental results on both synthetic and real-life datasets show that HUSP-SP can significantly outperform the state-of-the-art algorithms in terms of running time, memory usage, search space pruning efficiency, and scalability. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
1556-4681
Volume :
18
Issue :
1
Database :
Complementary Index
Journal :
ACM Transactions on Knowledge Discovery from Data
Publication Type :
Academic Journal
Accession number :
173194012
Full Text :
https://doi.org/10.1145/3597935