Back to Search Start Over

NETASPNO: Approximate Strict Pattern Matching Under Nonoverlapping Condition

Authors :
Youxi Wu
Shasha Li
Jingyu Liu
Lei Guo
Xindong Wu
Source :
IEEE Access, Vol 6, Pp 24350-24361 (2018)
Publication Year :
2018
Publisher :
IEEE, 2018.

Abstract

In pattern matching, a gap constraint is a more flexible wildcard than traditional wildcards β€œ?”and β€œ*”. Pattern matching with gap constraints is more difficult to handle and fulfills user's enquiries more easily. Pattern matching with gap constraints has therefore been carried out in numerous research works, such as music information retrieval, searching protein sites, and sequence pattern mining. Strict pattern matching under a nonoverlapping condition, as a type of pattern matching with gap constraints, is a key issue of sequence pattern mining with gap constraints since it can be used to compute the frequency of a pattern. Exact matching limits the flexibility of the match to some extent since it requires each character to be matched exactly. We therefore address approximate strict pattern matching under the nonoverlapping constraints (ASPNO) and propose an effective algorithm, named NETtree for ASPNO (NETASPNO), which first transforms the problem into a Nettree data structure, an extensive tree structure. To find the nonoverlapping occurrences effectively, we propose the concept of number of roots paths with distance constraints (NRPDC) which indicates the number of path from a node to the roots with distanced and can be used to delete useless parent-child relationships and useless nodes. We iteratively recalculate the NRPDCs of each node on the subnettree with the rightmost root. Then we can get a path from the rightmost leaf to its rightmost root without using the backtracking strategy. NETASPNO therefore iteratively gets the rightmost root-leaf-path and prunes the path on the Nettree. Extensive experimental results demonstrate that NETASPNO has better performance than the other competitive algorithms.

Details

Language :
English
ISSN :
21693536
Volume :
6
Database :
Directory of Open Access Journals
Journal :
IEEE Access
Publication Type :
Academic Journal
Accession number :
edsdoj.f7870e5d90ad4c2fb3f84adee6f322c7
Document Type :
article
Full Text :
https://doi.org/10.1109/ACCESS.2018.2832209