Back to Search
Start Over
Compact suffix automata representations for searching long patterns.
- Source :
-
Theoretical Computer Science . Jan2023:Part A, Vol. 940, p254-268. 15p. - Publication Year :
- 2023
-
Abstract
- Automata based algorithms have always dominated the history of string matching, thanks to their intrinsic nature of recognizers. Among them, those based on the suffix automaton deserve a special mention, because its use has always led to elegant and very efficient solutions in practice. In this paper, we present two novel string matching algorithms based on the suffix automaton which are particularly suitable for very long strings. The first algorithm simulates the suffix automaton through the well known bit-parallelism approach, while the second makes use of an alternative non-standard simulation. We mainly focus on the second solution, which turns out to be more flexible in practice, and present some extensions of it. Experimental results suggest that both our solutions are very efficient in practical cases scaling much better when the length of the pattern increases. Moreover, our second approach is also competitive with the most effective solutions available for the exact string matching problem. • We present two novel suffix automata based algorithms specifically designed for very long strings. • We describe a generic non-standard approach for simulating the suffix automaton of a string. • Our first solution is a bit-parallel based extension of the BNDM algorithm which can be encoded using few bits. • Our second solution is related to the occurrence of unique factors in the pattern. • Experimental results suggest that both our solutions are very efficient in practical cases. [ABSTRACT FROM AUTHOR]
- Subjects :
- *SUFFIXES & prefixes (Grammar)
*ROBOTS
*ALGORITHMS
Subjects
Details
- Language :
- English
- ISSN :
- 03043975
- Volume :
- 940
- Database :
- Academic Search Index
- Journal :
- Theoretical Computer Science
- Publication Type :
- Academic Journal
- Accession number :
- 160400930
- Full Text :
- https://doi.org/10.1016/j.tcs.2022.11.005