Back to Search Start Over

Compact suffix automata representations for searching long patterns.

Authors :
Faro, Simone
Scafiti, Stefano
Source :
Theoretical Computer Science. Jan2023:Part A, Vol. 940, p254-268. 15p.
Publication Year :
2023

Abstract

Automata based algorithms have always dominated the history of string matching, thanks to their intrinsic nature of recognizers. Among them, those based on the suffix automaton deserve a special mention, because its use has always led to elegant and very efficient solutions in practice. In this paper, we present two novel string matching algorithms based on the suffix automaton which are particularly suitable for very long strings. The first algorithm simulates the suffix automaton through the well known bit-parallelism approach, while the second makes use of an alternative non-standard simulation. We mainly focus on the second solution, which turns out to be more flexible in practice, and present some extensions of it. Experimental results suggest that both our solutions are very efficient in practical cases scaling much better when the length of the pattern increases. Moreover, our second approach is also competitive with the most effective solutions available for the exact string matching problem. • We present two novel suffix automata based algorithms specifically designed for very long strings. • We describe a generic non-standard approach for simulating the suffix automaton of a string. • Our first solution is a bit-parallel based extension of the BNDM algorithm which can be encoded using few bits. • Our second solution is related to the occurrence of unique factors in the pattern. • Experimental results suggest that both our solutions are very efficient in practical cases. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
03043975
Volume :
940
Database :
Academic Search Index
Journal :
Theoretical Computer Science
Publication Type :
Academic Journal
Accession number :
160400930
Full Text :
https://doi.org/10.1016/j.tcs.2022.11.005