1. Compact and fast algorithms for safe regular expression search
- Author
-
Fabien Coulon, Jean-Marc Champarnaud, and Thomas Paranthoën
- Subjects
Quadratic growth ,Parsing ,Theoretical computer science ,Powerset construction ,Applied Mathematics ,computer.software_genre ,Computer Science Applications ,Automaton ,Computational Theory and Mathematics ,Bounded function ,Regular expression ,Pattern matching ,Nondeterministic finite automaton ,computer ,Algorithm ,Computer Science::Formal Languages and Automata Theory ,Mathematics - Abstract
This article describes an improvement of the brute force determinization algorithm in the case of homogeneous nondeterministic finite automata (NFAs), as well as its application to pattern matching. Brute force determinization with limited memory may provide a partially determinized automaton, but its bounded complexity makes it a safe procedure contrary to the classical subset construction. Actually, our algorithm is inspired by both recent results of Champarnaud concerning the subset automaton of a homogeneous NFA and the algorithm recently designed by Navarro and Raffinot to implement the brute force determinization of the Glushkov NFA of a regular pattern. Our algorithm significantly improves Navarro–Raffinot's one since it has an average exponentially smaller memory requirement for a given level of determinization, which, considering a bounded memory, implies a quadratically smaller parsing time. This algorithm has been implemented in CCP software (http://www.univ-rouen.fr/LIFAR/aia/ccp.html). Tests ...
- Published
- 2004