String Matching Automaton
\(
\def\sa#1{\tt{#1}}
\def\dd{\dot{}\dot{}}
\def\aSMot{\mathcal{SM}}
\)
Given a word $x$ on an alphabet $A^*$
the String Matching Automaton
${\cal D}({x})$
recognizes the language $A^*\{a\}x$.
It is defined as follows.
${\cal D}({x}) = (A,Q,q_0,T,F)$
where:
-
$A$ is the alphabet;
-
$Q = Pref(x)$ is the state of states;
-
$q_0 = \varepsilon$ is the inital state;
-
$T = \{x\}$ is the set of terminal states;
-
$F = \{ (u,a,ua) \mid u \in Q, a \in A, ua \in Q, ua \mbox{ is a prefix of } x \} \cup$
$\{ (u,a,Border(ua)) \mid u \in Q, a \in A, ua \mbox{ is not a prefix of } x \}$
is the set of transitions.
Transitions in the first set are called
forward transitions
while those in the second set are called
backward transitions.
Backward transitions $(u,a,v)$ such that $v \ne \varepsilon$
are called
significant backward transitions.
The string matching automaton can be constructed in time $O(m \times |A|)$ by the algorithm SMA.
SMA$(x \textrm{ non-empty word of length }m)$
\begin{algorithmic}
\STATE{$q_0\leftarrow \mbox{NewState}$}
\STATE{$Q\leftarrow \{q_0\}$}
\FOR{each letter $b \in A$}
\STATE{$F\leftarrow F\cup \{ (q_0,b,q_0) \}$}
\ENDFOR
\STATE{$t\leftarrow q_0$}
\FOR{$i\leftarrow 0$ \TO $m-1$}
\STATE{$p\leftarrow \mbox{NewState}$}
\STATE{$Q\leftarrow Q\cup\{p\}$}
\STATE{$r\leftarrow \mbox{Target}(t,x[i])$}
\STATE{$F\leftarrow F \setminus \{ (t,x[i],r) \}$}
\STATE{$F\leftarrow F \cup\{ (t,x[i],p) \}$}
\FOR{each $(r,a,q) \in F$}
\STATE{$F\leftarrow F \cup \{ (p,a,q) \}$}
\ENDFOR
\STATE{$t\leftarrow p$}
\ENDFOR
\STATE{$T\leftarrow \{p\}$}
\RETURN{$(A,Q,q_0,T,F)$}
\end{algorithmic}
where algorithm Target can be implemented as follows.
Target$(q \textrm{ a state and } a \mbox{ a letter})$
\begin{algorithmic}
\IF{$\mbox{there exists } (q,a,p) \in F$}
\RETURN{$p$}
\ELSE
\RETURN{NIL}
\ENDIF
\end{algorithmic}
References
A. V. Aho, J. E. Hopcroft, and J. D. Ullman. The Design and Analysis
of Computer Algorithms. Addison-Wesley, 1974.
T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction
to Algorithms, 3rd Edition. MIT Press, 2009.
M. Crochemore, T. Lecroq, and W. Rytter. 125 Problems in Text Algorithms.
Cambridge University Press, 2021.