String Matching Automaton

\( \def\sa#1{\tt{#1}} \def\dd{\dot{}\dot{}} \def\aSMot{\mathcal{SM}} \)

Given a word $x$ on an alphabet $A^*$ the String Matching Automaton ${\cal D}({x})$ recognizes the language $A^*\{a\}x$. It is defined as follows.

${\cal D}({x}) = (A,Q,q_0,T,F)$ where:

Transitions in the first set are called forward transitions while those in the second set are called backward transitions. Backward transitions $(u,a,v)$ such that $v \ne \varepsilon$ are called significant backward transitions.

The string matching automaton can be constructed in time $O(m \times |A|)$ by the algorithm SMA.

SMA$(x \textrm{ non-empty word of length }m)$
    \begin{algorithmic}
   \STATE{$q_0\leftarrow \mbox{NewState}$}
   \STATE{$Q\leftarrow \{q_0\}$}
   \FOR{each letter $b \in A$}
      \STATE{$F\leftarrow F\cup \{ (q_0,b,q_0) \}$}
   \ENDFOR
   \STATE{$t\leftarrow q_0$}
   \FOR{$i\leftarrow 0$ \TO $m-1$}
      \STATE{$p\leftarrow \mbox{NewState}$}
      \STATE{$Q\leftarrow Q\cup\{p\}$}
      \STATE{$r\leftarrow \mbox{Target}(t,x[i])$}
      \STATE{$F\leftarrow F \setminus \{ (t,x[i],r) \}$}
      \STATE{$F\leftarrow F \cup\{ (t,x[i],p) \}$}
      \FOR{each $(r,a,q) \in F$}
         \STATE{$F\leftarrow F \cup \{ (p,a,q) \}$}
      \ENDFOR
      \STATE{$t\leftarrow p$}
   \ENDFOR
   \STATE{$T\leftarrow \{p\}$}
   \RETURN{$(A,Q,q_0,T,F)$}
    \end{algorithmic}
where algorithm Target can be implemented as follows.
Target$(q \textrm{ a state and } a \mbox{ a letter})$
    \begin{algorithmic}
   \IF{$\mbox{there exists } (q,a,p) \in F$}
      \RETURN{$p$}
   \ELSE
      \RETURN{NIL}
   \ENDIF
    \end{algorithmic}

References

  • A. V. Aho, J. E. Hopcroft, and J. D. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, 1974.
  • T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms, 3rd Edition. MIT Press, 2009.
  • M. Crochemore, T. Lecroq, and W. Rytter. 125 Problems in Text Algorithms. Cambridge University Press, 2021.