Karp-Rabin algorithmSTRING MATCHING ALGORITHMSBrute Force algorithmContents
Next: Karp-Rabin algorithm Up: ESMAJ Prev: Brute Force algorithm

Research with an automaton


Main features
Description

Searching a word x with an automaton consists first in building the minimal Deterministic Finite Automaton (DFA)  A(x) recognizing the language Sigma*x.

The DFA  A(x) =(Q, q0, T, E) recognizing the language Sigma*x is defined as follows:
  Q is the set of all the prefixes of x: Q={epsilon, x[0], x[0 .. 1], ... , x[0 .. m-2], x};
  q0=epsilon;
  T={x};
  for q in Q (q is a prefix of x) and a in Sigma, (q, a, qa) is in E if and only if qa is also a prefix of x, otherwise (q, a, p) is in E such that p is the longest suffix of qa which is a prefix of x.

The DFA  A(x) can be constructed in O(m+sigma) time and O(msigma) space.

Once the DFA  A(x) is build, searching for a word x in a text y consists in parsing the text y with the DFA  A(x) beginning with the initial state q0. Each time the terminal state is encountered an occurrence of x is reported.

The searching phase can be performed in O(n) time if the automaton is stored in a direct access table, in O(nlog(sigma)) otherwise.

The C code
void preAut(char *x, int m, Graph aut) {
   int i, state, target, oldTarget;
 
   for (state = getInitial(aut), i = 0; i < m; ++i) {
      oldTarget = getTarget(aut, state, x[i]);
      target = newVertex(aut);
      setTarget(aut, state, x[i], target);
      copyVertex(aut, target, oldTarget);
      state = target;
   }
   setTerminal(aut, state);
}
 
 
void AUT(char *x, int m, char *y, int n) {
   int j, state;
   Graph aut;
 
   /* Preprocessing */
   aut = newAutomaton(m + 1, (m + 1)*ASIZE);
   preAut(x, m, aut);
 
   /* Searching */
   for (state = getInitial(aut), j = 0; j < n; ++j) {
      state = getTarget(aut, state, y[j]);
      if (isTerminal(aut, state))
         OUTPUT(j - m + 1);
   }
}

The example

Preprocessing phase

DFA Automaton
The states are labelled by the length of the prefix they are associated with.
Missing transitions are leading to the initial state 0.

Searching phase


References


Karp-Rabin algorithmSTRING MATCHING ALGORITHMSBrute Force algorithmContents
Next: Karp-Rabin algorithm Up: ESMAJ Prev: Brute Force algorithm

e-mails: {Christian.Charras, Thierry.Lecroq} @laposte.net
Tue Jan 14 15:03:31 MET 1997