Quick Search algorithmESMAJReverse Colussi algorithmContents
Next: Quick Search algorithm Up: ESMAJ Prev: Reverse Colussi algorithm

Horspool algorithm


Main features
Description

The bad-character shift used in the Boyer-Moore algorithm (see chapter Boyer-Moore algorithm) is not very efficient for small alphabets, but when the alphabet is large compared with the length of the pattern, as it is often the case with the ASCII table and ordinary searches made under a text editor, it becomes very useful.
Using it alone produces a very efficient algorithm in practice. Horspool proposed to use only the bad-character shift of the rightmost character of the window to compute the shifts in the Boyer-Moore algorithm.

The preprocessing phase is in O(m+sigma) time and O(sigma) space complexity.

The searching phase has a quadratic worst case but it can be proved that the average number of comparisons for one text character is between 1/sigma and 2/(sigma+1).

The C code

The function preBmBc is given chapter Boyer-Moore algorithm.

void HORSPOOL(char *x, int m, char *y, int n) {
   int j, bmBc[ASIZE];
   char c;

   /* Preprocessing */
   preBmBc(x, m, bmBc);

   /* Searching */
   j = 0;
   while (j <= n - m) {
      c = y[j + m - 1];
      if (x[m - 1] == c && memcmp(x, y + j, m - 1) == 0)
         OUTPUT(j);
      j += bmBc[c];
   }
}

The example

Preprocessing phase

Horspool algorithm bmBc table
bmBc table used by Horspool algorithm

Searching phase


References


Quick Search algorithmESMAJReverse Colussi algorithmContents
Next: Quick Search algorithm Up: ESMAJ Prev: Reverse Colussi algorithm

e-mails: {Christian.Charras, Thierry.Lecroq }@laposte.net
Tue Jan 14 15:03:31 MET 1997