Last Occurrence Table

\( \def\sa#1{\tt{#1}} \def\dd{\dot{}\dot{}} \def\dernocc{\textit{last-occ}} \)

The last occurrence table $$\dernocc : A \rightarrow \{1,2,\ldots,m \}$$ is defined as follows: $$\dernocc[a] = \min ( \{ m-1-k \mid 0 \le k \le m-2 \mbox{ et } x[k] = a \} \cup \{m\})$$ for $a \in A$

It can be computed in time $O(m + |A|)$ by the following algorithm SMA.

LastOccurrence$(x \textrm{ non-empty word of length }m)$
    \begin{algorithmic}
   \FOR{each letter $a \in A$}
      \STATE{$last\text{-}occ[a]\leftarrow m$}
   \ENDFOR
   \FOR{$i\leftarrow 0$ \TO $m-2$}
      \STATE{$last\text{-}occ[x[i]]\leftarrow m-1-i$}
   \ENDFOR
   \RETURN{$last\text{-}occ$}
    \end{algorithmic}

The last occurrence table is used in the Boyer-Moore exact string matching. It is also called the bad character shift.

References

  • T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms, 3rd Edition. MIT Press, 2009.
  • M. Crochemore, C. Hancart, and T. Lecroq. Algorithms on Strings. Cambridge University Press, 2007. 392 pages.