Compact Directed Acyclic Word Graphs

Structures in Logic and Computer Science
Maxime.Crochemore@univ-mlv.fr
Université de Marne-la-Vallée, 1997
M. Crochemore and R. Vérin, On Compact Directed Acyclic Word Graphs, in (Structures in Logic and Computer Science, a selection of essays in honor of A. Ehrenfeucht, J. Mycielski, G. Rozenberg and A. Salomaa, eds., LNCS 1261, Springer-Verlag, 1997) pp 192--211.
In dvi, or ps format, 20 pages.

Abstract

The Directed Acyclic Word Graph (DAWG) is a space-efficient data structure to treat and analyze repetitions in a text, especially in DNA genomic sequences. Here, we consider the Compact Directed Acyclic Word Graph of a word. We give the first direct algorithm to construct it. It runs in time linear in the length of the string on a fixed alphabet. Our implementation requires half the memory space used by DAWGs.
Keywords: pattern matching algorithm, suffix automaton, DAWG, compact DAWG, suffix tree, index on text.

Contents

1. Introduction
2. Definitions
3. Compact Directed Acyclic Word Graphs
4. Direct Construction of CDAWG
5. Conclusion
References
 

Related reference

M. Crochemore and R. Vérin, Direct construction of Compact Directed Acyclic Word Graphs, in (CPM97, A. Apostolico and J. Hein, eds., LNCS 1264, Springer-Verlag, 1997) pp 116--129.
Institut Gaspard-Monge, Laboratoire d'informatique, le 6 octobre 1999, Maxime Crochemore