Directed Acyclic Subsequence Graph

Maxime.Crochemore@univ-mlv.fr
University of Marne-la-Vallée, June 1999
M. Crochemore and Z. Tronicek, Directed Acyclic Subsequence Graph for multiple texts, Technical Report, IGM-99-??, Institut Gaspard-Monge, 1999, 19 pages.
In ps format, 19pp.

Abstract

The subsequence matching problem is to decide, for given strings S and T, whether S is a subsequence of T. The string S is called pattern and the string T text. We consider the case of multiple texts and show how to solve the subsequence matching problem in time linear in the length of pattern. For this purpose we build the automaton that accepts all subsequences of given texts. The automaton is called Directed Acyclic Subsequence Graph (DASG) and we present an algorithm for its building. We also prove upper bound for its number of states. Further, we consider modification of the subsequence matching problem: given a string S and a finite language L, we are to decide whether S is a subsequence of any string in L. We suppose that a finite automaton accepting L is given and present an algorithm for building the DASG for the language L. We also mention applications of DASG to some problems related to subsequences.
Keywords: finite automaton, pattern matching, subsequence, acyclic graph.

Contents

  1. Introduction
  2. DASG for one text
  3. DASG for a set of texts
  4. DASG for a finite language
 

Related reference

R. A. Baeza-Yates. Searching subsequences. Theor. Comput. Sci., 78(2):363--376, 1991.
Institut Gaspard-Monge, Laboratoire d'informatique, le 6 octobre 1999, Maxime Crochemore