A factor u of a word w is (right) univalent if there exists a unique
letter a such that ua is still a factor of w. A univalent factor is
minimal if none of its proper suffixes is univalent. The starting block of
a non-empty word w is the shortest univalent prefix U of w such that every
proper prefix of w longer than U is also univalent. We study univalent factors of
a word and their relationship with the well-known notions of boxes,
superboxes, and minimal forbidden factors.
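
As an illustration of the definitions above, the following minimal Python
sketch computes them by brute force on a short word. The function names
(is_univalent, minimal_univalent_factors, starting_block) and the sample word
"abaab" are hypothetical and chosen only for this example. The sketch assumes
that the empty word counts as a factor, and it is not meant to reflect the
techniques developed in the paper.

```python
def factors(w):
    """All factors (contiguous substrings) of w, including the empty word."""
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}


def right_extensions(u, w):
    """Letters a such that u + a is still a factor of w."""
    return {a for a in set(w) if (u + a) in w}


def is_univalent(u, w):
    """u is (right) univalent in w if exactly one letter extends it on the right."""
    return u in w and len(right_extensions(u, w)) == 1


def minimal_univalent_factors(w):
    """Univalent factors of w none of whose proper suffixes is univalent."""
    return {
        u
        for u in factors(w)
        if is_univalent(u, w)
        # proper suffixes of u are u[1:], u[2:], ..., down to the empty word
        and not any(is_univalent(u[k:], w) for k in range(1, len(u) + 1))
    }


def starting_block(w):
    """Shortest univalent prefix U of w such that every longer proper prefix
    of w is univalent as well."""
    if not w:
        raise ValueError("the starting block is defined for non-empty words only")
    # The longest proper prefix w[:-1] is always univalent: its only right
    # extension inside w is the last letter of w.
    k = len(w) - 1
    # Shorten the prefix for as long as the next shorter prefix is univalent too.
    while k > 0 and is_univalent(w[: k - 1], w):
        k -= 1
    return w[:k]


w = "abaab"
minimal = minimal_univalent_factors(w)
# Each minimal univalent factor u together with its unique right extension ua.
extensions = {u + a for u in minimal for a in right_extensions(u, w)}
print(sorted(minimal), sorted(extensions), starting_block(w))
```

For the word abaab, the minimal univalent factors are b, ba, and aa, their
extensions are ba, baa, and aab, and the starting block is ab (the prefix a is
not univalent because both aa and ab are factors).
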
Moreover, we prove some new uniqueness conditions for words based on
univalent factors. In particular, we show that a word is uniquely
determined by its starting block, the set of the extensions of its minimal
univalent factors, and its length or its terminal box. Finally, we show
how the results and techniques presented can be used to solve the problem
of sequence assembly for DNA molecules, under reasonable assumptions on the
repetitive structure of the considered molecule and on the set of known
fragments.
Key words: Univalent factors, boxes, sequence assembly.