Vers le LIGM

Chloé Rispal Athénosy

Vers l'université Gustave Eiffel
Research

CURRENT RESEARCH

Collatz function

We are interested in the famous Collatz function defined for any positive integer n by f(n) = n/2 if n is even and f(n) = 3n+1 otherwise. This function, although very simple to define, is at the origin of the famous Collatz conjecture. This postulates that by applying the Collatz function a sufficient number of times from any positive integer, we reach the number 1.

This difficult conjecture, studied by many mathematicians for several decades, remains an open problem today. We do not seek to solve it here but we wanted to bring a new point of view on this function by using theoretical computer science tools to describe it: automata.

To define integer functions with automata, we first choose a base to encode integers into words. We use transducers, whose transitions are labeled by pairs of words and which accept representations in the chosen base of the pairs (n,f(n)). Of course, the Collatz function, which uses only basic arithmetic operations, has already been defined by transducers. The most natural approach is to use the reverse base 2. Indeed, the least significant digit on the left tells us immediately whether the integer is even or odd. If it is even, the division by 2 is done by erasing the 0 on the left and if it is odd, the transducer calculates the multiplication by 3 plus 1. However, when we compose the resulting transducers to represent f2 then f3 and so on, many cases arise and the transducers prove difficult to draw. No form of regularity allows to directly define, for any integer p, a transducer accepting fp.

Since for any odd integer n, the integer 3n+1 is even, we can consider the acceleration of the Collatz function f' defined for any positive integer n by f'(n)=n/2 if n is even and f'(n) = (3n+1)/2 otherwise. Our approach to realize f' is then to perform the division by 2 in direct base 3. The final state indicates whether the input corresponds to an even or odd integer. In the latter case, a terminal function allows to concatenate a digit to the right of the output to calculate the numerator 3n+1. This transducer, letter-by-letter and deterministic in input, can be composed in a standard way by performing a synchronization product. The difficulty being, of course, in the calculation of the terminal function. Thus, for any integer p, we are able to define a transducer realizing f'p from the transducer of the division by 2p in base 3 equipped with a terminal function.

For any integer p, we propose a way to draw the transducer of the division by 2p in base 3 in the form of a circle and in a regular way. We also give a recurrent definition of the terminal function that allows us to obtain an (infinite) transducer in the form of a cone, realizing the iteration of the composition f'*. It is composed of the (infinite) union of the transducers of the division by 2p in base 3 and transitions going from the level p+1 to the level p representing the terminal function. The existence of an integer p such that f'p(n) = 1 translates into the existence of a path leading from the initial vertex of level p to the vertex of level 0 labeled by the base-3 representation of the integer n in input and by 1 in output.

These results were presented in a conference (D. Caucal and C. Rispal, "On the powers of the Collatz function", Machines, Computations and Universality, To appear in Lecture Notes in Computer Science (2024)) where they received the award for the best paper of the conference and are currently being written up for a long version in a special issue of Acta Informatica.

Recognizability for automata

Recognizability provides a general approach to extract boolean algebras of languages families such as the Chomsky Hierarchy of languages, the hierarchy of indexed languages of Maslov or the family of languages accepted by Petri Nets.

Recognizability has been defined by Eilenberg in 1967: a subset of a monoïd is recognized by inverse morphism by a subset of a finite monoïd. For a given monoïd, the set of recognized sets forms a boolean algebra. This notion of recognizability has been extended for terms, finite graphs, traces, words indexed by linear orderings.

Here, recognizability is considered for automata, where an automaton is an (infinite) oriented graph labelled by letters and having initial and final sets of vertices. We define the recognizability by an automaton H according to a family F of automata as the set Rec(F,H) of languages accepted by automata of F that can be mapped by morphism into H.

If H is the automaton Loop having a unique vertex initial and final with a loop labelled by each letter of the alphabet, then any automaton can be mapped by morphism into Loop. So, the family Rec(F,Loop) is the set of languages accepted by the automata of F and is not a boolean algebra in general.

To get boolean algebras, we add a condition of structurality on morphisms. Considering a family R of binary relations, we define the family F(R) of automata such that any labelled transition is a relation of R. Then we introduce the structural recognizability according to F(R) by restricting to morphisms which are relations of R. This defines the sub-family sRec(F,H) of langages accepted by automata of F that can be structurally mapped by morphism into H.

Structural recognizability allows to define boolean algebras, not only for stack automata or stack of stack automata ,.., but also with several stacks. More precisely for any integers m and n, we define the set P(m,n) of suffix binary relations definable by m stacks of levels n. For any unambiguous automaton H of F(P(m,n)), the family sRec(F(P(m,n),H) is a boolean algebra according to the language L(H) accepted by H. This result remains true when we restrict to the family C(m,n) of suffix relations defined by m counters of levels n. In particular, C(m,1) corresponds to the vector addition systems of dimension n. We also get boolean algebras according to the family of context-sensitive languages.

Article: "Recognizability for automata" avec Didier Caucal, 22ième DLT (Developments in Language Theory), publication LNCS 11088, Springer, M. Hoshi, S. Seki (Eds.), 206-218 (2018).

Boolean algebras by length recognizability

We have just indicated that structural recognizability does not allow to obtain the Boolean algebra of visible algebraic languages. Another way to recognize Boolean algebras is to restrict ourselves to deterministic morphisms that preserve length. To do this, we only consider automata whose vertices are words. For F a family of automata and H an automaton of F, we define the family lRec(F,H) of languages accepted by the automata of F that apply to H by length-preserving morphism. For the family F of pushdown automata and for unambiguous H, lRec(F,H) is a Boolean algebra (according to the language L(H) accepted by H). On the other hand, we no longer have the closure under complementation for the family of stack of stacks automata for a deterministic automata H. Instead of restricting the families F to deterministic automata, we also ask the morphisms to be deterministic. Precisely, a morphism f applied to an automaton G is deterministic if we do not have two initial vertices of G with the same image by f, and if the goals of two arcs of G with the same source and the same label do not have the same image by f. We then consider the subfamily dlRec(F,H) of languages of automata G of F for which there exists a morphism from G to H that is both deterministic and preserves the length of the vertices.

We obtain that any family dlRec(F,H) is a Boolean algebra (according to L(H)) under two hypotheses. The first hypothesis is that the automaton H is unambiguous and length-deterministic: two initial vertices are not of the same length, and two goals of edges with the same source and label are of different length. The second hypothesis is that the family F is closed by two usual operations for finite automata. We have the product of synchronization by length giving the closure under intersection of dlRec(F,H). For the closure by complement: L(H) - L(G), we have a superposition operation by length of G on H. The Boolean algebras dlRec(F,H) are then obtained for the family F of pushdown automata which is closed under synchronization by length, and superposition by length. We have in particular the family of visibly context-free languages. The same is true for the family F of synchronized automata of bounded length difference; these automata recognize context-sensitive languages.

Article: "Boolean Algebras by Length Recognizability" with .D. Caucal, Models, Mindsets, Meta, 169-185, (2018).