Unitex/GramLab is an open source, cross-platform, multilingual,
lexicon- and grammar-based corpus processing suite

Core NLP Engine

The automata-oriented technology of the Unitex/GramLab Natural Language Processing engine handles electronic resources, such as electronic dictionaries and grammars, and applies them to text for fast processing and analysis.

Language Resources

The language resources are the electronic dictionaries and grammars that power Unitex analysis of textual data. Resources for more than 22 languages are currently distributed out of the box with Unitex/GramLab.

Visual IDE

The Visual Integrated Development Environment of Unitex/GramLab lets users easily design language resources and apply them to text files. In addition, a project-oriented perspective makes it possible to run a project with a single click.

Open Source

Unitex/GramLab is freely distributed under the terms of the Lesser General Public License (LGPL). This means that everyone can redistribute Unitex freely within the terms of the LGPL. It also means that you have access to the source code of all the Unitex programs, which is included in the zip file you download. The LGPL is more permissive than the GPL because it allows you to reuse Unitex/GramLab's own code in non-free software.

Cross-platform

The Unitex/GramLab Core NLP Engine is written in C++, and the Visual IDE is written in Java. You can therefore develop Unitex-based applications on any system that supports Java 1.7, compile them with any standard-compliant C++ compiler, and run them on your favorite platform: Windows, Linux, macOS, and several others.

Multilingual

Unitex/GramLab conforms to the Unicode 3.0 standard, which lets users handle virtually all the characters of all languages, including Asian languages. The Unitex programs have been designed to work with all writing conventions, so Asian languages pose no difficulty despite their particular spacing conventions.

Lexicon-based

Unitex/GramLab works with electronic dictionaries built by the members of the RELEX network, an international network of laboratories specialized in Computational Linguistics that was created by Maurice Gross and his LADL team. Members of the RELEX network have built, and are still building, exhaustive dictionaries, many of which are among the LGPLLR-licensed resources distributed with Unitex/GramLab.

Grammar-based

Local grammars are a powerful formalism for describing syntactic or semantic rules. They consist of finite-state automata coupled with electronic dictionaries to perform automatic analysis of textual data. Unitex/GramLab features a rich visual IDE that lets users easily design, test, debug, maintain, and apply local grammars to a text.
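
To make the idea of an automaton coupled with a dictionary more concrete, here is a minimal Python sketch. It is not the Unitex/GramLab API or its grammar file format: the `LEXICON` entries, the angle-bracket label convention, and every function name are assumptions made for illustration only. The automaton recognizes a determiner, any number of adjectives, then a noun, and each category test is resolved by a dictionary lookup.

```python
from typing import Dict, List, Set, Tuple

# Toy dictionary (hypothetical entries): word form -> part-of-speech codes.
LEXICON: Dict[str, Set[str]] = {
    "the": {"DET"}, "a": {"DET"},
    "big": {"A"}, "red": {"A"},
    "cat": {"N"}, "house": {"N"},
    "sleeps": {"V"},
}

# Automaton for DET A* N: state -> list of (label, next_state).
# In this sketch, a label in angle brackets is checked against the lexicon;
# any other label must equal the token literally.
TRANSITIONS: Dict[int, List[Tuple[str, int]]] = {
    0: [("<DET>", 1)],
    1: [("<A>", 1), ("<N>", 2)],
}
FINAL_STATES = {2}


def label_matches(label: str, token: str) -> bool:
    """Check one transition label against one token."""
    if label.startswith("<") and label.endswith(">"):
        return label[1:-1] in LEXICON.get(token.lower(), set())
    return label == token


def longest_match(tokens: List[str], start: int) -> int:
    """Return the end index (exclusive) of the longest match from start, or -1."""
    best = -1
    stack = [(0, start)]  # (automaton state, token position)
    while stack:
        state, pos = stack.pop()
        if state in FINAL_STATES:
            best = max(best, pos)
        if pos == len(tokens):
            continue
        for label, nxt in TRANSITIONS.get(state, []):
            if label_matches(label, tokens[pos]):
                stack.append((nxt, pos + 1))
    return best


def find_matches(text: str) -> List[str]:
    """Scan the text left to right and collect non-overlapping longest matches."""
    tokens = text.split()
    results, i = [], 0
    while i < len(tokens):
        end = longest_match(tokens, i)
        if end > i:
            results.append(" ".join(tokens[i:end]))
            i = end
        else:
            i += 1
    return results
```

Calling `find_matches("the big red cat sleeps in a house")` returns the two noun phrases found by the toy grammar. Whitespace tokenization and the longest-match policy are simplifications chosen for this sketch.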

Corpus Processing Suite

  • Build, check and apply electronic dictionaries
  • Apply lexicon-grammar tables
  • Align texts
  • Handle ambiguity via the text automaton
  • Build an automaton from a certified corpus
  • Match patterns with regular expressions and recursive transition networks
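
The last item mentions recursive transition networks. As a rough illustration of what distinguishes them from plain finite-state automata, the Python sketch below lets a transition call another named graph, including recursively. The `GRAPHS` layout, the leading-colon convention for subgraph calls, and the `match_ends` function are assumptions of this sketch, not the Unitex/GramLab implementation.

```python
from typing import Dict, List, Set

# Two hypothetical graphs that call each other:
#   NP -> "the" NOUN [PP]      PP -> "of" NP
# Each graph maps a state to (label, next_state) edges; state 0 is the start.
# In this sketch, a label starting with ":" calls the named subgraph.
GRAPHS: Dict[str, dict] = {
    "NP": {
        "final": {2, 3},
        "edges": {
            0: [("the", 1)],
            1: [("roof", 2), ("house", 2), ("owner", 2)],
            2: [(":PP", 3)],
        },
    },
    "PP": {
        "final": {2},
        "edges": {
            0: [("of", 1)],
            1: [(":NP", 2)],
        },
    },
}


def match_ends(graph_name: str, tokens: List[str], pos: int, state: int = 0) -> Set[int]:
    """Return every token position at which graph_name can finish,
    starting from the given state at position pos."""
    graph = GRAPHS[graph_name]
    ends: Set[int] = set()
    if state in graph["final"]:
        ends.add(pos)
    for label, nxt in graph["edges"].get(state, []):
        if label.startswith(":"):
            # Subgraph call: run the callee, then resume in the caller.
            for sub_end in match_ends(label[1:], tokens, pos):
                ends |= match_ends(graph_name, tokens, sub_end, nxt)
        elif pos < len(tokens) and tokens[pos] == label:
            ends |= match_ends(graph_name, tokens, pos + 1, nxt)
    return ends
```

For the tokens of "the roof of the house of the owner", `match_ends("NP", tokens, 0)` reports an end position for each level of nesting. This naive recursion would not terminate on a left-recursive graph (one that calls itself before consuming a token); the example avoids that case.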

Download Unitex/GramLab


  • 3.3
  • 4.0alpha