Eric Laporte's Publications
1986-1989
1990-1994
1995-1999
2000-2004
2005-
Abstracts
Danlos, Laurence, Françoise Emerard, Eric Laporte, 1986. "Synthesis of Spoken Messages from Semantic Representations (Semantic-Representation-to-Speech System)", Proceedings of Coling 1986, Bonn, pp. 599-604. doi:10.3115/991365.991540
Abstract. A semantic-representation-to-speech system communicates orally the information given in a semantic representation. Such a system must integrate a text generation module, a phonetic conversion module, a prosodic module and a speech synthesizer. We will see how the syntactic information elaborated by the text generation module is used for both phonetic conversion and prosody, so as to produce the data that must be supplied to the speech synthesizer, namely a phonetic chain including prosodic information.
Keywords: natural language, phonetics, text generation, speech synthesis.
1987. "Prise en compte des variations phonétiques en reconnaissance de la parole" ["Taking phonetic variation into account in speech recognition"], Actes des 16es Journées d'étude sur la parole, Société française d'acoustique, Hammamet, pp. 153-156.
Abstract. This paper deals with ways of taking phonetic variations into account in speech recognition systems. Several recognition methods are considered. Particular emphasis is placed on pattern-matching recognition systems in which the decision unit is the fraction of speech between two adjacent syllabic centres. The phonetic data involved in this method include a list of references, which should contain variants. Such a method underlines the applicative interest of describing variants precisely and systematically. As an example of such a description, some phonetic alternations related to hiatuses in French are studied in detail.
Keywords: natural language, phonetics, phonology, speech recognition.
1994. "Experiments in Lexical Disambiguation Using Local Grammars", Papers in Computational Lexicography, COMPLEX '94, Ferenc Kiefer, Gabor Kiss and Julia Pajzs eds., Budapest: Linguistics Institute of the Hungarian Academy of Sciences, pp. 163-172.
Abstract. Lexical disambiguation is one of the major challenges facing those who devise automatic word tagging systems for processing written text. Grammatical disambiguation algorithms reduce the number of possible tags. We will consider here a framework where a large grammatical lexicon is looked up to associate every token in the text, either a simple or a compound word, with the set of all grammatical tags a priori possible for it. (Such a framework for French is now integrated into the INTEX system.) This problem was investigated by M. Silberztein (1989) and E. Roche (1992). We provide formal descriptions of both algorithms. They share a striking common background and purpose. However, they show real formal and computational differences. From a formal point of view, we compare the formal power of the algorithms. From a practical point of view, we examine whether the algorithms are better adapted to particular types of grammatical disambiguation.
Keywords: natural language, lexical analysis, lexical ambiguity, finite-state
automata.
1996. "Context-free parsing with finite-state transducers", in Proceedings of the 3rd South American Workshop on String Processing, N. Ziviani et al. (eds.), International Informatics Series 4, Montréal: McGill-Queen's University Press; Ottawa: Carleton University Press, pp. 171-182.
Abstract. This article is a study of an algorithm designed and implemented by Roche for parsing natural language sentences according to a context-free grammar. This algorithm is based on the construction and use of a finite-state transducer. Roche successfully applied it to a context-free grammar with very numerous rules. In contrast, the complexity of parsing according to context-free grammars is usually considered in practice as a function of one parameter, the length of the input sequence; the size of the grammar is generally taken to be a constant of reasonable value. In this article, we first explain why a context-free grammar with correct lexical and grammatical coverage is bound to have a very large number of rules, and we review work related to this problem. Then we exemplify the principle of Roche's algorithm on a small grammar. We provide formal definitions of the construction of the parser and of the operation of the algorithm, and we prove that the parser can be built for a large class of context-free grammars and that it outputs the set of parsing trees of the input sequence.
Keywords: natural language, parsing, finite-state automata, context-free
grammars.
Eric Laporte, Anne Monceaux, 1997. "Grammatical disambiguation of French words using part of speech, inflectional features and lemma of words in the context", GRAMLEX report no. 3D2, 11 p.
Abstract. We describe ELAG (Elimination of lexical ambiguity with grammars), a new system of lexical disambiguation using grammatical information about words in the context. The disambiguation takes place after a lexical analysis of input text, but before syntactic parsing. The linguistic data of the disambiguator are organised in separate, compact, readable modules, which we call disambiguation grammars. The respective effects of several disambiguation grammars on an input text are independent of each other. This feature of the disambiguation is mathematically guaranteed by the formula used to apply grammars to sentences. The effects of disambiguation grammars are cumulative: if one writes new grammars and uses them with existing ones, the effect of the existing grammars is not modified. Different grammars can apply to the same sequence, to overlapping sequences, or to sequences included in other sequences. The order of application of grammars is indifferent. The effects of a grammar on various analyses of a sentence are independent. ELAG is INTEX-compatible.
Keywords: natural language, lexical ambiguity, finite-state automata.
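The abstract's claim that the order of application of grammars is indifferent can be illustrated with a toy model (hypothetical encoding and tag names, not ELAG's actual formalism): a sentence is the set of its candidate taggings, and each grammar discards the taggings containing a forbidden tag bigram. Since each grammar acts as a pure filter, applications commute.

```python
# Toy model of order-independent disambiguation (assumed encoding and tag
# names; not ELAG's actual formalism).
from itertools import product

def taggings(sentence):
    """All candidate taggings: one tag per token."""
    return [list(p) for p in product(*sentence)]

def apply_grammar(candidates, forbidden_bigrams):
    """Keep the taggings that contain none of the forbidden tag bigrams."""
    return [t for t in candidates
            if not any((a, b) in forbidden_bigrams for a, b in zip(t, t[1:]))]

# 'la porte': 'la' is a determiner or a pronoun, 'porte' a noun or a verb.
sentence = [["DET", "PRO"], ["NOUN", "VERB"]]
g1 = {("DET", "VERB")}   # rule out determiner followed by verb
g2 = {("PRO", "NOUN")}   # rule out pronoun followed by noun

t = apply_grammar(apply_grammar(taggings(sentence), g1), g2)
u = apply_grammar(apply_grammar(taggings(sentence), g2), g1)
assert t == u == [["DET", "NOUN"], ["PRO", "VERB"]]
```

Because each grammar only removes taggings, the surviving set is the same whichever grammar is applied first, which is the independence property the abstract guarantees.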
1997. "Rational Transductions for Phonetic Conversion and Phonology", in E. Roche and Y. Schabes eds., Finite-State Language Processing, chap. 14, Language, Speech and Communication series, Cambridge: MIT Press, pp. 407-429.
Abstract. Phonetic conversion, and other conversion problems related to phonetics, can be performed by finite-state tools. This chapter presents a finite-state conversion system, BiPho, based on transducers and bimachines. The linguistic data used by this system are described in a readable format and actual computation is efficient. The system is applied to spelling-to-phonetics conversion for French.
Keywords: natural language, phonetics, finite-state automata.
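The transducer-cascade idea can be sketched with toy rewrite rules (the rules and phonetic symbols below are illustrative assumptions, not BiPho's actual data): each ordered left-to-right rewrite rule corresponds to a rational transduction, and the whole cascade is their composition, which is again a rational transduction.

```python
# Toy French spelling-to-phonetics cascade (deliberately incomplete;
# rules and phonetic symbols are invented for illustration).
import re

RULES = [
    (r"eau", "o"),
    (r"ou", "u"),
    (r"ch", "S"),
    (r"qu", "k"),
    (r"in(?![aeiouy])", "E~"),  # nasal vowel unless a vowel follows
]

def to_phonetics(word):
    # Apply each rule everywhere in the string, in order: the composition
    # of the rules, not any single rule, defines the conversion.
    for pattern, replacement in RULES:
        word = re.sub(pattern, replacement, word)
    return word

assert to_phonetics("bateau") == "bato"
assert to_phonetics("chou") == "Su"
assert to_phonetics("matin") == "matE~"
```

Real systems need many more rules, exceptions and context conditions, which is why BiPho compiles its data into transducers and bimachines rather than applying regexes naively.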
1995. "Appropriate nouns with obligatory modifiers", Language Research 31(2), Seoul National University, ISSN 0254-4474, pp. 251-289. Presented at the 4th Korean-French Conference on Grammar and the Lexicon, Seoul National University, 1994. French version in Langages 126.
Abstract. The notion of appropriate sequence as introduced by Z. Harris provides a powerful syntactic way of analysing the detailed meaning of various sentences, including ambiguous ones. In an adjectival sentence like The leather was yellow, the introduction of an appropriate noun, here colour, specifies which quality the adjective describes. In some other adjectival sentences with an appropriate noun, that noun plays the same part as colour and seems to be relevant to the description of the adjective. These appropriate nouns can usually be used in elementary sentences like The leather had some colour, but in many cases they have a more or less obligatory modifier. For example, you can hardly mention that an object has some colour without qualifying that colour at all. About 300 French nouns are appropriate in at least one adjectival sentence and have an obligatory modifier. They enter in a number of sentence structures related by several syntactic transformations. The appropriateness of the noun and the fact that the modifier is obligatory are reflected in these transformations. The description of these syntactic phenomena provides a basis for a classification of these nouns. It also concerns the lexical properties of thousands of predicative adjectives, and in particular the relations between the sentence without the noun: The leather was yellow and the adjectival sentence with the noun: The colour of the leather was yellow.
Keywords: lexicon-grammar, syntax, lexicology.
1997. "Les Mots. Un demi-siècle de traitements" ["Words: half a century of processing"], Traitement automatique des langues (t.a.l.) 38(2), État de l'art, Paris: ATALA, pp. 47-68.
Abstract. We survey those domains of natural language processing where the notion of word can be considered as the fundamental unit. We examine the results aimed at, the results achieved, the data acquired and the methods used in these domains. Our ambition is that this critical evaluation will help orient research and development efforts towards practical results.
Keywords: natural language.
1998. "Lexical disambiguation with fine-grained tagsets", in J. Ginzburg et al. eds., The Tbilisi Symposium in Logic, Language and Computation: Selected Papers (19-22 October 1995, Gudauri, Georgia), Studies in Logic, Language and Information, Cambridge: Cambridge University Press; Stanford: CSLI and FoLLI, pp. 203-210.
Abstract. We describe the mathematical models underlying two constraint-based, finite-state methods for lexical disambiguation with fine-grained tagsets. They are more powerful variants of the methods described by Roche 1992 and Silberztein 1993. Both have the full theoretical expressive power of finite-state devices.
Keywords: natural language, lexical ambiguity, finite-state automata.
Strahil Ristov, Éric Laporte, 1999. "Ziv Lempel Compression of Huge Natural Language Data Tries Using Suffix Arrays", in Combinatorial Pattern Matching, 10th Annual Symposium (Warwick University, UK, July 1999), Proceedings, LNCS 1645, M. Crochemore and M. Paterson eds., Berlin: Springer, pp. 196-211.
Abstract. We present a data structure that is very efficient, in terms of space and access speed, for storing huge natural language data sets. The structure is a Ziv Lempel compressed linked-list trie and goes a step beyond the directed acyclic word graph in automata compression. We use the structure to store DELAF, a huge French lexicon with syntactic, grammatical and lexical information associated with each word. The compressed structure can be produced in O(N) time using suffix trees to find repetitions in the trie. For large data sets, space requirements are more prohibitive than time, so suffix arrays are used instead, with compression time complexity O(N log N) for all but the largest data sets.
Keywords: natural language, data compression.
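The automata-compression baseline mentioned in the abstract, merging identical subtrees of a trie into a directed acyclic word graph, can be sketched as follows (toy word list; the paper's Ziv-Lempel factoring of the linked-list representation compresses further by also sharing repeated internal fragments):

```python
# Sketch of the DAWG baseline: build a trie, then merge structurally
# identical subtrees bottom-up so shared suffixes are stored once.

def build_trie(words):
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node["$"] = {}  # end-of-word marker
    return root

def count_nodes(node):
    """Plain trie size: every node counted once per position."""
    return 1 + sum(count_nodes(child) for child in node.values())

def merge_subtrees(node, registry):
    """Bottom-up: replace structurally identical subtrees by one instance."""
    for ch, child in list(node.items()):
        node[ch] = merge_subtrees(child, registry)
    signature = tuple(sorted((ch, id(child)) for ch, child in node.items()))
    return registry.setdefault(signature, node)

def count_unique(node, seen):
    if id(node) in seen:
        return 0
    seen.add(id(node))
    return 1 + sum(count_unique(child, seen) for child in node.values())

def contains(node, word):
    for ch in word:
        if ch not in node:
            return False
        node = node[ch]
    return "$" in node

words = ["tapping", "topping", "tapped", "topped"]
trie = build_trie(words)
before = count_nodes(trie)         # 22 nodes in the plain trie
dawg = merge_subtrees(trie, {})
after = count_unique(dawg, set())  # 10 nodes once suffixes are shared
assert contains(dawg, "tapped") and not contains(dawg, "tapp")
```

The merge preserves the accepted word set while sharing the common suffixes (-ing, -ed, and the repeated -pp- interior), which is the effect the paper pushes further with Ziv-Lempel compression.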
Éric Laporte, Anne Monceaux, 1999. "Elimination of lexical ambiguities by grammars. The ELAG system", Lingvisticae Investigationes XXII, Amsterdam/Philadelphia: Benjamins, pp. 341-367.
Abstract. We present a new, INTEX-compatible formalism for the description of distributional constraints, ELAG (Elimination of lexical ambiguity by grammars). The constraints may be checked against text, and the lexical ambiguity of the text may thus be partly resolved. We describe and exemplify the main properties of ELAG with the aid of simple rules, formalizing exploitable constraints. We specify in detail the effect of applying an ELAG rule or grammar to a text. We examine the practical properties of the formalism from the point of view of a rule writer. We describe our evaluation procedure for the lexical disambiguation results.
Keywords: natural language, lexical ambiguity, finite-state automata.
2001. "Reduction of lexical ambiguity", Lingvisticae Investigationes XXIV:1, Amsterdam/Philadelphia: Benjamins, pp. 67-103.
Abstract. We examine various issues faced during the elaboration of lexical disambiguators, e.g. issues related to the linguistic analyses underlying disambiguators, and we exemplify these issues with grammatical constraints. We also examine computational problems: the influence of the granularity of tagsets, the definition of realistic and useful objectives, and the construction of the data required for the reduction of ambiguity; and we study how they are connected with linguistic problems. We show why a formalism is required for automatic ambiguity reduction, we analyse its function and we present a typology of such formalisms.
Keywords: natural language, lexical ambiguity.
2005. "Une
classe d'adjectifs de localisation", in
Cahiers de lexicologie 86, Les adjectifs non prédicatifs,
Paris: Garnier, pp. 145-161.
Abstract. We propose a homogeneous class of French location adjectives, ADJLOC, and a lexicon-grammar approach to their description. The adjectives are those which never constitute a predicate with a support verb, and optionally or obligatorily occur in free sentences like This is the south front of the house. ADJLOC adjectives admit various other syntactic constructions. Thus, some of them occur in a sentence with have related to a sentence with a locative preposition: the car has a rear bumper, the car has a bumper in its rear part. Two nominalization relations lead to nominal constructions: this is the central area of the screen, this is the centre of the screen, this is the area of the centre of the screen. The constructions discussed in this article are represented in a table of syntactic properties.
Keywords: lexicology, adjective, location.
2005. "Lexicon management and standard formats", Archives of Control Sciences 15:3, pp. 329-340; also in Proceedings of the Language and Technology Conference, Poznań (Poland): Adam Mickiewicz University, pp. 318-322.
Abstract. International standards for lexicon formats are in preparation. To a certain extent, the proposed formats converge with prior results of standardization projects. However, their adequacy for (i) lexicon management and (ii) lexicon-driven applications has been little debated in the past, nor is it being debated as part of the present standardization effort. We examine these issues. IGM has developed XML formats compatible with the emerging international standards, and we report experimental results on large-coverage lexicons.
Keywords: language resource, lexicon management,
standardization, inflection, morphology.
Marcelo C.M. Muniz, Maria das Graças V. Nunes, Eric Laporte, 2005. "UNITEX-PB, a set of flexible language resources for Brazilian Portuguese", in Proceedings of the Workshop on Information Technology and Human Language (TIL), São Leopoldo (Brazil): Unisinos, pp. 2059-2068.
Abstract. This work documents the design and development of several computational linguistic resources for Brazilian Portuguese, following the formal methodology used by the corpus processing system UNITEX. The delivered resources include computational lexicons, libraries to access compressed lexicons, and additional tools to validate those resources.
Keywords: language resource, lexicon management, inflection, morphology.
Hyun-gue HUH, Eric Laporte, 2005. "A resource-based Korean morphological annotation system", in Companion to the Proceedings of the International Joint Conference on Natural Language Processing, Jeju (Korea), pp. 37-42.
Abstract. We describe a resource-based method of morphological annotation of written Korean text. Korean is an agglutinative language. The output of our system is a graph of morphemes annotated with accurate linguistic information. The language resources used by the system can be easily updated, which allows users to control the evolution of the performances of the system. We show that morphological annotation of Korean text can be performed directly with a lexicon of words and without morphological rules.
Keywords:
language resource, Korean, annotation, morphology, agglutinative language.
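The lexicon-driven annotation described above, segmenting an agglutinative token by dictionary lookup alone, without morphological rules, can be illustrated with a toy sketch; the romanized entries and tags below are invented for illustration, not taken from the paper's resources:

```python
# Toy lexicon-driven segmentation of an agglutinative token: every way
# to cover the token with dictionary entries is one analysis, i.e. one
# path of the morpheme graph. Entries and tags are invented.
LEXICON = {
    "hak":    ["NOUN-root"],
    "kyo":    ["NOUN-root"],
    "hakkyo": ["NOUN"],    # 'school'
    "e":      ["POSTP"],   # locative postposition
    "eso":    ["POSTP"],   # locative postposition
}

def analyses(token):
    """Return every segmentation of `token` into (morpheme, tag) pairs."""
    if not token:
        return [[]]
    results = []
    for i in range(1, len(token) + 1):
        prefix = token[:i]
        for tag in LEXICON.get(prefix, []):
            for rest in analyses(token[i:]):
                results.append([(prefix, tag)] + rest)
    return results

res = analyses("hakkyoeso")
# Two competing analyses survive: hak+kyo+eso and hakkyo+eso; choosing
# between such paths is what the annotation graph leaves to later stages.
```

In a full system the output would be kept as a shared graph of morphemes rather than an enumerated list, but the principle, exhaustive lookup against an updatable lexicon of words, is the same.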
Ivan Berlocher, Hyun-gue HUH, Eric Laporte, Jee-sun NAM, 2006. "Morphological annotation of Korean with directly maintainable resources", in Proceedings of LREC, Genoa.
Keywords: language resource, evaluation, Korean, annotation, morphology, agglutinative language.
Olivier Blanc, Matthieu Constant, Éric Laporte, 2006. "Outilex, plate-forme logicielle de traitement de textes écrits" ["Outilex, a software platform for written text processing"], Verbum ex machina. Proceedings of TALN, Cahiers du Cental series 2(1), Presses universitaires de Louvain, pp. 83-92.
Abstract. The Outilex software platform, which will be made available to research, development and industry, comprises software components implementing all the fundamental operations of written text processing: processing without lexicons, exploitation of lexicons and grammars, language resource management. All data are structured in XML formats, and also in more compact formats, either readable or binary, whenever necessary; the required format converters are included in the platform; the grammar formats allow for combining statistical approaches with resource-based approaches. Manually constructed lexicons for French and English, originating from the LADL, and of substantial coverage, will be distributed with the platform under LGPL-LR license.
Keywords: lexical tagging, linguistic resource, lexicon, grammar, finite
automaton, XML.
Éric Laporte, Sébastien Paumier, 2006. "Graphes paramétrés et outils de lexicalisation" ["Parameterized graphs and lexicalization tools"], poster, Verbum ex machina. Proceedings of TALN, Cahiers du Cental series 2(1), Presses universitaires de Louvain, pp. 532-540.
Abstract. Shifting to a lexicalized grammar reduces the number of parsing errors and improves application results. However, such an operation affects a syntactic parser in all its aspects. One of our research objectives is to design a realistic model for grammar lexicalization. We carried out experiments for which we used a grammar with a very simple content and formalism, and a very informative syntactic lexicon, the lexicon-grammar of French elaborated by the LADL. Lexicalization was performed by applying the parameterized-graph approach. Our results tend to show that most information in the lexicon-grammar can be transferred into a grammar and exploited successfully for the syntactic parsing of sentences.
Keywords: lexicalisation, parser,
syntactic parsing, French, lexicon-grammar.
Maria Carmelita P. Dias, Éric Laporte, Christian Leclère, 2006. "Verbs with very strictly selected complements",
Collocations and Idioms: The First Nordic Conference on Syntactic Freezes, University of Joensuu,
Finland.
Abstract. We discuss the characteristics and behaviour of two parallel classes of verbs in two Romance languages, French and Portuguese. Examples of these verbs are Port. abater [gado] and Fr. abattre [bétail], both meaning "slaughter [cattle]". In both languages, the definition of the class of verbs includes several features:
- They have only one essential complement, which is a direct object.
- The nominal distribution of the complement is very limited, i.e., few nouns can be selected as head nouns of the complement. However, this selection is not restricted to a single noun, as would be the case for verbal idioms such as Fr. monter la garde "mount guard".
- We excluded from the class constructions which are reductions of more complex constructions, e.g. Port. afinar [instrumento] com "tune [instrument] with".
Keywords: multi-word expressions, syntax, French, Portuguese, lexicon-grammar.
Éric Laporte, 2007. "Evaluation of a Grammar of French Determiners", Annals of the 27th Congress of the Brazilian Society of Computation, Workshop on Information Technology and Human Language (TIL), Rio de Janeiro.
Abstract. Existing syntactic grammars of natural languages, even with a far from complete coverage, are complex objects. Assessments of the quality of parts of such grammars are useful for the validation of their construction. We evaluated the quality of a grammar of French determiners that takes the form of a recursive transition network. Applying this local grammar yields deeper syntactic information than chunking or than the information available in treebanks. We performed the evaluation by comparison with a corpus independently annotated with information on determiners. We obtained 86% precision and 92% recall on text not tagged for parts of speech.
Keywords: determiner, definite, indefinite, quantity, syntax, French, grammar, local grammar, evaluation, annotated corpus.
2008 (to appear). "Exemples attestés et exemples construits dans la pratique du lexique-grammaire" ["Attested and constructed examples in the practice of lexicon-grammar"], Mémoires de la Société de linguistique de Paris, Louvain/Paris/Dudley: Peeters.
Abstract. Croft (1993) contrasts an ‘experimental method’ with an ‘observational method’, thus renewing the discussion between introspective linguistics and corpus linguistics, by suggesting a parallel with experimental sciences, which these terms come from. The example of lexicon-grammar, a method of syntactic-semantic description constructed with explicit reference to experimental sciences, confirms that formulating rules in accordance with the real usage of a language is not only a matter of observing examples, but also that it nevertheless requires intensive observation of examples, as well as rigorous methodological precautions in this observation. Thus, the apparently opposed traditions of introspective linguistics and of corpus linguistics are complementary and should be combined for the success of such an enterprise. These thoughts are an invitation for linguists to overcome their historical resistance to combining both types of methods. Similarly, in natural language processing, most of the community sticks to the stochastic approach, which amounts to giving up co-operation between computer technology and descriptive linguistics.
Keywords: corpus linguistics, introspection.