• Publications
CamemBERT: a Tasty French Language Model
This paper investigates the feasibility of training monolingual Transformer-based language models for languages other than English, taking French as an example and evaluating the resulting language models on part-of-speech tagging, dependency parsing, named entity recognition and natural language inference tasks.
Universal Dependencies 2.1
The annotation scheme is based on (universal) Stanford dependencies, Google universal part-of-speech tags, and the Interset interlingua for morphosyntactic tagsets.
Controllable Sentence Simplification
This work adapts a discrete parametrization mechanism that provides explicit control over simplification systems based on Sequence-to-Sequence models, establishing the state of the art at 41.87 SARI on the WikiLarge test set, a +1.42 improvement over the best previously reported score.
Overview of the SPMRL 2013 Shared Task: A Cross-Framework Evaluation of Parsing Morphologically Rich Languages
This paper presents and analyzes parsing results obtained by the task participants, and provides an analysis and comparison of the parsers across languages and frameworks, reported for gold input as well as more realistic parsing scenarios.
International standard for a linguistic annotation framework
The outline of a linguistic annotation framework under development by ISO TC37 SC WG1-1, which will provide an architecture for the creation, annotation, and manipulation of linguistic resources and processing software, is described.
Multilingual Unsupervised Sentence Simplification
This work proposes using unsupervised mining techniques to automatically create training corpora for simplification in multiple languages from raw Common Crawl web data, and shows that by training on mined data rather than supervised corpora, this approach outperforms the previous best results.
The Lefff 2 syntactic lexicon for French: architecture, acquisition, use
In this paper, we introduce a new lexical resource for French which is freely available as the second version of the Lefff (Lexique des formes fléchies du français - Lexicon of French inflected
Comment obtenir plus des Méta-Grammaires
This article presents a development environment for meta-grammars (MG), used to rapidly design a wide-coverage tree adjoining grammar (TAG) for French and
From metagrammars to factorized TAG/TIG parsers
This document shows how the factorized syntactic descriptions provided by Meta-Grammars coupled with factorization operators may be used to derive compact large coverage tree adjoining grammars.
Automates a piles et programmation dynamique dyalog : une application a la programmation en logique
The primary motivation of this work is the implementation of an evaluator for logic programs that respects the declarative paradigm of logic programming (mainly the completeness of