We first describe the automatic conversion of the French Treebank (Abeillé and Barrier, 2004), a constituency treebank, into typed projective dependency trees. In order to evaluate the overall quality of the resulting dependency treebank, and to quantify the cases where the projectivity constraint leads to wrong dependencies, we compare a subset of the …
We present a semi-supervised method to improve statistical parsing performance. We focus on the well-known problem of lexical data sparseness and present experiments on word clustering prior to parsing. We use a combination of lexicon-aided morphological clustering that preserves tagging ambiguity, and unsupervised word clustering, trained on a large …
This paper reports results on grammatical induction for French. We investigate how to best train a parser on the French Treebank (Abeillé et al., 2003), viewing the task as a trade-off between generalizability and interpretability. We compare, for French, a supervised lexicalized parsing algorithm with a semi-supervised unlexicalized algorithm (Petrov et …
This paper is dedicated to the compact representation of Tree Adjoining Grammars. We provide a methodology for grammatical development with eXtensible MetaGrammar (XMG). The provided methodology has been set up together with the development of a large French TAG. Furthermore, the grammatical representation language and the assorted development methodology …
Testicular feminization syndrome was diagnosed in a mare with aggressive, stallion-like behavior and a history of infertility. She was found to have a high baseline testosterone concentration, suggesting that testicular tissue was present, and ovarian-like structures examined by use of transrectal ultrasonography had the appearance typical of testicular …
This paper presents preliminary investigations on the statistical parsing of French by providing a complete evaluation on French data of the main probabilistic lexicalized and unlexicalized parsers first designed on the Penn Treebank. We adapted the parsers to the two existing treebanks of French (Abeillé et al., 2003; Schluter and van Genabith, 2007). To …
In this paper, we introduce a general framework for describing the lexicon of a lexicalised grammar by means of elementary descriptive fragments. The system described hereafter consists of two main components: a control device aimed at controlling how fragments are to be combined in order to describe meaningful lexical descriptions and a composition …
Models of the acquisition of word segmentation are typically evaluated using phonemically transcribed corpora. Accordingly, they implicitly assume that children know how to undo phonetic variation when they learn to extract words from speech. Moreover, whereas models of language acquisition should perform similarly across languages, evaluation is often …
In this article, we introduce eXtensible MetaGrammar (XMG), a framework for specifying tree-based grammars such as Feature-Based Lexicalised Tree-Adjoining Grammars (FB-LTAG) and Interaction Grammars (IG). We argue that XMG displays three features which facilitate both grammar writing and a fast prototyping of tree-based grammars. Firstly, XMG is fully …