ISO-TimeML: An International Standard for Semantic Annotation
In this paper, we present ISO-TimeML, a revised and interoperable version of the temporal markup language, TimeML. We describe the changes and enrichments made, while framing the effort in a more ...
Standards going concrete : from LMF to Morphalou
The ongoing activity within ISO/TC 37/SC 4 on LMF (Lexical Markup Framework) is described, and it is shown how it can be concretely implemented for the design of an on-line morphological resource for French in the Morphalou project.
XCES: An XML-based Encoding Standard for Linguistic Corpora
This paper instantiated the CES as an XML application called XCES, based on the same data architecture comprising a primary encoded text and "standoff" annotation in separate documents, and demonstrated how XML mechanisms can be used to select from and manipulate annotated corpora encoded according to XCES specifications.
HUMB: Automatic Key Term Extraction from Scientific Articles in GROBID
SemEval task 5 was an opportunity for experimenting with the key term extraction module of GROBID, a system for extracting and generating bibliographical information from technical and scientific documents. Bagged decision trees appeared to be the most efficient machine learning algorithm for generating a list of ranked key term candidates.
International standard for a linguistic annotation framework
The outline of a linguistic annotation framework under development by ISO TC37 SC WG1-1, which will provide an architecture for the creation, annotation, and manipulation of linguistic resources and processing software, is described.
Representing Linguistic Corpora and Their Annotations
Some of the more technical aspects of the LAF design that have been addressed in the process of finalizing the specifications for the standard are described.
Veins Theory: A Model of Global Discourse Cohesion and Coherence
A generalization of Centering Theory (CT) (Grosz, Joshi, and Weinstein, 1995) called Veins Theory (VT) is proposed, which extends the applicability of centering rules from local to global discourse.
Experiments with Citation Mining and Key-Term Extraction for Prior Art Search
This technical note presents the system built for the IP track of CLEF 2010 based on PATATRAS, the modular search infrastructure initially realized for CLEF IP 2009, and considers that an instance-based KNN algorithm is not competitive with standard classifiers based on preliminary large-scale training.
A model oriented approach to the mapping of annotation formats using standards
SALT, a framework for mapping heterogeneous linguistic formats onto one another using a model-based approach, is presented, and its capacity to integrate a wide range of possible linguistic annotation models is shown.
Towards International Standards for Language Resources
The use of the LAF to represent the American National Corpus and its linguistic annotations is described; the LAF is to serve as a basis for harmonizing existing language resources, as well as developing new ones.