Figure captions: (1) Overview of the ARISE system for spoken language information retrieval. (3) Example query and result of semantic analysis after literal and contextual understanding. (4) Example dialog illustrating direct feedback of what has been understood by the system. Abstract: The LIMSI ARISE system provides vocal access by telephone to rail travel information for main…
Within the framework of the construction of a fact database, we defined guidelines to extract named entities, using a taxonomy based on an extension of the usual named entities definition. We thus defined new types of entities with broader coverage including substantive-based expressions. These extended named entities are hierarchical (with types and…
The evaluation of named entity recognition (NER) methods is an active field of research. This includes the recognition of named entities in speech transcripts. Evaluating NER systems on automatic speech recognition (ASR) output when the human reference annotation was prepared on clean manual transcripts raises difficult alignment issues. These issues are…
This paper presents and reports on the progress of the EVALDA/MEDIA project, focusing on the recording and annotation protocol of the reference dialogue corpus. The aim of this project is to design and test an evaluation methodology to compare and diagnose the context-dependent and context-independent understanding capabilities of spoken language dialogue systems.…
This paper presents a new paradigm of "challenge" evaluation of Spoken Language Understanding. This methodology aims at a quantitative assessment with a high diagnostic power, in contrast to standard ATIS-like frameworks. This paper details the methodology as well as the results of an evaluation campaign held by the French CNRS research agency. The…
We present in this paper the three LIMSI question-answering systems on speech transcripts which participated in the QAst 2009 evaluation. These systems are based on a complete and multi-level analysis of both queries and documents. These systems use an automatically generated research descriptor. A score based on those descriptors is used to select…
We focus in this paper on the named entity recognition task in spoken data. The proposed approach investigates the use of various contexts of the words to improve recognition. Experiments carried out on speech data from French broadcast news, using conditional random fields (CRF), show that the use of semantic information, generated using symbolic…
In this paper, we present a Conditional Random Field-based approach for automatic detection of edit disfluencies in a conversational telephone corpus in French. We define disfluency patterns using both linguistic and acoustic features to perform disfluency detection. Two related tasks are considered: the first task aims at detecting the disfluent speech…
Within the framework of the Quaero project, we proposed a new definition of named entities, based upon an extension of the coverage of named entities as well as the structure of those named entities. In this new definition, the extended named entities we proposed are both hierarchical and compositional. In this paper, we focused on the annotation of a…