Kepa Joseba Rodríguez

Learn More
The LUNA corpus is a multi-lingual, multidomain spoken dialogue corpus currently under development that will be used to develop a robust natural spoken language understanding toolkit for multilingual dialogue services. The LUNA corpus will be annotated at multiple levels to include annotations of syntactic, semantic, and discourse information; specialized(More)
BART (Versley et al., 2008) is a highly modular toolkit for coreference resolution that supports state-of-the-art statistical approaches and enables efficient feature engineering. For the SemEval task 1 on Coreference Resolution, BART runs have been submitted for German, English, and Italian. BART relies on a maximum entropy-based classifier for pairs of(More)
This paper presents a second release of the ARRAU dataset: a multi-domain corpus with thorough linguistically motivated annotation of anaphora and related phenomena. Building upon the first release almost a decade ago, a considerable effort had been invested in improving the data both quantitatively and qualitatively. Thus, we have doubled the corpus size,(More)
The Live Memories corpus is an Italian corpus annotated for anaphoric relations. The corpus includes manual annotated information about morphosyntactic agreement, anaphoricity, and semantic class of the NPs. For the annotation of the anaphoric links the corpus takes into account specific phenomena of the Italian language like incorporated clitics and(More)
We demonstrate the production of spoken output with contextually appropriate intonation in the information-state based dialogue system GoDiS. We exploit the context representation in the information state to determine the information structure of system utterances, which we use to control the intonation of synthesized spoken output.
In human–human dialogue, the allocation of turns between the participants is normally managed smoothly, without the participants paying much attention to it. In contrast, for spoken dialogue systems turn allocation is a difficult task, and often technical restrictions are introduced to simplify it. In this paper we investigate, by comparing two(More)
In order to keep remembering why it was judged so important to build a new Europe 'out of the crematoria of Auschwitz', the 'vital link' between Europe's past and Europe's present should, according to the British historian Tony Judt, be taught over and over again (cf. Judt 2005: 830 f.). Every generation of European historians should re-interpret the(More)
In this paper we present an active approach to annotate with lexical and semantic labels an Italian corpus of conversational human-human and Wizard-of-Oz dialogues. This procedure consists in the use of a machine learner to assist human annotators in the labeling task. The computer assisted process engages human annotators to check and correct the automatic(More)
Our goal is to improve the contextual appropriateness of spoken output in a dialogue system. We explore the use of the information state to determine the information structure of system utterances. We concentrate on the realization of information structure by intonation. We present the results of evaluating the contextual appropriateness of varied system(More)