Data Set Used
This research compares several of the thematic roles of VerbNet (VN) to those of the Linguistic Infrastructure for Interoperable Resources and Systems (LIRICS). The purpose of this comparison is to develop a standard set of thematic roles that would be suited to a variety of natural language processing (NLP) applications. We draw from both resources to…
In this paper we report on the analyses of alternative approaches to semantic role annotation (FrameNet (FrameNet, 2005), PropBank (Palmer et al., 2005) and VerbNet (Kipper, 2006)) with respect to their models of description; granularity of semantic role sets; definitions of semantic role concepts; and consistency and reliability of annotations, and we…
This paper describes an ISO project developing an international standard for annotating dialogue with semantic information, in particular concerning the communicative functions of the utterances, the kind of content they address, and the dependency relations to what was said and done earlier in the dialogue. The project, registered as ISO 24617-2 Semantic…
This paper presents a machine learning-based approach to the incremental understanding of dialogue utterances, with a focus on the recognition of their communicative functions. A token-based approach combining the use of local classifiers, which exploit local utterance features, and global classifiers, which use the outputs of local classifiers applied to…
This paper presents empirical evidence for the orthogonality of the DIT++ multidimensional dialogue act annotation scheme, showing that the ten dimensions of communication which underlie this scheme are addressed independently in natural dialogue.
We describe the preparation of parallel corpora based on professional quality subtitles in seven European language pairs. The main focus is the effect of the processing steps on the size and quality of the final corpora.
This paper describes the data collection and parallel corpus compilation activities carried out in the FP7 EU-funded SUMAT project. This project aims to develop an online subtitle translation service for nine European languages combined into 14 different language pairs. This data provides bilingual and monolingual training data for statistical machine…
This paper summarizes the latest, final version of ISO standard 24617-2 "Semantic annotation framework, Part 2: Dialogue acts". Compared to the preliminary version ISO DIS 24617-2:2010, described in Bunt et al. (2010), the final version additionally includes concepts for annotating rhetorical relations between dialogue units, defines a full-blown…
ROCKIT is a strategic roadmapping action in the area of multimodal conversational interaction technologies funded as a support action by the EU during 2014 and 2015. We envisage a future in which human-human, human-machine, and human-environment communication are not hampered by differences in language capability, accessibility, or knowledge of the…
The literature contains a wealth of theoretical and empirical analyses of discourse marker functions in human communication. Some of these studies address the phenomenon that discourse markers are often multifunctional in a given context, but do not study this in systematic and formal ways. In this paper we show that the use of multiple dimensions in…