Andrei Popescu-Belis

This paper describes an ISO project developing an international standard for annotating dialogue with semantic information, in particular concerning the communicative functions of the utterances, the kind of content they address, and the dependency relations to what was said and done earlier in the dialogue. The project, registered as ISO 24617-2 Semantic ...
This article defines a Framework for Machine Translation Evaluation (FEMTI) which relates the quality model used to evaluate a machine translation system to the purpose and context of the system. Our proposal attempts to put together, into a coherent picture, previous attempts to structure a domain characterised by overall complexity and local difficulties. In ...
We propose a method for computing semantic relatedness between words or texts by using knowledge from hypertext encyclopedias such as Wikipedia. A network of concepts is built by filtering the encyclopedia’s articles, each concept corresponding to an article. Two types of weighted links between concepts are considered: one based on hyperlinks between the ...
User-generated texts such as reviews, comments or discussions are valuable indicators of users' preferences. Unlike previous work, which focuses on labeled data from user-contributed reviews, we focus here on user comments that are not accompanied by explicit rating labels. We investigate their utility for a one-class collaborative filtering task such as ...
Many discourse connectives can signal several types of relations between sentences. Their automatic disambiguation, i.e. the labeling of the correct sense of each occurrence, is important for discourse parsing, but could also be helpful to machine translation. We describe new approaches for improving the accuracy of manual annotation of three discourse ...
The AMIDA Automatic Content Linking Device (ACLD) is a just-in-time document retrieval system for meeting environments. The ACLD listens to a meeting and displays information about the documents from the group’s history that are most relevant to what is being said. Participants can view an outline or the entire content of the documents, if they feel that ...
This paper introduces a new dataset and compares several methods for the recommendation of non-fiction audiovisual material, namely lectures from the TED website. The TED dataset contains 1,149 talks and 69,023 profiles of users, who have made more than 100,000 ratings and 200,000 comments. This dataset, which we make public, can be used for training and ...
In this paper we discuss the use of multilayered tagsets for dialogue acts, in the context of dialogue understanding for multiparty meeting recording and retrieval applications. We discuss some desiderata for such tagsets and critically examine some previous proposals. We then define MALTUS, a new tagset based on the ICSI-MR and Switchboard tagsets, which ...
This paper summarizes the latest, final version of ISO standard 24617-2 “Semantic annotation framework, Part 2: Dialogue acts”. Compared to the preliminary version ISO DIS 24617-2:2010, described in Bunt et al. (2010), the final version additionally includes concepts for annotating rhetorical relations between dialogue units, defines a full-blown ...
This article describes an experiment in user query elicitation for the design of a multimodal meeting processing and retrieval system (MPR). In the experiment, participants are asked to choose between several scenarios of use of an MPR system, then formulate (on paper) queries to the system within the context of their chosen scenario. The analysis of the ...