User-generated texts such as reviews, comments or discussions are valuable indicators of users' preferences. Unlike previous works which focus on labeled data from user-contributed reviews, we focus here on user comments which are not accompanied by explicit rating labels. We investigate their utility for a one-class collaborative filtering task such as …
This article defines a Framework for Machine Translation Evaluation (FEMTI) which relates the quality model used to evaluate a machine translation system to the purpose and context of the system. Our proposal attempts to put together, into a coherent picture, previous attempts to structure a domain characterised by overall complexity and local difficulties.
We propose a method for computing semantic relatedness between words or texts by using knowledge from hypertext encyclopedias such as Wikipedia. A network of concepts is built by filtering the encyclopedia's articles, each concept corresponding to an article. Two types of weighted links between concepts are considered: one based on hyperlinks between the …
The AMIDA Automatic Content Linking Device (ACLD) is a just-in-time document retrieval system for meeting environments. The ACLD listens to a meeting and displays information about the documents from the group's history that are most relevant to what is being said. Participants can view an outline or the entire content of the documents, if they feel that …
This article describes an experiment in user query elicitation for the design of a multimodal meeting processing and retrieval system (MPR). In the experiment, participants are asked to choose between several scenarios of use of an MPR system, then formulate (on paper) queries to the system within the context of their chosen scenario. The analysis of the …
We describe the design, the evaluation setup, and the results of the 2016 WMT shared task on cross-lingual pronoun prediction. This is a classification task in which participants are asked to provide predictions on what pronoun class label should replace a placeholder value in the target-language text, provided in lemmatised and PoS-tagged form. We …
This paper shows how the disambiguation of discourse connectives can improve their automatic translation, while preserving the overall performance of statistical MT as measured by BLEU. State-of-the-art automatic classifiers for rhetorical relations are used prior to MT to label discourse connectives that signal those relations. These labels are used for …
This paper introduces a new dataset and compares several methods for the recommendation of non-fiction audiovisual material, namely lectures from the TED website. The TED dataset contains 1,149 talks and 69,023 profiles of users, who have made more than 100,000 ratings and 200,000 comments. This dataset, which we make public, can be used for training and …
This paper describes an ISO project developing an international standard for annotating dialogue with semantic information, in particular concerning the communicative functions of the utterances, the kind of content they address, and the dependency relations to what was said and done earlier in the dialogue. The project, registered as ISO 24617-2 Semantic …
In this paper we discuss the use of multi-layered tagsets for dialogue acts, in the context of dialogue understanding for multi-party meeting recording and retrieval applications. We discuss some desiderata for such tagsets and critically examine some previous proposals. We then define MALTUS, a new tagset based on the ICSI-MR and Switchboard tagsets, …