Data Set Used
This article presents an overview of the shared task that took place as part of the TweetMT workshop held at SEPLN 2015. The task consisted of translating collections of tweets to and from several languages. The article outlines the data collection and annotation process, the development and evaluation of the shared task, as well as the results achieved by…
This article presents a summary of the TweetLID shared task and workshop held at SEPLN 2014. It briefly summarizes the data collection and annotation process, the development and evaluation of the shared task, as well as the results achieved by the participants.
An overview of the shared task is presented: description, corpora, annotation, preprocessing, participant systems and results. * Thanks to all members of the Organizing Committee and to the Tacardi, Xlike, Celtic, TextMESS2 and Skater projects for their collaboration.
This paper argues in favor of a linguistically informed error classification for SMT to identify system weaknesses and map them to possible syntactic, semantic and structural fixes. We propose a scheme which includes both linguistically oriented error categories and SMT-oriented edit errors, and evaluate an English-Spanish system and an English-Basque…
In this paper we introduce TweetNorm_es, an annotated corpus of tweets in Spanish, which we make publicly available under the terms of the CC-BY license. This corpus is intended for the development and testing of microtext normalization systems. It was created for Tweet-Norm, a tweet normalization workshop and shared task, and is the result of a joint…