TweetNorm_es Corpus: an Annotated Corpus for Spanish Microtext Normalization

Abstract

In this paper we introduce TweetNorm_es, an annotated corpus of Spanish-language tweets, which we make publicly available under the terms of the CC-BY license. The corpus is intended for the development and testing of microtext normalization systems. It was created for Tweet-Norm, a tweet normalization workshop and shared task, and is the result of a joint…
