Optimizing Statistical Machine Translation for Text Simplification
This work is the first to design automatic metrics that are effective for tuning and evaluating simplification systems, which will facilitate iterative development for this task.
SemEval-2015 Task 1: Paraphrase and Semantic Similarity in Twitter (PIT)
This shared task presents evaluations of systems on two related tasks, Paraphrase Identification and Semantic Textual Similarity (SS), using Twitter data, and suggests the importance of bringing these two research areas together.
Extracting Lexically Divergent Paraphrases from Twitter
A new model suited to identifying paraphrases within short messages on Twitter and a novel annotation methodology for crowdsourcing a paraphrase corpus from Twitter are presented.
Discovering User Attribute Stylistic Differences via Paraphrasing
This study aims to find linguistic style distinctions across three user attributes (gender, age, and occupational class) and shows their predictive power in user profiling, their conformity with human perception and psycholinguistic hypotheses, and their potential use in generating natural language tailored to specific user traits.
Multi-task Pairwise Neural Ranking for Hashtag Segmentation
A set of approaches for hashtag segmentation is proposed by framing the task as a pairwise ranking problem between candidate segmentations. A deeper understanding of hashtag semantics obtained through segmentation is shown to be useful for downstream applications such as sentiment analysis, where it yielded a 2.6% increase on the SemEval 2017 sentiment analysis dataset.
Who, What, When, Where, Why? Comparing Multiple Approaches to the Cross-Lingual 5W Task
An error analysis of a new cross-lingual task, the 5W task, is presented. This sentence-level understanding task seeks to return the English 5W's corresponding to a Chinese sentence, and the analysis shows that MT significantly degrades sentence-level understanding.
