• Publications
Lexicon-Based Methods for Sentiment Analysis
TLDR
The Semantic Orientation CALculator (SO-CAL) uses dictionaries of words annotated with their semantic orientation (polarity and strength) to extract sentiment from text.
  • 2,235
  • 163
Cross-Linguistic Sentiment Analysis: From English to Spanish
TLDR
We explore the adaptation of English resources and techniques for text sentiment analysis to a new language, Spanish, including machine translation and Support Vector Machine classification.
  • 137
  • 14
Extracting sentiment as a function of discourse structure and topicality
We present an approach to extracting sentiment from texts that makes use of contextual information. Using two different approaches, we extract the most relevant sentences of a text, and calculate…
  • 78
  • 13
A Syntactic and Lexical-Based Discourse Segmenter
TLDR
We present a syntactic and lexically based discourse segmenter (SLSeg) that is designed to avoid the common problem of over-segmenting text.
  • 77
  • 13
Robust, Lexicalized Native Language Identification
TLDR
We demonstrate the efficacy of lexical features, which had previously been avoided due to within-corpus topic confounds, and provide a detailed evaluation of various options, including a bias adaptation technique and a number of classifier algorithms.
  • 54
  • 7
Automatic Acquisition of Lexical Formality
TLDR
This paper applies information from large mixed-genre corpora, demonstrating that significant improvement is possible over simple word-length metrics, particularly when multiple sources of information, i.e. word length, word counts, and word association, are integrated.
  • 47
  • 6
Deep-speare: A joint neural model of poetic language, meter and rhyme
TLDR
In this paper, we propose a joint architecture that captures language, rhyme and meter for sonnet modelling.
  • 29
  • 6
Native language detection with 'cheap' learner corpora
TLDR
We begin by showing that the best publicly available, multiple-L1 learner corpus, the International Corpus of Learner English (Granger et al. 2009), has issues when used directly for the task of native language detection (NLD).
  • 59
  • 5
Genre-Based Paragraph Classification for Sentiment Analysis
TLDR
We present a taxonomy and classification system for distinguishing between different types of paragraphs in movie reviews: formal vs. functional paragraphs and, within the latter, between description and comment.
  • 67
  • 4
Measuring Interlanguage: Native Language Identification with L1-influence Metrics
TLDR
We introduce a method for L1 identification in second language (L2) texts that relies only on much more plentiful L1 data, rather than the L2 texts that are traditionally used for training.
  • 42
  • 4