Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments
A tagset is developed, data are annotated, features are engineered, and accuracy approaching 90% is reported on the problem of part-of-speech tagging for English data from the micro-blogging service Twitter.
From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series
We connect measures of public opinion from polls with sentiment measured from text. We analyze several surveys on consumer confidence and political opinion over the 2008 to 2009 period, and …
Annotation Artifacts in Natural Language Inference Data
It is shown that a simple text categorization model can correctly classify the hypothesis alone in about 67% of SNLI and 53% of MultiNLI, and that specific linguistic phenomena such as negation and vagueness are highly correlated with certain inference classes.
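The hypothesis-only baseline the summary describes can be sketched as a simple text categorization model that never sees the premise. Below is a minimal illustration using a unigram Naive Bayes classifier on synthetic hypothesis/label pairs (the data is invented for illustration, not drawn from SNLI or MultiNLI):

```python
from collections import Counter, defaultdict
import math

# Toy hypothesis-only baseline: the classifier sees only the hypothesis,
# never the premise. Training data here is synthetic, for illustration;
# negation words ("nobody") act as the kind of artifact the paper reports.
train = [
    ("a man is sleeping", "contradiction"),
    ("nobody is outside", "contradiction"),
    ("a person is outdoors", "entailment"),
    ("someone is moving", "entailment"),
    ("the man is tall", "neutral"),
    ("the woman is rich", "neutral"),
]

def fit(examples, alpha=1.0):
    """Collect class and per-class word counts for add-alpha Naive Bayes."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for w in text.split():
            word_counts[label][w] += 1
            vocab.add(w)
    return class_counts, word_counts, vocab, alpha

def predict(model, text):
    """Return the label with the highest smoothed log-probability."""
    class_counts, word_counts, vocab, alpha = model
    total = sum(class_counts.values())
    best, best_lp = None, -math.inf
    for label in class_counts:
        lp = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        for w in text.split():
            lp += math.log((word_counts[label][w] + alpha) / denom)
        if lp > best_lp:
            best, best_lp = label, lp
    return best

model = fit(train)
print(predict(model, "nobody is sleeping"))  # negation cue -> "contradiction"
```

That a premise-free model gets above-chance accuracy is exactly the annotation-artifact effect the paper measures on the real datasets.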
Linguistic Knowledge and Transferability of Contextual Representations
It is found that linear models trained on top of frozen contextual representations are competitive with state-of-the-art task-specific models in many cases, but fail on tasks requiring fine-grained linguistic knowledge.
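The probing setup described here can be sketched in a few lines: the representations stay frozen, and only a linear layer on top is trained. The sketch below uses synthetic two-dimensional vectors standing in for contextual representations and a perceptron as the linear model (both are illustrative assumptions, not the paper's actual setup):

```python
# Linear-probe sketch: "contextual representations" are frozen, precomputed
# vectors (synthetic 2-D toy data here); only the linear classifier trains.
data = [
    ([1.0, 0.2], 1), ([0.9, 0.1], 1), ([0.8, 0.3], 1),
    ([0.1, 0.9], 0), ([0.2, 1.0], 0), ([0.3, 0.8], 0),
]

w, b = [0.0, 0.0], 0.0
for _ in range(20):                      # perceptron updates; vectors never change
    for x, y in data:
        pred = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
        if pred != y:
            sign = 1 if y == 1 else -1
            w = [w[0] + sign * x[0], w[1] + sign * x[1]]
            b += sign

acc = sum((1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0) == y
          for x, y in data) / len(data)
print(acc)  # linearly separable toy data -> 1.0
```

The point of the frozen setup is that any accuracy the probe achieves must come from information already present in the representations, not from task-specific feature learning.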
Improved Part-of-Speech Tagging for Online Conversational Text with Word Clusters
This work systematically evaluates the use of large-scale unsupervised word clustering and new lexical features to improve tagging accuracy on Twitter and achieves state-of-the-art tagging results on both Twitter and IRC POS tagging tasks.
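Hierarchical word clusters of the kind used here are commonly exposed to a tagger as bit-string prefixes: each word maps to a binary cluster path, and prefixes of that path give coarse-to-fine features. A minimal sketch (the cluster IDs below are made up for illustration, not taken from the paper's clustering):

```python
# Sketch of hierarchical word-cluster features for tagging. Each word maps
# to a bit-string cluster ID (Brown-style); prefixes of the ID serve as
# coarse-to-fine features. These cluster paths are invented examples.
clusters = {"lol": "111010", "haha": "111011", "the": "0010", "a": "0011"}

def cluster_features(word, prefix_lengths=(2, 4, 6)):
    """Return prefix features of the word's cluster path, or [] if unseen."""
    path = clusters.get(word)
    if path is None:
        return []
    return [f"cpath{p}={path[:p]}" for p in prefix_lengths]

print(cluster_features("lol"))  # ['cpath2=11', 'cpath4=1110', 'cpath6=111010']
```

Because similar words (here "lol" and "haha") share long cluster prefixes, the tagger can generalize across the noisy, unseen spellings common in conversational text.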
A Simple, Fast, and Effective Reparameterization of IBM Model 2
We present a simple log-linear reparameterization of IBM Model 2 that overcomes problems arising from Model 1’s strong assumptions and Model 2’s overparameterization. Efficient inference, likelihood …
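The key idea of the reparameterization is to replace Model 2's full distortion table with a log-linear distribution that favors alignments near the diagonal. A minimal sketch of such a prior (the parameter value lam is illustrative, and this is a simplified form rather than the paper's exact parameterization):

```python
import math

def diagonal_alignment_prior(j, m, n, lam=4.0):
    """Log-linear alignment prior in the spirit of the reparameterized
    Model 2: the probability that source position i aligns to target
    position j decays with distance from the diagonal i/n ~ j/m.
    lam controls how sharply mass concentrates on the diagonal."""
    scores = [math.exp(-lam * abs(i / n - j / m)) for i in range(1, n + 1)]
    z = sum(scores)                      # normalize over source positions
    return [s / z for s in scores]

# For target position j=2 of m=4 with n=4 source words, the distribution
# sums to 1 and peaks at the diagonal source position i=2.
probs = diagonal_alignment_prior(j=2, m=4, n=4)
```

With only a single precision parameter per direction instead of a full distortion table, the model sidesteps Model 2's overparameterization while still preferring monotone-ish alignments.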
Transition-Based Dependency Parsing with Stack Long Short-Term Memory
This work was sponsored in part by the U.S. Army Research Laboratory, the NSF CAREER grant IIS-1054319, and the European Commission.
Recurrent Neural Network Grammars
We introduce recurrent neural network grammars, probabilistic models of sentences with explicit phrase structure. We explain efficient inference procedures that allow application to both parsing and …
A Latent Variable Model for Geographic Lexical Variation
A multi-level generative model that reasons jointly about latent topics and geographical regions is presented, which recovers coherent topics and their regional variants, while identifying geographic areas of linguistic consistency.
A Discriminative Graph-Based Parser for the Abstract Meaning Representation
The first approach to parsing sentences into this meaning representation, a semantic formalism for which a growing set of annotated examples is available, is introduced, providing a strong baseline for improvement.