Semantic word spaces have been very useful but cannot express the meaning of longer phrases in a principled way. Further progress towards understanding compositionality in tasks such as sentiment detection requires richer supervised training and evaluation resources and more powerful models of composition. To remedy this, we introduce a Sentiment Treebank.
Understanding entailment and contradiction is fundamental to understanding natural language, and inference about entailment and contradiction is a valuable testing ground for the development of semantic representations. However, machine learning research in this area has been dramatically limited by the lack of large-scale resources. To address this, we …
Unsupervised vector-based approaches to semantics can model rich lexical meanings, but they largely fail to capture sentiment information that is central to many word meanings and important for a wide range of NLP tasks. We present a model that uses a mix of unsupervised and supervised techniques to learn word vectors capturing semantic term–document …
Expressives like damn and bastard have, when uttered, an immediate and powerful impact on the context. They are performative, often destructively so. They are revealing of the perspective from which the utterance is made, and they can have a dramatic impact on how current and future utterances are perceived. This, despite the fact that speakers are …
Vibrant online communities are in constant flux. As members join and depart, the interactional norms evolve, stimulating further changes to the membership and its social dynamics. Linguistic change --- in the sense of innovation that becomes accepted as the norm --- is essential to this dynamic process: it both facilitates individual expression and fosters …
Tree-structured neural networks exploit valuable syntactic parse information as they interpret the meanings of sentences. However, they suffer from two key technical problems that make them slow and unwieldy for large-scale NLP tasks: they usually operate on parsed sentences and they do not directly support batched computation. We address these issues by …
A discourse typically involves numerous entities, but few are mentioned more than once. Distinguishing discourse entities that die out after just one mention (singletons) from those that lead longer lives (coreferent) would benefit NLP applications such as coreference resolution, protagonist identification, topic modeling, and discourse coherence. We …
Person-to-person evaluations are prevalent in all kinds of discourse and important for establishing reputations, building social bonds, and shaping public opinion. Such evaluations can be analyzed separately using signed social networks and textual sentiment analysis, but this misses the rich interactions between language and social context. To capture such …
Harmonic Grammar (HG) is a model of linguistic constraint interaction in which well-formedness is calculated as the sum of weighted constraint violations. We show how linear programming algorithms can be used to determine whether there is a weighting for a set of constraints that fits a set of linguistic data. The associated software package OT-Help …
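The feasibility check described in this abstract can be sketched as a small linear program: for every attested winner and each losing competitor, the winner's weighted violation sum must be strictly lower than the loser's. The sketch below uses `scipy.optimize.linprog` with invented constraint-violation data; the constraint profiles, the margin, and the objective are illustrative assumptions, not OT-Help's actual formulation.

```python
# Minimal sketch of an HG weighting-feasibility check via linear programming.
# Violation profiles and winner/loser pairs below are hypothetical examples.
from scipy.optimize import linprog

# Each candidate is a vector of violation counts for two constraints [C1, C2].
# For every (winner, loser) pair we require
#   sum_k w_k * viol(winner)_k < sum_k w_k * viol(loser)_k,
# encoded as: sum_k w_k * (viol(winner)_k - viol(loser)_k) <= -margin.
pairs = [
    ([1, 0], [0, 2]),  # (winner violations, loser violations)
    ([0, 1], [2, 0]),
]
margin = 1.0

A_ub = [[w - l for w, l in zip(win, lose)] for win, lose in pairs]
b_ub = [-margin] * len(pairs)

# Any linear objective works for a pure feasibility question; minimizing the
# total weight keeps the solution small. Weights must be non-negative.
res = linprog(c=[1.0, 1.0], A_ub=A_ub, b_ub=b_ub,
              bounds=[(0, None), (0, None)], method="highs")

if res.success:
    print("weights found:", res.x)
else:
    print("no weighting fits the data")
```

If the solver reports infeasibility, no non-negative weighting makes every winner beat its competitors, which is exactly the diagnostic the abstract attributes to the linear-programming approach.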