Ontology-Aware Token Embeddings for Prepositional Phrase Attachment

@article{Dasigi2017OntologyAwareTE,
  title={Ontology-Aware Token Embeddings for Prepositional Phrase Attachment},
  author={Pradeep Dasigi and Waleed Ammar and Chris Dyer and Eduard H. Hovy},
  journal={ArXiv},
  year={2017},
  volume={abs/1705.02925}
}
Type-level word embeddings use the same set of parameters to represent all instances of a word regardless of its context, ignoring the inherent lexical ambiguity in language. Instead, we embed semantic concepts (or synsets) as defined in WordNet and represent a word token in a particular context by estimating a distribution over relevant semantic concepts. We use the new, context-sensitive embeddings in a model for predicting prepositional phrase (PP) attachments and jointly learn the concept… 
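
The core idea lends itself to a compact illustration. Below is a minimal numpy sketch (not the paper's actual implementation) of a context-sensitive token embedding computed as an attention-weighted mixture of synset embeddings: the toy SYNSETS table stands in for WordNet lookups, and the random concept vectors stand in for embeddings that the paper learns jointly with a downstream PP attachment model.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 50

# Toy stand-in for WordNet: each word maps to candidate synsets
# (the paper retrieves these, plus hypernyms, from WordNet).
SYNSETS = {
    "bank": ["bank.n.01",   # financial institution
             "bank.n.09"],  # sloping land beside a river
    "river": ["river.n.01"],
    "money": ["money.n.01"],
}

# Concept (synset) embeddings; in the paper these are learned jointly
# with the downstream PP attachment model.
concept_emb = {s: rng.normal(size=DIM)
               for senses in SYNSETS.values() for s in senses}

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def token_embedding(word, context_vec):
    """Context-sensitive token embedding: an attention-weighted
    mixture of the word's synset embeddings."""
    senses = SYNSETS[word]
    S = np.stack([concept_emb[s] for s in senses])   # (n_senses, DIM)
    attn = softmax(S @ context_vec)                   # relevance to context
    return attn @ S, dict(zip(senses, attn))

# The same word type gets different token vectors in different contexts
# because the distribution over senses shifts.
ctx_finance = concept_emb["money.n.01"]
ctx_geo = concept_emb["river.n.01"]
vec1, dist1 = token_embedding("bank", ctx_finance)
vec2, dist2 = token_embedding("bank", ctx_geo)
print(dist1)  # with random vectors the split is arbitrary;
print(dist2)  # after joint training it tracks the context
```

As the abstract notes, the concept embeddings are learned jointly with the PP attachment model rather than fixed in advance; the sketch above only shows the attention step.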

Implicit Discourse Relation Classification with Syntax-Aware Contextualized Word Representations

TLDR
A method is proposed to compute contextualized representations of words, leveraging information from the sentence dependency parse to improve argument representation; the proposed representations achieve state-of-the-art results when input to standard neural network architectures.

Towards syntax-aware token embeddings

TLDR
This paper investigates Syntax-Aware word Token Embeddings (SATokE) as a way to explicitly encode information derived from the linguistic analysis of a sentence into the vectors that are input to a deep learning model, and proposes an efficient unsupervised learning algorithm based on tensor factorisation for computing these token embeddings given an arbitrary graph of linguistic structure.

From lexical towards contextualized meaning representation

TLDR
This thesis proposes syntax-aware token embeddings (SATokE) that capture specific linguistic information, encoding the structure of the sentence from a dependency point of view in their representations, and empirically demonstrates the superiority of these token representations over popular distributional representations of words and over other token embeddings proposed in the literature.

Embedding Syntax and Semantics of Prepositions via Tensor Decomposition

TLDR
This paper uses word-triple counts to capture a preposition’s interaction with its attachment and complement, and derives preposition embeddings via tensor decomposition on a large unlabeled corpus.
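
As a rough illustration of the triple-count idea (not the paper's actual decomposition), the sketch below builds a toy (head, preposition, complement) count tensor and takes a truncated SVD of its preposition-mode unfolding to obtain low-dimensional preposition vectors; the triples and dimensions are made up for the example.

```python
import numpy as np

# Toy word-triple counts: (head, preposition, complement) -> count.
triples = {
    ("ate", "with", "fork"): 30,
    ("ate", "with", "friends"): 12,
    ("pizza", "with", "anchovies"): 25,
    ("went", "to", "school"): 40,
    ("gave", "to", "charity"): 18,
    ("book", "about", "history"): 22,
}

heads = sorted({h for h, _, _ in triples})
preps = sorted({p for _, p, _ in triples})
comps = sorted({c for _, _, c in triples})
H, P, C = len(heads), len(preps), len(comps)

# Build the 3-way count tensor.
T = np.zeros((H, P, C))
for (h, p, c), n in triples.items():
    T[heads.index(h), preps.index(p), comps.index(c)] = n

# Unfold along the preposition mode: one row per preposition,
# columns indexed by (head, complement) pairs.
T_prep = T.transpose(1, 0, 2).reshape(P, H * C)

# Truncated SVD of the log-scaled unfolding; rows of U * s serve as
# preposition embeddings (a simplified stand-in for a full
# tensor decomposition).
U, s, Vt = np.linalg.svd(np.log1p(T_prep), full_matrices=False)
rank = 2
prep_emb = U[:, :rank] * s[:rank]
for p, v in zip(preps, prep_emb):
    print(p, np.round(v, 3))
```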

Using Multi-Sense Vector Embeddings for Reverse Dictionaries

TLDR
This work studies the effect of multi-sense embeddings on the task of reverse dictionaries and proposes a technique to easily integrate them into an existing neural network architecture using an attention mechanism.
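
A hedged sketch of the general mechanism described here: attention over a word's sense vectors, conditioned on a query (for instance a definition representation), yields a single combined vector. The vectors and names below are illustrative, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 8

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend_over_senses(sense_vecs, query_vec):
    """Combine a word's multiple sense vectors into one vector,
    weighting each sense by its dot-product relevance to the query."""
    S = np.stack(sense_vecs)          # (n_senses, DIM)
    weights = softmax(S @ query_vec)  # attention over senses
    return weights @ S, weights

# Illustrative multi-sense embeddings for "bat" and a definition query.
bat_senses = [rng.normal(size=DIM), rng.normal(size=DIM)]  # animal vs. sports
definition = bat_senses[0] + 0.1 * rng.normal(size=DIM)    # "a flying mammal"
combined, w = attend_over_senses(bat_senses, definition)
print("attention over senses:", np.round(w, 3))  # leans toward the first sense
```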

PoKED: A Semi-Supervised System for Word Sense Disambiguation

TLDR
The proposed PoKED system incorporates position-wise encoding into an orthogonal framework and applies a knowledge-based attentive neural model to solve the WSD problem.

Russian Prepositional Phrase Semantic Labelling with Word Embedding-based Classifier

TLDR
The research shows that although semantic differences between some prepositional semantic classes are quite vague, it is possible to achieve promising classification results for core classes.

Visual Disambiguation of Prepositional Phrase Attachments: Multimodal Machine Learning for Syntactic Analysis Correction

TLDR
This work proposes a correction pipeline for prepositional attachments that uses visual information, trained on a multimodal corpus of images and captions, and shows that visual features make it possible, in certain cases, to correct parser errors.

Predicting the Argumenthood of English Prepositional Phrases

TLDR
The utility of argumenthood prediction for improving sentence representations is demonstrated via performance gains on SRL when a sentence encoder is pretrained on the proposed tasks.

References

Showing 1-10 of 27 references

Ontologically Grounded Multi-sense Representation Learning for Semantic Vector Space Models

TLDR
This paper proposes two novel and general approaches for generating sense-specific word embeddings grounded in an ontology, one of which applies graph smoothing as a post-processing step to tease the vectors of different senses apart and is applicable to any vector space model.
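
In the same spirit, a simplified, retrofitting-style graph-smoothing pass can be sketched as follows: each sense vector is initialized from its word's vector and repeatedly averaged with its ontology neighbors, which pulls different senses of a word apart. This is an illustrative variant, not the paper's exact objective; all data below are toy.

```python
import numpy as np

def smooth_senses(word_vecs, sense_of_word, ontology_neighbors,
                  iters=10, alpha=0.5):
    """Graph-smoothing post-processing (simplified): each sense vector
    stays close to its word's vector while being pulled toward the
    vectors of its ontology neighbors."""
    sense_vecs = {s: word_vecs[w].copy()
                  for w, senses in sense_of_word.items() for s in senses}
    word_of_sense = {s: w for w, senses in sense_of_word.items() for s in senses}
    for _ in range(iters):
        for s, nbrs in ontology_neighbors.items():
            nbr_mean = np.mean([sense_vecs[n] for n in nbrs], axis=0)
            sense_vecs[s] = alpha * word_vecs[word_of_sense[s]] + (1 - alpha) * nbr_mean
    return sense_vecs

# Toy example: "bank" has two senses linked to different neighbors.
rng = np.random.default_rng(2)
word_vecs = {w: rng.normal(size=10) for w in ["bank", "money", "river", "shore"]}
sense_of_word = {"bank": ["bank_1", "bank_2"],
                 "money": ["money_1"], "river": ["river_1"], "shore": ["shore_1"]}
ontology_neighbors = {"bank_1": ["money_1"], "bank_2": ["river_1", "shore_1"],
                      "money_1": ["bank_1"], "river_1": ["bank_2"], "shore_1": ["bank_2"]}
senses = smooth_senses(word_vecs, sense_of_word, ontology_neighbors)
print(np.round(senses["bank_1"] - senses["bank_2"], 2))  # the two senses separate
```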

AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes

TLDR
This work presents AutoExtend, a system to learn embeddings for synsets and lexemes that achieves state-of-the-art performance on word similarity and word sense disambiguation tasks.
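
AutoExtend's actual formulation is an autoencoder over words, lexemes, and synsets; as a loose linear sketch of the underlying constraint (a word vector decomposes into the vectors of the synsets it participates in), one can solve for synset vectors by least squares, as below. All data and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
DIM = 20

words = ["bank", "money", "river"]
synsets = ["bank.n.01", "bank.n.09", "money.n.01", "river.n.01"]
# Which synsets each word belongs to (from a lexicon such as WordNet).
membership = {
    "bank": ["bank.n.01", "bank.n.09"],
    "money": ["money.n.01"],
    "river": ["river.n.01"],
}

W = rng.normal(size=(len(words), DIM))     # pre-trained word vectors
M = np.zeros((len(words), len(synsets)))   # word x synset incidence
for i, w in enumerate(words):
    for s in membership[w]:
        M[i, synsets.index(s)] = 1.0

# Solve M @ S ~= W for synset vectors S in the least-squares sense,
# i.e. each word vector is modeled as the sum of its synsets' vectors.
S, *_ = np.linalg.lstsq(M, W, rcond=None)
for s, v in zip(synsets, S):
    print(s, np.round(np.linalg.norm(v), 3))
```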

Exploring Compositional Architectures and Word Vector Representations for Prepositional Phrase Attachment

TLDR
This paper shows that word vector representations can yield significant PP attachment performance gains via a non-linear architecture that is discriminatively trained to maximize PP attachment accuracy.
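
A minimal sketch of such a non-linear attachment scorer (training omitted, parameters random for illustration): concatenate the embeddings of a candidate head, the preposition, and the preposition's child noun, score them through a tanh hidden layer, and attach to the highest-scoring head.

```python
import numpy as np

rng = np.random.default_rng(4)
DIM = 16

def score(head, prep, child, W, v):
    """Non-linear attachment score for one candidate head."""
    x = np.concatenate([head, prep, child])
    return float(v @ np.tanh(W @ x))

def predict_attachment(candidate_heads, prep, child, W, v):
    scores = [score(h, prep, child, W, v) for h in candidate_heads]
    return int(np.argmax(scores)), scores

# Illustrative embeddings: candidate heads "ate" and "pizza" for
# "ate pizza with a fork" (the PP should attach to "ate").
emb = {w: rng.normal(size=DIM) for w in ["ate", "pizza", "with", "fork"]}
W = rng.normal(size=(32, 3 * DIM)) * 0.1   # hidden layer (would be trained)
v = rng.normal(size=32) * 0.1              # scoring vector (would be trained)
idx, scores = predict_attachment([emb["ate"], emb["pizza"]],
                                 emb["with"], emb["fork"], W, v)
print("predicted head index:", idx, "scores:", np.round(scores, 3))
```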

Improving Word Representations via Global Context and Multiple Word Prototypes

TLDR
A new neural network architecture is presented that learns word embeddings which better capture the semantics of words by incorporating both local and global document context, and accounts for homonymy and polysemy by learning multiple embeddings per word.
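
The multi-prototype recipe can be sketched briefly (toy data, scikit-learn KMeans): represent each occurrence of a word by the average of its context word vectors, cluster the occurrences, and keep one prototype vector per cluster.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
DIM = 10

# Pre-trained single-sense vectors for context words (toy).
ctx_vecs = {w: rng.normal(size=DIM)
            for w in ["money", "loan", "deposit", "river", "water", "shore"]}

# Occurrences of the target word "bank", each given by its context words.
occurrences = [["money", "loan"], ["deposit", "money"], ["loan", "deposit"],
               ["river", "water"], ["shore", "river"], ["water", "shore"]]

# One vector per occurrence: the average of its context word vectors.
X = np.stack([np.mean([ctx_vecs[w] for w in occ], axis=0)
              for occ in occurrences])

# Cluster the occurrences; each cluster centroid acts as one
# sense-specific prototype of "bank".
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("cluster of each occurrence:", km.labels_)
prototypes = km.cluster_centers_   # two prototype vectors for "bank"
```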

Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector Space

TLDR
An extension to the Skip-gram model that efficiently learns multiple embeddings per word type is presented, and its scalability is demonstrated by training with one machine on a corpus of nearly 1 billion tokens in less than 6 hours.
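
A hedged sketch of the sense-assignment step such an extension relies on: keep a context-cluster centroid per sense, assign each new occurrence to the nearest centroid, and update only that sense's vector. The real model does this online inside skip-gram training; the update below is only a placeholder.

```python
import numpy as np

rng = np.random.default_rng(6)
DIM = 10
N_SENSES = 2

# Per-sense vectors and per-sense context-cluster centroids for one word.
sense_vecs = rng.normal(size=(N_SENSES, DIM))
sense_ctx_centroids = rng.normal(size=(N_SENSES, DIM))
counts = np.ones(N_SENSES)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

def observe(context_vec, lr=0.1):
    """Assign the occurrence to its nearest sense and update only
    that sense (the centroid is a running mean of its contexts)."""
    sims = [cosine(c, context_vec) for c in sense_ctx_centroids]
    k = int(np.argmax(sims))
    counts[k] += 1
    sense_ctx_centroids[k] += (context_vec - sense_ctx_centroids[k]) / counts[k]
    # Placeholder update; the real model applies a skip-gram gradient here.
    sense_vecs[k] += lr * (context_vec - sense_vecs[k])
    return k

print(observe(rng.normal(size=DIM)))  # index of the sense that was updated
```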

Improving Lexical Embeddings with Semantic Knowledge

TLDR
This work proposes a new learning objective that incorporates both a neural language model objective (Mikolov et al., 2013) and prior knowledge from semantic resources to learn improved lexical semantic embeddings.
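
A minimal sketch of the prior-knowledge part of such a joint objective (the language-model term is omitted): a penalty that pulls words linked in a semantic resource toward each other, applied as an extra gradient step during training. The word list and pairs are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
DIM = 10

vecs = {w: rng.normal(size=DIM) for w in ["happy", "glad", "joyful", "sad"]}
# Pairs related in a semantic resource (illustrative).
related = [("happy", "glad"), ("happy", "joyful")]

def knowledge_step(vecs, related, lam=0.1):
    """One gradient step on the prior-knowledge term
    lam * sum_{(a,b)} ||v_a - v_b||^2, which would be added to the
    language-model objective during training."""
    grads = {w: np.zeros(DIM) for w in vecs}
    for a, b in related:
        diff = vecs[a] - vecs[b]
        grads[a] += 2 * diff
        grads[b] -= 2 * diff
    for w in vecs:
        vecs[w] -= lam * grads[w]

before = np.linalg.norm(vecs["happy"] - vecs["glad"])
knowledge_step(vecs, related)
after = np.linalg.norm(vecs["happy"] - vecs["glad"])
print(round(before, 3), "->", round(after, 3))  # related words move closer
```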

Linear Algebraic Structure of Word Senses, with Applications to Polysemy

TLDR
It is shown that multiple word senses reside in linear superposition within the word embedding and simple sparse coding can recover vectors that approximately capture the senses.
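
The recovery step can be sketched with a sparse linear fit (toy dictionary, scikit-learn Lasso): given a word vector that is approximately a superposition of a few dictionary atoms, L1-regularized regression selects roughly those atoms, which play the role of sense directions.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(8)
DIM, N_ATOMS = 30, 12

# Dictionary of atoms (in the paper: learned by sparse coding over all
# word vectors); here random unit vectors for illustration.
atoms = rng.normal(size=(N_ATOMS, DIM))
atoms /= np.linalg.norm(atoms, axis=1, keepdims=True)

# A "polysemous" word vector built as a superposition of two atoms
# plus noise, mimicking the linear-superposition claim.
word_vec = 0.7 * atoms[2] + 0.3 * atoms[7] + 0.05 * rng.normal(size=DIM)

# Sparse regression: which few atoms explain the word vector?
lasso = Lasso(alpha=0.005, fit_intercept=False).fit(atoms.T, word_vec)
coefs = lasso.coef_
keep = np.abs(coefs) > 0.05
print("atoms with non-negligible weight:", np.nonzero(keep)[0])
print("coefficients:", np.round(coefs[keep], 2))
```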

GloVe: Global Vectors for Word Representation

TLDR
A new global log-bilinear regression model is introduced that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
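
For reference, a toy numpy sketch of the GloVe objective: a weighted least-squares fit of word and context vectors (plus biases) to log co-occurrence counts, trained here with plain gradient steps on made-up counts rather than AdaGrad on a large corpus.

```python
import numpy as np

rng = np.random.default_rng(9)
V, DIM = 5, 8                  # vocabulary size, embedding size

# Toy symmetric co-occurrence counts X[i, j].
X = rng.integers(1, 50, size=(V, V)).astype(float)
X = (X + X.T) / 2

def f(x, x_max=100.0, alpha=0.75):
    """GloVe weighting function."""
    return np.minimum((x / x_max) ** alpha, 1.0)

W = 0.1 * rng.normal(size=(V, DIM))    # word vectors
Wc = 0.1 * rng.normal(size=(V, DIM))   # context vectors
b = np.zeros(V)
bc = np.zeros(V)
lr = 0.02

for step in range(300):
    loss = 0.0
    for i in range(V):
        for j in range(V):
            err = W[i] @ Wc[j] + b[i] + bc[j] - np.log(X[i, j])
            w = f(X[i, j])
            loss += w * err ** 2
            # Gradient steps on the weighted squared error.
            gi, gj = 2 * w * err * Wc[j], 2 * w * err * W[i]
            W[i] -= lr * gi
            Wc[j] -= lr * gj
            b[i] -= lr * 2 * w * err
            bc[j] -= lr * 2 * w * err
print("final loss:", round(loss, 4))
```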

Word Representations via Gaussian Embedding

TLDR
This paper advocates for density-based distributed embeddings and presents a method for learning representations in the space of Gaussian distributions; it investigates the ability of these embeddings to model entailment and other asymmetric relationships and explores novel properties of the representation.
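
A small sketch of why a density-based representation gives an asymmetric score: with each word as a diagonal Gaussian (a mean and a variance vector), the KL divergence between a specific word's distribution and a broader word's distribution differs by direction, which is the kind of signal usable for entailment. The means and variances below are made up for illustration.

```python
import numpy as np

def kl_diag_gaussians(mu0, var0, mu1, var1):
    """KL( N(mu0, diag(var0)) || N(mu1, diag(var1)) ) for diagonal Gaussians."""
    d = len(mu0)
    return 0.5 * (np.sum(var0 / var1)
                  + np.sum((mu1 - mu0) ** 2 / var1)
                  - d
                  + np.sum(np.log(var1) - np.log(var0)))

# Illustrative embeddings: "animal" is broad (large variance),
# "dog" is specific (small variance) and sits inside "animal".
mu_animal, var_animal = np.zeros(4), np.full(4, 2.0)
mu_dog, var_dog = np.array([0.3, -0.2, 0.1, 0.0]), np.full(4, 0.3)

# Asymmetry: KL(dog || animal) comes out much smaller than
# KL(animal || dog) in this toy setup, reflecting "dog entails animal".
print(round(kl_diag_gaussians(mu_dog, var_dog, mu_animal, var_animal), 3))
print(round(kl_diag_gaussians(mu_animal, var_animal, mu_dog, var_dog), 3))
```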

Improving Parsing and PP Attachment Performance with Sense Information

TLDR
A gold-standard sense- and parse-tree-annotated dataset based on the intersection of the Penn Treebank and SemCor is devised, and semantic classes are shown to yield significant improvements in both parsing and PP attachment tasks.