Corpus ID: 5975823

Commonsense Knowledge Enhanced Embeddings for Solving Pronoun Disambiguation Problems in Winograd Schema Challenge

Quan Liu, Hui Jiang, Zhenhua Ling, Xiao-Dan Zhu, Si Wei, Yu Hu · arXiv: Artificial Intelligence
In this paper, we propose commonsense knowledge enhanced embeddings (KEE) for solving Pronoun Disambiguation Problems (PDPs). The PDP task we investigate is a complex coreference resolution task that requires commonsense knowledge; it served as the standard first-round test set in the 2016 Winograd Schema Challenge. In this task, traditional linguistic features that are useful for coreference resolution, e.g., context and gender information, are no longer… 
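The core idea, injecting commonsense constraints into pre-trained word embeddings, can be illustrated with a minimal, hypothetical sketch (toy vectors and made-up constraint pairs; the paper's actual objective, knowledge source, and solver are not reproduced here):

```python
import numpy as np

# Toy pre-trained embeddings; in practice these would come from
# word2vec/GloVe trained on a large corpus.
rng = np.random.default_rng(0)
vocab = ["customer", "waiter", "order", "serve"]
E0 = {w: rng.normal(size=8) for w in vocab}   # original vectors
E = {w: v.copy() for w, v in E0.items()}      # vectors to refine

# Hypothetical commonsense constraints: pairs of words the knowledge
# source says are related, so their embeddings are pulled closer.
pairs = [("customer", "order"), ("waiter", "serve")]

lr, lam = 0.1, 0.5  # lam penalizes drifting away from the originals
for _ in range(200):
    for a, b in pairs:
        diff = E[a] - E[b]
        # Gradient steps for 0.5*||E[a]-E[b]||^2
        #                  + 0.5*lam*(||E[a]-E0[a]||^2 + ||E[b]-E0[b]||^2)
        E[a] -= lr * (diff + lam * (E[a] - E0[a]))
        E[b] -= lr * (-diff + lam * (E[b] - E0[b]))

def dist(emb, a, b):
    return float(np.linalg.norm(emb[a] - emb[b]))

# Related pairs end up closer than in the original space.
print(dist(E0, "customer", "order"), "->", dist(E, "customer", "order"))
```

The regularizer toward the original vectors is what keeps the refined space usable for ordinary similarity queries while the knowledge constraints reshape it locally.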


A Brief Survey and Comparative Study of Recent Development of Pronoun Coreference Resolution in English

This survey focuses on recent progress on hard pronoun coreference resolution problems and conducts extensive experiments to show that even though current models are achieving good performance on the standard evaluation set, they are still not ready to be used in real applications.

A Distributed Solution for Winograd Schema Challenge

A new distributed-representation-based approach to capturing commonsense knowledge that achieves the best performance on all of the verb-similarity datasets and superior performance on a subset of the Winograd Schema Challenge dataset, compared with other existing pronoun-coreference and embedding models.

Enhancing Language Models with Plug-and-Play Large-Scale Commonsense

A plug-and-play method for large-scale commonsense integration without further pre-training is proposed, inspired by the observation that when fine-tuning LMs for downstream tasks without external knowledge, the variation in the parameter space is minor.

Enhancing Natural Language Representation with Large-Scale Out-of-Domain Commonsense

OK-Transformer effectively integrates commonsense descriptions into the target text representation; its effectiveness has been verified in multiple applications such as commonsense reasoning, general text classification, and low-resource commonsense settings.

WINOGRANDE: An Adversarial Winograd Schema Challenge at Scale

This work introduces WinoGrande, a large-scale dataset of 44k problems, inspired by the original WSC design, but adjusted to improve both the scale and the hardness of the dataset, and establishes new state-of-the-art results on five related benchmarks.

An Adversarial Winograd Schema Challenge at Scale

While the original WSC dataset provided only 273 instances, WINOGRANDE includes 43,985 instances, half of which are deemed adversarial; it introduces a novel adversarial filtering algorithm, AFLite, for systematic bias reduction, combined with a careful crowdsourcing design.

World Knowledge Representation

This chapter introduces the concept of the knowledge graph, the motivations for and an overview of existing approaches to knowledge graph representation, and discusses several advanced approaches that aim to address the current challenges of knowledge graph representation.

Representation Learning for Natural Language Processing

This chapter presents a brief introduction to representation learning, including its motivation and basic idea, and also reviews its history and recent advances in both machine learning and NLP.

Towards Natural Language Interfaces for Data Visualization: A Survey

This article conducts a comprehensive review of the existing V-NLIs and develops categorical dimensions based on a classic information visualization pipeline with the extension of a V-NLI layer.

Common Sense Reasoning in Autonomous Artificial Intelligent Agents Through Mobile Computing

The outcome of this thesis includes the development of a mobile software application comprising an autonomous artificial intelligent agent that demonstrates an understanding of context, observations, utilization of past experiences, self-improvement through reinforcement learning, and hypothesis development to reach solutions.

References

Learning Semantic Word Embeddings based on Ordinal Knowledge Constraints

Under this framework, semantic knowledge is represented as many ordinal ranking inequalities and the learning of semantic word embeddings (SWE) is formulated as a constrained optimization problem, where the data-derived objective function is optimized subject to all ordinal knowledge inequality constraints extracted from available knowledge resources.
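The ordinal-constraint formulation can be sketched under strong simplifying assumptions: a single hand-written inequality sim(cat, dog) > sim(cat, car), enforced by subgradient steps on a hinge penalty. The actual SWE model optimizes a corpus-derived objective jointly with many such constraints.

```python
import numpy as np

rng = np.random.default_rng(1)
E = {w: rng.normal(size=5) for w in ["cat", "dog", "car"]}

def sim(a, b):
    return float(E[a] @ E[b])

# One hypothetical ordinal constraint: sim(cat, dog) > sim(cat, car),
# enforced as a hinge penalty max(0, margin - (s_pos - s_neg)).
margin, lr = 1.0, 0.05
for _ in range(300):
    if margin - (sim("cat", "dog") - sim("cat", "car")) > 0:  # violated
        # Subgradient steps on the hinge term for each vector involved.
        E["cat"] += lr * (E["dog"] - E["car"])
        E["dog"] += lr * E["cat"]
        E["car"] -= lr * E["cat"]

# The inequality now holds with the required margin.
print(sim("cat", "dog") - sim("cat", "car"))
```

Each violated step strictly increases the score gap, so the updates stop once the margin is met; with many constraints, SGD would interleave such steps with the data objective.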

Knowledge-Powered Deep Learning for Word Embedding

This study explores the capacity of leveraging morphological, syntactic, and semantic knowledge to achieve high-quality word embeddings, using these types of knowledge to define a new basis for word representation, provide additional input information, and serve as auxiliary supervision in deep learning.

Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks

This work argues for the usefulness of a set of proxy tasks that evaluate reading comprehension via question answering, and classify these tasks into skill sets so that researchers can identify (and then rectify) the failings of their systems.

RC-NET: A General Framework for Incorporating Knowledge into Word Representations

This paper builds the relational knowledge and the categorical knowledge into two separate regularization functions, and combines both of them with the original objective function of the skip-gram model to obtain word representations enhanced by the knowledge graph.
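A rough sketch of how such regularizers might be composed with a skip-gram loss (toy vectors; the relation is modeled TransE-style as a translation and the category as closeness to a centroid, illustrative choices rather than RC-NET's exact functions):

```python
import numpy as np

rng = np.random.default_rng(2)
E = {w: rng.normal(size=5) for w in ["paris", "france", "rome", "italy"]}
R = {"capital_of": rng.normal(size=5)}  # relation as a translation vector

# Relational knowledge regularizer over (head, relation, tail) triples:
# penalize ||E[h] + R[r] - E[t]||^2 (a TransE-style choice).
def rel_reg(triples):
    return sum(float(np.sum((E[h] + R[r] - E[t]) ** 2)) for h, r, t in triples)

# Categorical knowledge regularizer: words sharing a category should
# stay close to the category centroid.
def cat_reg(groups):
    total = 0.0
    for words in groups:
        centroid = np.mean([E[w] for w in words], axis=0)
        total += sum(float(np.sum((E[w] - centroid) ** 2)) for w in words)
    return total

triples = [("paris", "capital_of", "france"), ("rome", "capital_of", "italy")]
groups = [["paris", "rome"], ["france", "italy"]]

# Combined objective: the usual skip-gram loss plus weighted regularizers.
skipgram_loss = 0.0  # placeholder for the negative-sampling corpus loss
alpha, beta = 0.1, 0.1
total_loss = skipgram_loss + alpha * rel_reg(triples) + beta * cat_reg(groups)
print(total_loss)
```

Keeping the knowledge terms as additive regularizers lets the corpus objective dominate while the graph nudges related words into consistent geometric patterns.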

ConceptNet — A Practical Commonsense Reasoning Tool-Kit

ConceptNet is a freely available commonsense knowledge base and natural-language-processing tool-kit which supports many practical textual-reasoning tasks over real-world documents including…

The Distributional Inclusion Hypotheses and Lexical Entailment

This paper suggests refinements for the Distributional Similarity Hypothesis by developing an inclusion testing algorithm for characteristic features of two words, which incorporates corpus and web-based feature sampling to overcome data sparseness.
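The inclusion idea itself is easy to illustrate with toy feature sets (hypothetical features; the paper's algorithm additionally handles feature characterization and data sparseness via corpus and web-based sampling):

```python
# Toy characteristic context features for each word (hypothetical).
feats = {
    "poodle": {"bark", "leash", "fur", "groom"},
    "dog":    {"bark", "leash", "fur", "groom", "fetch", "pack"},
}

def inclusion_ratio(u, v):
    """Fraction of u's features that also occur among v's features."""
    return len(feats[u] & feats[v]) / len(feats[u])

# Full inclusion is consistent with "poodle" entailing "dog";
# the reverse direction is only partial.
print(inclusion_ratio("poodle", "dog"), inclusion_ratio("dog", "poodle"))
```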

Building a Semantic Parser Overnight

A new methodology is introduced that uses a simple grammar to generate logical forms paired with canonical utterances that are meant to cover the desired set of compositional operators, and uses crowdsourcing to paraphrase these canonical utterances into natural utterances.

Improving Lexical Embeddings with Semantic Knowledge

This work proposes a new learning objective that incorporates both a neural language model objective (Mikolov et al., 2013) and prior knowledge from semantic resources to learn improved lexical semantic embeddings.

Probabilistic Reasoning via Deep Learning: Neural Association Models

Experimental results on several popular datasets derived from WordNet, FreeBase and ConceptNet have demonstrated that both DNNs and RMNNs perform equally well and they can significantly outperform the conventional methods available for these reasoning tasks.

The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations

There is a sweet-spot, not too big and not too small, between single words and full sentences that allows the most meaningful information in a text to be effectively retained and recalled, and models which store explicit representations of long-term contexts outperform state-of-the-art neural language models at predicting semantic content words.