Corpus ID: 8779651

Modelling Entity Instantiations

Andrew James McKinlay and Katja Markert
The problem of automatically extracting structured information from texts is an important, unsolved problem within the field of Natural Language Processing. The extraction of such information can facilitate activities such as the building of knowledge bases, automatic summarisation and sentiment analysis. A human reader can easily discern the events described in a text, along with the participants and the relationships between them, but using a computer to automatically discover the same… 
Recognising Sets and Their Elements: Tree Kernels for Entity Instantiation Identification
This work presents the first reliably annotated intrasentential entity instantiation corpus, along with an extension to the intersentential annotations in McKinlay and Markert (2011), and applies tree kernels to entity instantiations.
The Instantiation Discourse Relation: A Corpus Analysis of Its Properties and Improved Detection
It is shown that sentences involved in INSTANTIATION are set apart from other sentences by the use of gradable (subjective) adjectives, the occurrence of rare words and by different patterns in part-of-speech usage.
UCCA: A Semantics-based Grammatical Annotation Scheme
UCCA (Universal Conceptual Cognitive Annotation), a simple semantic annotation scheme, is proposed; it covers many of the most important elements and relations present in linguistic utterances, including verb-argument structure, optional adjuncts such as adverbials, clause embeddings, and the linkage between them.
Phrase Generalization: a Corpus Study in Multi-Document Abstracts and Original News Alignments
This paper presents a corpus study on a more subtle and understudied phenomenon: noun phrase generalization, arriving at a five category classification scheme and finding that the most common category requires semantic interpretation and inference.
Improving the Annotation of Sentence Specificity
It is found that missing details that are not resolved in the prior context are more likely to trigger questions about the reason behind events, “why” and “how”.
From Discourse Structure to Text Specificity: Studies of Coherence Preferences
This thesis explores the automatic prediction of text specificity, and whether the perception of specificity varies across different audiences, and proposes a semi-supervised system to predict sentence specificity with speed and accuracy.
Fast and Accurate Prediction of Sentence Specificity
A practical system for predicting sentence specificity which exploits only features that require minimum processing and is trained in a semi-supervised manner, and shows that specificity is a useful indicator for finding sentences that need to be simplified and a useful objective for simplification.
How Do We Answer Complex Questions: Discourse Structure of Long-form Answers
An ontology of six sentence-level functional roles for long-form answers is developed, finding that annotators agree less with each other when annotating model-generated answers compared to annotating human-written answers.
Inquisitive Question Generation for High Level Text Comprehension
This work introduces INQUISITIVE, a dataset of ~19K questions that are elicited while a person is reading through a document, and shows that readers engage in a series of pragmatic strategies to seek information.


Manually vs. Automatically Labelled Data in Discourse Relation Classification: Effects of Example and Feature Selection
How the Web Ontology Language OWL can be used to represent and interrelate the entities and relations in both types of resources is discussed and three OWL models are presented, each of which offers different solutions to this question.
Extracting Semantic Networks from Text Via Relational Clustering
This paper uses the TextRunner system to extract tuples from text, and then induce general concepts and relations from them by jointly clustering the objects and relational strings in the tuples using Markov logic.
Discovering Relations among Named Entities from Large Corpora
Using one year of newspapers reveals not only that the relations among named entities could be detected with high recall and precision, but also that appropriate labels could be automatically provided for the relations.
Constraints Based Taxonomic Relation Classification
A system that, given two terms, determines the taxonomic relation between them using a machine learning-based approach that makes use of existing resources and significantly outperforms other systems built upon existing well-known knowledge sources.
Unsupervised named-entity extraction from the Web: An experimental study
Probabilistic Reasoning for Entity & Relation Recognition
This paper develops a method for recognizing relations and entities in sentences while taking mutual dependencies among them into account. Preliminary experimental results are promising and show that the global inference approach improves over learning relations and entities separately.
Structured Relation Discovery using Generative Models
A series of generative probabilistic models are proposed, broadly similar to topic models, each of which generates a corpus of observed triples of entity mention pairs and the surface syntactic dependency path between them.
Distant supervision for relation extraction without labeled data
This work investigates an alternative paradigm that does not require labeled corpora, avoiding the domain dependence of ACE-style algorithms, and allowing the use of corpora of any size.
Information Extraction: Beyond Document Retrieval
In this paper we give a synoptic view of the growth of the text processing technology of information extraction (IE) whose function is to extract information about a pre‐specified set of entities,
Exploiting Background Knowledge for Relation Extraction
This paper proposes methods for using knowledge and resources that are external to the target sentence, as a way to improve relation extraction by exploiting background knowledge such as relationships among the target relations, as well as by considering how target relations relate to some existing knowledge resources.