Publications
Convolutional 2D Knowledge Graph Embeddings
ConvE, a multi-layer convolutional network model for link prediction, is introduced, and it is found that ConvE achieves state-of-the-art Mean Reciprocal Rank across most datasets.
brat: a Web-based Tool for NLP-Assisted Text Annotation
The brat rapid annotation tool (BRAT) is introduced, an intuitive web-based tool for text annotation supported by Natural Language Processing (NLP) technology, along with an evaluation of annotation assisted by semantic class disambiguation on a multicategory entity mention annotation task, showing a 15% decrease in total annotation time.
Constructing Datasets for Multi-hop Reading Comprehension Across Documents
A novel task is proposed to encourage the development of models for text understanding across multiple documents and to investigate the limits of existing methods; in it, a model learns to seek and combine evidence, effectively performing multi-hop (alias multi-step) inference.
Neural Architectures for Fine-grained Entity Type Classification
This work investigates several neural network architectures for fine-grained entity type classification and establishes that the attention mechanism learns to attend over syntactic heads and the phrase containing the mention, both of which are known to be strong hand-crafted features for this task.
An Attentive Neural Architecture for Fine-grained Entity Type Classification
This work proposes a novel attention-based neural network model that, unlike previously proposed models, recursively composes representations of entity mention contexts and can learn contextual linguistic expressions that indicate the fine-grained category memberships of an entity.
UCL Machine Reading Group: Four Factor Framework For Fact Finding (HexaF)
This system is a four-stage model consisting of document retrieval, sentence retrieval, natural language inference, and aggregation; it achieved a FEVER score of 62.52% on the provisional test set (without additional human evaluation) and 65.41% on the development set.
Question and Answer Test-Train Overlap in Open-Domain Question Answering Datasets
A detailed study of the test sets of three popular open-domain benchmark datasets finds that 30% of test-set questions have a near-duplicate paraphrase in their corresponding train sets, and that simple nearest-neighbor models outperform a BART closed-book QA model.
Wronging a Right: Generating Better Errors to Improve Grammatical Error Detection
This work investigates cheaply constructing synthetic samples, given a small corpus of human-annotated data, using an off-the-shelf attentive sequence-to-sequence model and a straightforward post-processing procedure; this yields error-filled artificial data that helps a vanilla bi-directional LSTM to outperform the previous state of the art at grammatical error detection.
Jointly Learning Word Representations and Composition Functions Using Predicate-Argument Structures
A novel compositional language model that works on Predicate-Argument Structures (PASs) is proposed, which enables the model to capture long-range dependencies between words and to better handle constructs such as verb-object and subject-verb-object relations.
R4C: A Benchmark for Evaluating RC Systems to Get the Right Answer for the Right Reason
This work creates and publicly releases the R4C dataset, the first quality-assured dataset consisting of 4.6k questions, each of which is annotated with 3 reference derivations, and presents a reliable, crowdsourced framework for scalably annotating RC datasets with derivations.