• Publications
  • Influence
Open Information Extraction from the Web
TLDR
Open IE (OIE), a new extraction paradigm where the system makes a single data-driven pass over its corpus and extracts a large set of relational tuples without requiring any human input, is introduced. Expand
Identifying Relations for Open Information Extraction
TLDR
Two simple syntactic and lexical constraints on binary relations expressed by verbs are introduced in the ReVerb Open IE system, which more than doubles the area under the precision-recall curve relative to previous extractors such as TextRunner and woepos. Expand
Open Language Learning for Information Extraction
Open Information Extraction (IE) systems extract relational tuples from text, without requiring a pre-specified vocabulary, by identifying relation phrases and associated arguments in arbitraryExpand
Unsupervised named-entity extraction from the Web: An experimental study
TLDR
An overview of KnowItAll's novel architecture and design principles is presented, emphasizing its distinctive ability to extract information without any hand-labeled training examples, and three distinct ways to address this challenge are presented and evaluated. Expand
Learning Information Extraction Rules for Semi-Structured and Free Text
TLDR
WHISK is designed to handle text styles ranging from highly structured to free text, including text that is neither rigidly formatted nor composed of grammatical sentences, and can also handle extraction from free text such as news stories. Expand
Web-scale information extraction in knowitall: (preliminary results)
TLDR
KnowItAll, a system that aims to automate the tedious process of extracting large collections of facts from the web in an autonomous, domain-independent, and scalable manner, is introduced. Expand
Open Information Extraction: The Second Generation
TLDR
The second generation of Open IE systems are described, which rely on a novel model of how relations and their arguments are expressed in English sentences to double precision/recall compared with previous systems such as TEXTRUNNER and WOE. Expand
CRYSTAL: Inducing a Conceptual Dictionary
TLDR
CRYSTAL is described, a system which automatically induces a dictionary of "concept-node definitions" sufficient to identify relevant information from a training corpus that can often surpass human intuitions in creating reliable extraction rules. Expand
TextRunner: Open Information Extraction on the Web
TLDR
The TextRunner system demonstrates a new kind of information extraction, called Open Information Extraction (OIE), in which the system makes a single, data-driven pass over the entire corpus and extracts a large set of relational tuples, without requiring any human input. Expand
Generating Coherent Event Schemas at Scale
TLDR
This work presents a novel approach to inducing open-domain event schemas that overcomes limitations of Chambers and Jurafsky's (2009) schemas and uses cooccurrence statistics of semantically typed relational triples, which it calls Rel-grams (relational n- grams). Expand
...
1
2
3
4
5
...