IKE - An Interactive Tool for Knowledge Extraction

@inproceedings{Dalvi2016IKEA,
  title={IKE - An Interactive Tool for Knowledge Extraction},
  author={Bhavana Dalvi and Sumithra Bhakthavatsalam and Christopher Clark and Peter Clark and Oren Etzioni and Anthony Fader and Dirk Groeneveld},
  booktitle={AKBC@NAACL-HLT},
  year={2016}
}
Recent work on information extraction has suggested that fast, interactive tools can be highly effective; however, creating a usable system is challenging, and few publicly available tools exist. [...] To operationalize this, IKE uses a novel query language that is expressive, easy to understand, and fast to execute, all essential requirements for a practical system.

Scalable Semantic Querying of Text
TLDR
The KOKO system is presented, which is novel in that its extraction language simultaneously supports conditions on the surface of the text and on the structure of the dependency parse tree of sentences, thereby allowing for more refined extractions.
A Lightweight Front-end Tool for Interactive Entity Population
TLDR
A lightweight front-end tool for facilitating interactive entity population is presented; it aims to reduce user cost from beginning to end, including package installation and maintenance, and its entity expansion module is implemented as external APIs.
Discourse in Multimedia: A Case Study in Information Extraction
TLDR
This paper examines how discourse features in multimedia text can be used to improve an information extraction system, and shows that discourse and text layout features provide information that is complementary to the lexical semantic information commonly used for information extraction.
Domain-Targeted, High Precision Knowledge Extraction
TLDR
This work has created a domain-targeted, high-precision knowledge extraction pipeline, leveraging Open IE, crowdsourcing, and a novel canonical schema learning algorithm (called CASI), that produces high-precision knowledge targeted to a particular domain, in this case elementary science.
Active Learning with Adaptive Density Weighted Sampling for Information Extraction from Scientific Papers
TLDR
It is demonstrated that active learning can be a very efficient technique for scientific text mining, and a novel adaptive density-weighted sampling (ADWeS) meta-strategy can be beneficial for corpus annotation with strongly skewed class distribution.
Visual Supervision in Bootstrapped Information Extraction
TLDR
This work has developed an embedding-based bootstrapping model that learns the distributional similarity of entities through the patterns that match them in a large data corpus, while being discriminative with respect to human-labeled and machine-promoted entities.
Discourse in Multimedia: A Case Study in Extracting Geometry Knowledge from Textbooks
TLDR
It is concluded that the discourse and text layout features in multimedia text provide information that is complementary to lexical semantic information and can be used to improve an existing solver for geometry problems, making it more accurate as well as more explainable.
A Relation-Centric View of Semantic Representation Learning
TLDR
Semantic representations learned automatically from data have proven useful for many downstream applications, such as question answering, word-sense discrimination and disambiguation, and selectional preference modeling; the term semantic representation learning is used to encompass all these views and techniques.

References

Propminer: A Workflow for Interactive Information Extraction and Exploration using Dependency Trees
TLDR
This work introduces the proposed five-step workflow for creating information extractors, the graph-query-based rule language, and the core features of the PROPMINER tool.
Extreme Extraction: Only One Hour per Relation
TLDR
A novel system is presented, InstaRead, that streamlines authoring with an ensemble of methods: encoding extraction rules in an expressive and compositional representation, guiding the user to promising rules based on corpus statistics and mined resources, and introducing a new interactive development cycle that provides immediate feedback, even on large datasets.
Web-Scale Information Extraction in KnowItAll (Preliminary Results)
TLDR
KnowItAll, a system that aims to automate the tedious process of extracting large collections of facts from the web in an autonomous, domain-independent, and scalable manner, is introduced.
KnowItNow: Fast, Scalable Information Extraction from the Web
TLDR
A novel architecture for IE that obviates queries to commercial search engines is introduced, embodied in a system called KnowItNow that performs high-precision IE in minutes instead of days, and the tradeoff between recall and speed is quantified.
Knowledge-Based Weak Supervision for Information Extraction of Overlapping Relations
TLDR
A novel approach for multi-instance learning with overlapping relations that combines a sentence-level extraction model with a simple, corpus-level component for aggregating the individual facts is presented.
Learning Syntactic Patterns for Automatic Hypernym Discovery
TLDR
This paper presents a new algorithm for automatically learning hypernym (is-a) relations from text, using "dependency path" features extracted from parse trees and introduces a general-purpose formalization and generalization of these patterns.
WizIE: A Best Practices Guided Development Environment for Information Extraction
TLDR
WizIE provides an integrated wizard-like environment that guides IE developers step-by-step throughout the entire development process, based on best practices synthesized from the experience of expert developers.
Automatic Acquisition of Hyponyms from Large Text Corpora
TLDR
A set of lexico-syntactic patterns that are easily recognizable, that occur frequently and across text genre boundaries, and that indisputably indicate the lexical relation of interest are identified.
Dependency-Based Open Information Extraction
TLDR
A new multilingual OIE system based on robust and fast rule-based dependency parsing that extracts more precise assertions from text than state-of-the-art OIE systems while keeping a crucial property of those systems: scaling to Web-size document collections.
Leveraging Linguistic Structure For Open Domain Information Extraction
TLDR
This work replaces this large pattern set with a few patterns for canonically structured sentences, and shifts the focus to a classifier which learns to extract self-contained clauses from longer sentences to determine the maximally specific arguments for each candidate triple.