VisKE: Visual knowledge extraction and question answering by visual verification of relation phrases

@article{Sadeghi2015VisKEVK,
  title={VisKE: Visual knowledge extraction and question answering by visual verification of relation phrases},
  author={Fereshteh Sadeghi and S. Divvala and Ali Farhadi},
  journal={2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2015},
  pages={1456--1464}
}
How can we know whether a statement about our world is valid? For example, given a relationship between a pair of entities, e.g., `eat(horse, hay)', how can we know whether this relationship is true or false in general? Gathering such knowledge about entities and their relationships is one of the fundamental challenges in knowledge extraction. Most previous works on knowledge extraction have focused purely on text-driven reasoning for verifying relation phrases. In this work, we introduce the…
Ask Me Anything: Free-Form Visual Question Answering Based on Knowledge from External Sources
A method for visual question answering which combines an internal representation of the content of an image with information extracted from a general knowledge base to answer a broad range of image-based questions, and is specifically able to answer questions posed in natural language that refer to information not contained in the image.
Verb Physics: Relative Physical Knowledge of Actions and Objects
An approach to infer relative physical knowledge of actions and objects along five dimensions (e.g., size, weight, and strength) from unstructured natural language text is presented.
Visual Relation Extraction via Multi-modal Translation Embedding Based Model
This paper proposes a novel visual relation extraction model, the Multi-modal Translation Embedding Based Model, to integrate visual information with a corresponding textual knowledge base, and proposes a visual phrase learning method to capture the interactions between objects in the image to enhance the performance of visual relation extraction.
Visual entity linking
This paper proposes a novel approach for linking image regions to entities in DBpedia and Freebase using an automatic image description generation algorithm, and presents an extensive analysis to identify the sources of errors in the system.
KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA
This work studies open-domain knowledge, the setting in which the knowledge required to answer a question is not given or annotated at either training or test time, and demonstrates that while the model successfully exploits implicit knowledge reasoning, the symbolic answer module, which explicitly connects the knowledge graph to the answer vocabulary, is critical to the performance of the method.
Phrase Localization and Visual Relationship Detection with Comprehensive Linguistic Cues
This paper presents a framework for localization or grounding of phrases in images using a large collection of linguistic and visual cues. We model the appearance, size, and position of entity…
Detecting Unseen Visual Relations Using Analogies
This work learns a representation of visual relations that combines individual embeddings for subject, object, and predicate together with a visual phrase embedding that represents the relation triplet, and demonstrates the benefits of this approach on three challenging datasets.
Visual Translation Embedding Network for Visual Relation Detection
This work proposes a novel feature extraction layer that enables object-relation knowledge transfer in a fully convolutional fashion, supports training and inference in a single forward/backward pass, and presents the first end-to-end relation detection network.
Weakly-Supervised Learning of Visual Relations
A novel approach for modeling visual relations between pairs of objects, where the predicate is typically a preposition or a verb that links a pair of objects, together with a weakly-supervised discriminative clustering model to learn relations from image-level labels only.
Embedding Network for Visual Relation Detection
Visual relations, such as “person ride bike” and “bike next to car”, offer a comprehensive scene understanding of an image, and have already shown their great utility in connecting computer vision…

References

Showing 1-10 of 42 references
Acquiring temporal constraints between relations
The proposed algorithm, GraphOrder, is a novel and scalable graph-based label propagation algorithm that takes into account the transitivity of temporal order as well as statistics on the narrative order of verb mentions, and achieves up to 38.4% absolute improvement in F1 over a random baseline.
Open question answering over curated and extracted knowledge bases
This paper presents OQA, the first approach to leverage both curated and extracted KBs, and demonstrates that it achieves up to twice the precision and recall of a state-of-the-art Open QA system.
Adopting Abstract Images for Semantic Scene Understanding
This paper proposes studying semantic information in abstract images created from collections of clip art, and creates 1,002 sets of 10 semantically similar abstract images with corresponding written descriptions to discover semantically important features, the relations of words to visual features, and methods for measuring semantic similarity.
Semantic Parsing on Freebase from Question-Answer Pairs
This paper trains a semantic parser that scales up to Freebase and outperforms the state-of-the-art parser on the dataset of Cai and Yates (2013), despite not having annotated logical forms.
Learning Everything about Anything: Webly-Supervised Visual Concept Learning
A fully-automated approach for learning extensive models for a wide range of variations within any concept, which leverages vast resources of online books to discover the vocabulary of variance, and intertwines the data collection and modeling steps to alleviate the need for explicit human supervision in training the models.
Learning Knowledge Graphs for Question Answering through Conversational Dialog
This work is the first to acquire knowledge for question answering from open, natural-language dialogs without a fixed ontology or domain model that predetermines what users can say.
Identifying Relations for Open Information Extraction
Two simple syntactic and lexical constraints on binary relations expressed by verbs are introduced in the ReVerb Open IE system, which more than doubles the area under the precision-recall curve relative to previous extractors such as TextRunner and WOEpos.
Recognition using visual phrases
It is shown that a visual phrase detector significantly outperforms a baseline which detects component objects and reasons about relations, even though visual phrase training sets tend to be smaller than those for objects.
NEIL: Extracting Visual Knowledge from Web Data
NEIL (Never Ending Image Learner), a computer program that runs 24 hours a day, 7 days a week to automatically extract visual knowledge from Internet data, is proposed in an attempt to develop the world's largest visual structured knowledge base with minimal human labeling effort.
A study of the knowledge base requirements for passing an elementary science test
The analysis suggests that, as well as fact extraction from text and statistically driven rule extraction, three other styles of automatic knowledge base construction (AKBC) would be useful: acquiring definitional knowledge, direct 'reading' of rules from texts that state them, and, given a particular representational framework, acquisition of specific instances of those models from text.