Corpus ID: 208527679

Learning Predicates as Functions to Enable Few-shot Scene Graph Prediction.

Apoorva Dornadula, Austin Narcomey, Ranjay Krishna, Michael S. Bernstein, Li Fei-Fei
arXiv: Computer Vision and Pattern Recognition
Scene graph prediction, which classifies the set of objects and predicates in a visual scene, requires substantial training data. However, most predicates occur only a handful of times, making them difficult to learn. We introduce the first scene graph prediction model that supports few-shot learning of predicates. Existing scene graph generation models represent objects using pretrained object detectors or word embeddings that capture semantic object information at the cost of encoding…
Learning from the Scene and Borrowing from the Rich: Tackling the Long Tail in Scene Graph Generation
This paper tackles scene-object interaction, learning scene-specific knowledge via an additive attention mechanism, and proposes long-tail knowledge transfer, which transfers the rich knowledge learned from the head into the tail.
Scene graph (SG) generation has been gaining a lot of traction recently. Current SG generation techniques, however, rely on labeled datasets that are expensive and limited in number.
Sim2SG: Sim-to-Real Scene Graph Generation for Transfer Learning
This work proposes Sim2SG, a scalable technique for sim-to-real transfer for scene graph generation that addresses the domain gap by decomposing it into appearance, label and prediction discrepancies between the two domains.


Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation
A subgraph-based connection graph is proposed to concisely represent the scene graph during inference, improving the efficiency of scene graph generation; the method outperforms the state of the art in both accuracy and speed.
GloVe: Global Vectors for Word Representation
A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
SPICE: Semantic Propositional Image Caption Evaluation
There is considerable interest in the task of automatically generating image captions. However, evaluation is challenging. Existing automatic evaluation metrics are primarily sensitive to n-gram…
Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples
A semi-supervised framework that incorporates labeled and unlabeled data in a general-purpose learner is proposed, and properties of reproducing kernel Hilbert spaces are used to prove new Representer theorems that provide a theoretical basis for the algorithms.
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
The Visual Genome dataset is presented, which contains over 108K images where each image has an average of 35 objects, 26 attributes, and 21 pairwise relationships between objects, and represents the densest and largest dataset of image descriptions, objects, attributes, relationships, and question answer pairs.
Referring Relationships
An iterative model is introduced that localizes the two entities in the referring relationship by modelling predicates that connect the entities as shifts in attention from one entity to another, and it is demonstrated that this model can not only outperform existing approaches on three datasets but also that it produces visually meaningful predicate shifts, as an instance of interpretable neural networks.
Scene Graph Generation by Iterative Message Passing
This work explicitly models the objects and their relationships using scene graphs, a visually-grounded graphical structure of an image, and proposes a novel end-to-end model that generates such a structured scene representation from an input image.
Scene Graph Generation With External Knowledge and Image Reconstruction
This paper proposes a novel scene graph generation algorithm with external knowledge and image reconstruction loss to overcome dataset issues, and extracts commonsense knowledge from the external knowledge base to refine object and phrase features for improving generalizability in scene graph generation.
Scene Graph Prediction With Limited Labels
This paper introduces a semi-supervised method that assigns probabilistic relationship labels to a large number of unlabeled images using few labeled examples and defines a complexity metric for relationships that serves as an indicator for conditions under which the method succeeds over transfer learning, the de facto approach for training with limited labels.
Few-Shot Learning with Graph Neural Networks
A graph neural network architecture is defined that generalizes several of the recently proposed few-shot learning models and provides improved numerical performance, and is easily extended to variants of few-shot learning, such as semi-supervised or active learning, demonstrating the ability of graph-based models to operate well on 'relational' tasks.