Graph R-CNN for Scene Graph Generation
@article{Yang2018GraphRF, title={Graph R-CNN for Scene Graph Generation}, author={Jianwei Yang and Jiasen Lu and Stefan Lee and Dhruv Batra and Devi Parikh}, journal={ArXiv}, year={2018}, volume={abs/1808.00191} }
We propose a novel scene graph generation model called Graph R-CNN that is both effective and efficient at detecting objects and their relations in images. […] We also propose an attentional Graph Convolutional Network (aGCN) that effectively captures contextual information between objects and relations. Finally, we introduce a new evaluation metric that is more holistic and realistic than existing metrics. We report state-of-the-art performance on scene graph generation as evaluated using both…
571 Citations
Fully Convolutional Scene Graph Generation
- Computer Science
- 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2021
A fully convolutional scene graph generation (FCSGG) model is presented that detects objects and relations simultaneously and achieves highly competitive recall and zero-shot recall with significantly reduced inference time.
Relation Regularized Scene Graph Generation
- Computer Science
- IEEE Transactions on Cybernetics
- 2022
A relation regularized network (R2-Net) is proposed that predicts whether a relationship exists between two objects and encodes this relation into object feature refinement for better scene graph generation (SGG).
Scene Graph Generation With External Knowledge and Image Reconstruction
- Computer Science
- 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2019
This paper proposes a novel scene graph generation algorithm with external knowledge and an image reconstruction loss to overcome dataset issues, and extracts commonsense knowledge from an external knowledge base to refine object and phrase features, improving generalizability in scene graph generation.
Scene Graph Generation Using Depth, Spatial, and Visual Cues in 2D Images
- Computer Science
- IEEE Access
- 2022
A framework (S2G) is proposed for generating scene graphs directly from images using depth and spatial information of object pairs; evaluations on scene graph generation reveal that the proposed framework achieves better results than the state of the art.
DH-GCN: Saliency-Aware Complex Scene Graph Generation Using Dual-Hierarchy Graph Convolutional Network
- Computer Science
- 2022 IEEE International Conference on Multimedia and Expo (ICME)
- 2022
An innovative dual-hierarchy graph convolutional network (DH-GCN) is proposed, a conceptually elegant and efficient top-down approach to graph generation that leverages a salient object detector to hierarchize objects and give gist nodes a more accurate representation.
Transformer-based Scene Graph Generation Network With Relational Attention Module
- Computer Science
- 2022 26th International Conference on Pattern Recognition (ICPR)
- 2022
A novel transformer-based network and a training scheme with instance-level pseudo-targets are proposed, and a relational attention module is introduced to overcome the cropped feature problem; the method achieves state-of-the-art or competitive performance in all tasks.
Memory-Based Network for Scene Graph with Unbalanced Relations
- Computer Science
- ACM Multimedia
- 2020
This work proposes a novel scene graph generation model that effectively improves the detection of low-frequency relations by using memory features to transfer high-frequency relation features to low-frequency relations.
Attentive Gated Graph Neural Network for Image Scene Graph Generation
- Computer Science
- Symmetry
- 2020
This work translates the scene graph into an Attentive Gated Graph Neural Network that propagates messages via visual relationship embeddings, which can increase the accuracy of object classification and reduce the complexity of relationship classification.
Attentive Relational Networks for Mapping Images to Scene Graphs
- Computer Science
- 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2019
A novel Attentive Relational Network, consisting of two key modules with an object detection backbone, is proposed to approach this problem; its relation inference module produces accurate scene graphs by recognizing all entities and their corresponding relations.
Image Scene Graph Generation (SGG) Benchmark
- Computer Science
- ArXiv
- 2021
A much-needed scene graph generation benchmark, based on maskrcnn-benchmark and several popular models, and a comprehensive ablation study of scene graph generation models using the Visual Genome and OpenImages Visual Relationship Detection datasets are presented.
References
Pixels to Graphs by Associative Embedding
- Computer Science
- NIPS
- 2017
A method is presented for training a convolutional neural network that takes in an input image and produces a full graph definition; this is done end-to-end in a single stage with the use of associative embeddings.
Scene Graph Generation from Objects, Phrases and Region Captions
- Computer Science
- 2017 IEEE International Conference on Computer Vision (ICCV)
- 2017
This work proposes a novel neural network model, termed the Multi-level Scene Description Network (MSDN), to solve the three vision tasks jointly in an end-to-end manner, and shows that joint learning across the three tasks with the proposed method can bring mutual improvements over previous models.
Scene Graph Generation by Iterative Message Passing
- Computer Science
- 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2017
This work explicitly models the objects and their relationships using scene graphs, a visually grounded graphical structure of an image, and proposes a novel end-to-end model that generates such a structured scene representation from an input image.
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
- Computer Science
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2015
This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.
Neural Motifs: Scene Graph Parsing with Global Context
- Computer Science
- 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
- 2018
This work analyzes the role of motifs, regularly appearing substructures in scene graphs, and introduces Stacked Motif Networks, a new architecture designed to capture higher-order motifs in scene graphs that improves on the previous state of the art by an average relative improvement of 3.6% across evaluation settings.
Relationship Proposal Networks
- Computer Science
- 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2017
The model is named the Relationship Proposal Network (Rel-PN), which is class-agnostic and thus scalable to an open vocabulary of objects and demonstrates the ability of the model to localize relationships with only a few thousand proposals.
Graph-Structured Representations for Visual Question Answering
- Computer Science
- 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2017
This paper proposes to build graphs over the scene objects and over the question words, and describes a deep neural network that exploits the structure in these representations, and achieves significant improvements over the state-of-the-art.
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation
- Computer Science
- 2014 IEEE Conference on Computer Vision and Pattern Recognition
- 2014
This paper proposes a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%.
Image retrieval using scene graphs
- Computer Science
- 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2015
A conditional random field model that reasons about possible groundings of scene graphs to test images is presented; the full model is shown to improve object localization compared to baseline methods and to outperform retrieval methods that use only objects or low-level image features.
ViP-CNN: A Visual Phrase Reasoning Convolutional Neural Network for Visual Relationship Detection
- Computer Science
- ArXiv
- 2017
In ViP-CNN, the visual relationship is considered as a phrase with three components and a Visual Phrase Reasoning Structure (VPRS) is presented to set up the connection among the relationship components and help the model consider the three problems jointly.