Videos as Space-Time Region Graphs

@article{Wang2018VideosAS,
  title={Videos as Space-Time Region Graphs},
  author={X. Wang and A. Gupta},
  journal={ArXiv},
  year={2018},
  volume={abs/1806.01810}
}
  • X. Wang, A. Gupta
  • Published 2018
  • Computer Science
  • ArXiv
  • How do humans recognize the action “opening a book”? [...] These nodes are connected by two types of relations: (i) similarity relations capturing the long-range dependencies between correlated objects and (ii) spatial-temporal relations capturing the interactions between nearby objects. We perform reasoning on this graph representation via Graph Convolutional Networks. We achieve state-of-the-art results on the Charades and Something-Something datasets. Especially for Charades with complex…
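
The abstract's idea of connecting region nodes by similarity relations and then reasoning over them with a graph convolution can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the listing: the dot-product similarity, the row-wise softmax normalization, the single layer of the form ReLU(A X W), the toy sizes (4 nodes, 3 input features, 2 output features), and all function names. It covers only the similarity relation, not the spatial-temporal one.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def similarity_adjacency(features):
    """Build a row-normalized adjacency from pairwise dot-product similarity.

    Assumption: dot product plus softmax is one plausible choice of
    similarity relation; the listing does not specify the exact form.
    """
    transposed = [list(col) for col in zip(*features)]
    scores = matmul(features, transposed)
    return [softmax(row) for row in scores]


def gcn_layer(adjacency, features, weight):
    """Apply one graph convolution: ReLU(A X W)."""
    aggregated = matmul(adjacency, features)   # mix features of related nodes
    projected = matmul(aggregated, weight)     # linear projection (learned in practice)
    return [[max(0.0, v) for v in row] for row in projected]


# Hypothetical toy input: 4 region nodes with 3-dimensional features.
rng = random.Random(0)
nodes = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
weight = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(3)]

adjacency = similarity_adjacency(nodes)
updated = gcn_layer(adjacency, nodes, weight)
```

Each row of `adjacency` sums to 1, so every updated node feature is a weighted average of all node features followed by a projection. In a real model the weight matrix would be learned and the node features would come from region proposals over video frames.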
    284 Citations

    STAGE: Spatio-Temporal Attention on Graph Entities for Video Action Detection
    • 2
    Video Relation Detection with Spatio-Temporal Graph
    • 12
    Graph Convolutional Networks for Temporal Action Localization
    • 87
    Representation Learning on Visual-Symbolic Graphs for Video Understanding
    • 2
    Recurrent Space-time Graphs for Video Understanding
    • 1
    Spatio-Temporal Action Graph Networks
    • 19
    • Highly Influenced
    Dynamic Regions Graph Neural Networks for Spatio-Temporal Reasoning
    • Highly Influenced
    Understanding Dynamic Scenes using Graph Convolution Networks
    • 2
    Hierarchical Visual-Textual Graph for Temporal Activity Localization via Language
    • 3
    Supervoxel Attention Graphs for Long-Range Video Modeling
    • Highly Influenced
