Deeply Supervised Multimodal Attentional Translation Embeddings for Visual Relationship Detection

@article{Gkanatsios2019DeeplySM,
  title={Deeply Supervised Multimodal Attentional Translation Embeddings for Visual Relationship Detection},
  author={Nikolaos Gkanatsios and Vassilis Pitsikalis and Petros Koutras and Athanasia Zlatintsi and Petros Maragos},
  journal={2019 IEEE International Conference on Image Processing (ICIP)},
  year={2019},
  pages={1840-1844}
}
  • Nikolaos Gkanatsios, Vassilis Pitsikalis, Petros Koutras, Athanasia Zlatintsi, Petros Maragos
  • Published in the 2019 IEEE International Conference on Image Processing (ICIP)
  • Computer Science
  • Detecting visual relationships, i.e. <subject, predicate, object> triplets, has been a challenging Scene Understanding task, approached in the past via linguistic priors or spatial information in a single feature branch. We introduce a new deeply supervised two-branch architecture, the Multimodal Attentional Translation Embeddings, where the visual features of each branch are driven by a multimodal attentional mechanism that exploits spatio-linguistic similarities in a low-dimensional space. We present a variety of…
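
To make the core idea concrete, the sketch below shows a minimal PyTorch-style head in the spirit of translation embeddings (subject + predicate ≈ object in a learned space), where an attention gate computed from spatial and linguistic cues modulates the visual features before projection. All module names, dimensions, and fusion choices are illustrative assumptions, not the paper's exact architecture; the second branch and the deep supervision losses are omitted.

# Illustrative sketch only: a translation-embedding predicate head with a
# multimodal (spatio-linguistic) attention gate. Dimensions and the fusion
# scheme are assumptions, not the paper's exact architecture; the second
# branch and deep supervision are omitted.
import torch
import torch.nn as nn


class AttentionalTranslationHead(nn.Module):
    def __init__(self, vis_dim=1024, spa_dim=64, lang_dim=300,
                 emb_dim=128, num_predicates=70):
        super().__init__()
        # Project subject/object visual features into a low-dimensional space.
        self.subj_proj = nn.Linear(vis_dim, emb_dim)
        self.obj_proj = nn.Linear(vis_dim, emb_dim)
        # Channel-wise attention over visual features, driven by the
        # concatenated spatial and linguistic context.
        self.attention = nn.Sequential(
            nn.Linear(spa_dim + lang_dim, vis_dim),
            nn.Sigmoid(),
        )
        # Classify the translation vector (object - subject) into predicates.
        self.predicate_cls = nn.Linear(emb_dim, num_predicates)

    def forward(self, subj_vis, obj_vis, spatial, lang):
        # Gate the visual features with the multimodal attention weights.
        att = self.attention(torch.cat([spatial, lang], dim=-1))
        subj = self.subj_proj(subj_vis * att)
        obj = self.obj_proj(obj_vis * att)
        # Translation embedding: subject + predicate ~ object, so the
        # predicate is read off the difference vector.
        return self.predicate_cls(obj - subj)


if __name__ == "__main__":
    head = AttentionalTranslationHead()
    logits = head(torch.randn(4, 1024), torch.randn(4, 1024),
                  torch.randn(4, 64), torch.randn(4, 300))
    print(logits.shape)  # torch.Size([4, 70])

The design point the sketch illustrates is that the same spatio-linguistic gate is applied to both entities before projection, so the difference vector used for predicate classification lives in a space already shaped by the multimodal context.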

    Citations

    Publications citing this paper.

    Attention-Translation-Relation Network for Scalable Scene Graph Generation
