IQA: Visual Question Answering in Interactive Environments

  title={IQA: Visual Question Answering in Interactive Environments},
  author={Daniel Gordon and Aniruddha Kembhavi and M. Rastegari and Joseph Redmon and D. Fox and Ali Farhadi},
  journal={2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  • Published 2018
  • Computer Science
  • We introduce Interactive Question Answering (IQA), the task of answering questions that require an autonomous agent to interact with a dynamic visual environment. [...] Key method: We propose the Hierarchical Interactive Memory Network (HIMN), consisting of a factorized set of controllers, allowing the system to operate at multiple levels of temporal abstraction.
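
To make "multiple levels of temporal abstraction" concrete, here is a minimal Python sketch of a factorized controller hierarchy. It is not the HIMN implementation. The class names, the controller names (`navigate`, `scan`, `answer`) and the primitive action strings are all hypothetical, chosen only for illustration. It shows only the control flow the abstract describes: a high-level planner makes a few coarse decisions, and each decision hands control to a sub-controller that runs several primitive steps before returning.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LowLevelController:
    """Hypothetical sub-controller: emits several primitive actions per call."""
    name: str
    primitive_actions: List[str]

    def run(self, trace: List[str]) -> int:
        # Execute each primitive action in order; return how many steps ran.
        for action in self.primitive_actions:
            trace.append(f"{self.name}:{action}")
        return len(self.primitive_actions)


@dataclass
class HighLevelPlanner:
    """Hypothetical planner: one decision here spans many primitive steps."""
    controllers: Dict[str, LowLevelController]
    trace: List[str] = field(default_factory=list)

    def execute(self, plan: List[str]) -> int:
        # Each plan entry is a single high-level decision that delegates
        # control to the named sub-controller until it finishes.
        total_steps = 0
        for name in plan:
            total_steps += self.controllers[name].run(self.trace)
        return total_steps


# Illustrative usage with made-up controllers and actions.
planner = HighLevelPlanner({
    "navigate": LowLevelController("navigate", ["move_ahead", "rotate_left", "move_ahead"]),
    "scan": LowLevelController("scan", ["look_up", "look_down"]),
    "answer": LowLevelController("answer", ["emit_answer"]),
})
plan = ["navigate", "scan", "answer"]
steps = planner.execute(plan)
print(len(plan), "high-level decisions,", steps, "primitive steps")
```

In this toy setup the planner operates at a coarser time scale than the sub-controllers. A learned system would replace the fixed plan and the fixed action lists with policies conditioned on observations and memory.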