IQA: Visual Question Answering in Interactive Environments

@inproceedings{Gordon2017IQAVQ,
  title={IQA: Visual Question Answering in Interactive Environments},
  author={Daniel Gordon and Aniruddha Kembhavi and Mohammad Rastegari and Joseph Redmon and Dieter Fox and Ali Farhadi},
  booktitle={2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2017},
  pages={4089--4098}
}
We introduce Interactive Question Answering (IQA), the task of answering questions that require an autonomous agent to interact with a dynamic visual environment. IQA presents the agent with a scene and a question, like: "Are there any apples in the fridge?" The agent must navigate around the scene, acquire visual understanding of scene elements, interact with objects (e.g., open refrigerators) and plan for a series of actions conditioned on the question. Popular reinforcement learning…

Citations

Publications citing this paper.

Embodied Visual Recognition

  • ArXiv
  • 2019

Grounding semantics in robots for Visual Question Answering

Björn Wahle
  • 2019

CITATION STATISTICS

  • 4 Highly Influenced Citations

  • Averaged 20 Citations per year from 2017 through 2019

  • 15% Increase in citations per year in 2019 over 2018
