VQA: Visual Question Answering

Authors: Stanislaw Antol, Aishwarya Agrawal, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C. Lawrence Zitnick, Devi Parikh
Venue: 2015 IEEE International Conference on Computer Vision (ICCV)

We propose the task of free-form and open-ended Visual Question Answering (VQA). Given an image and a natural language question about the image, the task is to provide an accurate natural language answer. Mirroring real-world scenarios, such as helping the visually impaired, both the questions and answers are open-ended. Visual questions selectively target different areas of an image, including background details and underlying context. As a result, a system that succeeds at VQA typically needs…
