Simple Baseline for Visual Question Answering

@article{Zhou2015SimpleBF,
  title={Simple Baseline for Visual Question Answering},
  author={Bolei Zhou and Yuandong Tian and Sainbayar Sukhbaatar and Arthur Szlam and Rob Fergus},
  journal={CoRR},
  year={2015},
  volume={abs/1512.02167}
}
We describe a very simple bag-of-words baseline for visual question answering. This baseline concatenates the word features from the question and CNN features from the image to predict the answer. When evaluated on the challenging VQA dataset [2], it shows comparable performance to many recent approaches using recurrent neural networks. To explore the strength and weakness of the trained model, we also provide an interactive web demo and open-source code.
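The pipeline the abstract describes (question word features concatenated with image CNN features, then an answer classifier) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the word-feature encoding, the CNN, or the classifier, so the raw word counts, the hand-written `image_features` vector, the toy `vocab` and `answers`, and the untrained random linear softmax layer are all assumptions made for this example.

```python
import math
import random


def bag_of_words(question, vocab):
    """Return a word-count vector over `vocab` (unknown words are ignored)."""
    vec = [0.0] * len(vocab)
    for word in question.lower().replace("?", "").split():
        if word in vocab:
            vec[vocab[word]] += 1.0
    return vec


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(question, image_features, vocab, weights, bias):
    """Concatenate question and image features, then apply a linear softmax layer."""
    x = bag_of_words(question, vocab) + list(image_features)
    scores = [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(scores)


# Hypothetical toy setup: every value below is a placeholder.
vocab = {"what": 0, "color": 1, "is": 2, "the": 3, "cat": 4}
answers = ["black", "white", "yes"]
image_features = [0.2, -0.1, 0.7, 0.05]  # stand-in for CNN features of an image

rng = random.Random(0)
dim = len(vocab) + len(image_features)
# Untrained random weights: one row per candidate answer.
weights = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in answers]
bias = [0.0] * len(answers)

probs = predict("What color is the cat?", image_features, vocab, weights, bias)
best_answer = answers[probs.index(max(probs))]
```

Because the weights here are random, `best_answer` is meaningless; in a trained model the rows of `weights` would be learned from question, image, and answer triples.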
Highly Influential
This paper has highly influenced 13 other papers.
Highly Cited
This paper has 161 citations.
Recent Discussions
This paper has been referenced on Twitter 43 times over the past 90 days.
