A Survey of Current Datasets for Vision and Language Research

@inproceedings{Ferraro2015ASO,
  title={A Survey of Current Datasets for Vision and Language Research},
  author={Francis Ferraro and Nasrin Mostafazadeh and Ting-Hao Huang and Lucy Vanderwende and Jacob Devlin and Michel Galley and Margaret Mitchell},
  booktitle={EMNLP},
  year={2015}
}
Integrating vision and language has long been a dream in work on artificial intelligence (AI). In the past two years, we have witnessed an explosion of work that brings together vision and language from images to videos and beyond. The available corpora have played a crucial role in advancing this area of research. In this paper, we propose a set of quality metrics for evaluating and analyzing the vision & language datasets and categorize them accordingly. Our analyses show that the most recent…

Citations

Publications citing this paper.

Finding "It": Weakly-Supervised Reference-Aware Visual Grounding in Instructional Videos

  • 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
  • 2018

References

Publications referenced by this paper.

Microsoft COCO: Common Objects in Context

Question answering about images using visual semantic embeddings

Mengye Ren, Ryan Kiros, Richard Zemel.
  • Deep Learning Workshop, ICML 2015.
  • 2015

Long-Term Recurrent Convolutional Networks for Visual Recognition and Description

Jeff Donahue, Lisa Anne Hendricks, +4 authors Trevor Darrell
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • 2014

Learning the Visual Interpretation of Sentences

  • 2013 IEEE International Conference on Computer Vision
  • 2013

VQA: Visual Question Answering

  • International Journal of Computer Vision
  • 2015

From captions to visual concepts and back

  • 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2014