Visual Dialog

@article{Das2017VisualD,
  title={Visual Dialog},
  author={A. Das and S. Kottur and K. Gupta and Avi Singh and Deshraj Yadav and Jos{\'e} M. F. Moura and D. Parikh and Dhruv Batra},
  journal={2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2017},
  pages={1080--1089}
}
  • A. Das, S. Kottur, K. Gupta, Avi Singh, Deshraj Yadav, José M. F. Moura, D. Parikh, Dhruv Batra
  • Published 2017
  • Computer Science
  • 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • We introduce the task of Visual Dialog, which requires an AI agent to hold a meaningful dialog with humans in natural, conversational language about visual content. Specifically, given an image, a dialog history, and a question about the image, the agent has to ground the question in image, infer context from history, and answer the question accurately. Visual Dialog is disentangled enough from a specific downstream task so as to serve as a general test of machine intelligence, while being…
    310 Citations
    • Dual Attention Networks for Visual Reference Resolution in Visual Dialog (25 citations)
    • Multi-View Attention Networks for Visual Dialog (2 citations)
    • CLEVR-Dialog: A Diagnostic Dataset for Multi-Round Reasoning in Visual Dialog (25 citations)
    • VD-BERT: A Unified Vision and Dialog Transformer with BERT (8 citations)
    • Transfer learning for multimodal dialog
    • Making History Matter: History-Advantage Sequence Training for Visual Dialog (23 citations)
    • Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning (275 citations)
    • Reasoning Over History: Context Aware Visual Dialog
    • Improving Generative Visual Dialog by Answering Diverse Questions (7 citations)
