Beyond task success: A closer look at jointly learning to see, ask, and GuessWhat

@inproceedings{Shekhar2019BeyondTS,
  title={Beyond task success: A closer look at jointly learning to see, ask, and GuessWhat},
  author={Ravi Shekhar and Aashish Venkatesh and Tim Baumg{\"a}rtner and Elia Bruni and Barbara Plank and Raffaella Bernardi and Raquel Fern{\'a}ndez},
  booktitle={NAACL-HLT},
  year={2019}
}
  • Ravi Shekhar, Aashish Venkatesh, +4 authors R. Fernández
  • Published in NAACL-HLT 2019
  • Computer Science, Psychology
  • We propose a grounded dialogue state encoder which addresses a foundational issue on how to integrate visual grounding with dialogue system components. As a test-bed, we focus on the GuessWhat?! game, a two-player game where the goal is to identify an object in a complex visual scene by asking a sequence of yes/no questions. Our visually-grounded encoder leverages synergies between guessing and asking questions, as it is trained jointly using multi-task learning. We further enrich our model via…
    18 Citations
    Imagining Grounded Conceptual Representations from Perceptual Information in Situated Guessing Games
    • Highly Influenced
    An Empirical Study on the Generalization Power of Neural Representations Learned via Visual Guessing Games
    • Highly Influenced
    Answer-Driven Visual State Estimator for Goal-Oriented Visual Dialogue
    • Highly Influenced
    Jointly Learning to See, Ask, Decide when to Stop, and then GuessWhat
    • 2
    They are not all alike: answering different spatial questions requires different grounding strategies
    • Highly Influenced
    Guessing State Tracking for Visual Dialogue
    • 2
    On the role of effective and referring questions in GuessWhat?!
    • 1
