Corpus ID: 232335373

VLGrammar: Grounded Grammar Induction of Vision and Language

@article{Hong2021VLGrammarGG,
  title={VLGrammar: Grounded Grammar Induction of Vision and Language},
  author={Yining Hong and Qing Li and Song-Chun Zhu and Siyuan Huang},
  journal={ArXiv},
  year={2021},
  volume={abs/2103.12975}
}
  • Yining Hong, Qing Li, Song-Chun Zhu, Siyuan Huang
  • Published 2021
  • Computer Science
  • ArXiv
Cognitive grammar suggests that the acquisition of language grammar is grounded within visual structures. While grammar is an essential representation of natural language, it also exists ubiquitously in vision to represent the hierarchical part-whole structure. In this work, we study grounded grammar induction of vision and language in a joint learning framework. Specifically, we present VLGrammar, a method that uses compound probabilistic context-free grammars (compound PCFGs) to induce the…
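For context, the compound PCFG model named in the abstract (introduced by Kim et al., 2019, in "Compound Probabilistic Context-Free Grammars for Grammar Induction") conditions all rule probabilities of a PCFG on a per-sentence latent vector. The sketch below summarizes that paper's generative story, not VLGrammar's exact architecture; here f, u_β, and w_A are illustrative names for a rule-scoring network, rule embeddings, and nonterminal embeddings:

\[
z \sim \mathcal{N}(0, I), \qquad
\pi_{A \to \beta} \;\propto\; \exp\!\big( u_{\beta}^{\top} f([w_A ; z]) \big),
\]
\[
\log p_{\theta}(x) \;\ge\; \mathbb{E}_{q_{\phi}(z \mid x)}\big[ \log p_{\theta}(x \mid z) \big] \;-\; \mathrm{KL}\big( q_{\phi}(z \mid x) \,\|\, p(z) \big),
\]

where \(p_{\theta}(x \mid z)\) is computed exactly with the inside algorithm, the variational bound is maximized with amortized inference, and induced trees are read off with CYK-style decoding.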

