From phonemes to images: levels of representation in a recurrent neural model of visually-grounded language learning

@article{Gelderloos2016FromPT,
  title={From phonemes to images: levels of representation in a recurrent neural model of visually-grounded language learning},
  author={Lieke Gelderloos and Grzegorz Chrupala},
  journal={ArXiv},
  year={2016},
  volume={abs/1610.03342}
}
We present a model of visually-grounded language learning based on stacked gated recurrent neural networks which learns to predict visual features given an image description in the form of a sequence of phonemes. The learning task resembles that faced by human language learners who need to discover both structure and meaning from noisy and ambiguous data across modalities. We show that our model indeed learns to predict features of the visual context given phonetically transcribed image…
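The abstract describes a stack of gated recurrent layers that consumes a phoneme sequence and predicts a visual feature vector for the corresponding image. A minimal sketch of that architecture in numpy is below; all layer sizes, class names, and the toy phoneme ids are illustrative assumptions, not the authors' actual implementation or hyperparameters.

```python
import numpy as np

# Hypothetical sizes -- the paper's actual dimensions are not given here.
N_PHONEMES, D_EMB, D_HID, D_VISUAL, N_LAYERS = 50, 32, 64, 128, 2

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRULayer:
    """One gated recurrent layer (update gate z, reset gate r)."""
    def __init__(self, d_in, d_h):
        init = lambda *shape: rng.normal(0.0, 0.1, shape)
        self.Wz, self.Wr, self.Wh = init(d_h, d_in), init(d_h, d_in), init(d_h, d_in)
        self.Uz, self.Ur, self.Uh = init(d_h, d_h), init(d_h, d_h), init(d_h, d_h)
        self.d_h = d_h

    def __call__(self, xs):
        h = np.zeros(self.d_h)
        hs = []
        for x in xs:  # one step per phoneme
            z = sigmoid(self.Wz @ x + self.Uz @ h)
            r = sigmoid(self.Wr @ x + self.Ur @ h)
            h_tilde = np.tanh(self.Wh @ x + self.Uh @ (r * h))
            h = (1.0 - z) * h + z * h_tilde
            hs.append(h)
        return hs

class PhonemeToVisual:
    """Stacked GRUs; the top layer's final state is projected to visual-feature space."""
    def __init__(self):
        self.embed = rng.normal(0.0, 0.1, (N_PHONEMES, D_EMB))
        dims = [D_EMB] + [D_HID] * N_LAYERS
        self.layers = [GRULayer(dims[i], dims[i + 1]) for i in range(N_LAYERS)]
        self.proj = rng.normal(0.0, 0.1, (D_VISUAL, D_HID))

    def __call__(self, phoneme_ids):
        xs = [self.embed[i] for i in phoneme_ids]
        for layer in self.layers:
            xs = layer(xs)
        return self.proj @ xs[-1]  # predicted visual feature vector

model = PhonemeToVisual()
pred = model([3, 17, 5, 42])  # a phonetically transcribed caption (toy ids)
print(pred.shape)             # (128,)
```

In training, such a model would be fit by regressing the predicted vector against image features from a pretrained vision network; the sketch shows only the untrained forward pass.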

