Language-Mediated, Object-Centric Representation Learning
@article{Wang2020LanguageMediatedOR,
  title={Language-Mediated, Object-Centric Representation Learning},
  author={Ruocheng Wang and Jiayuan Mao and S. Gershman and Jiajun Wu},
  journal={ArXiv},
  year={2020},
  volume={abs/2012.15814}
}
We present Language-mediated, Object-centric Representation Learning (LORL), a paradigm for learning disentangled, object-centric scene representations from vision and language. LORL builds upon recent advances in unsupervised object segmentation, notably MONet and Slot Attention. While these algorithms learn an object-centric representation solely by reconstructing the input image, LORL enables them to further learn to associate the learned representations with concepts, i.e., words for object…