Advancing weakly supervised cross-domain alignment with optimal transport

Siyang Yuan, Ke Bai, Liqun Chen, Yizhe Zhang, Chenyang Tao, Chunyuan Li, Guoyin Wang, Ricardo Henao, Lawrence Carin
Cross-domain alignment between image objects and text sequences is key to many visual-language tasks, and it poses a fundamental challenge to both computer vision and natural language processing. This paper investigates a novel approach for the identification and optimization of fine-grained semantic similarities between image and text entities, under a weakly-supervised setup, improving performance over state-of-the-art solutions. Our method builds upon recent advances in optimal transport (OT…
