Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation

@inproceedings{Wang2019ReinforcedCM,
  title={Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation},
  author={Xin Eric Wang and Qiuyuan Huang and Asli Celikyilmaz and Jianfeng Gao and Dinghan Shen and Yuan-Fang Wang and William Yang Wang and Lei Zhang},
  booktitle={2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2019},
  pages={6622-6631}
}
  • Abstract (excerpt): Vision-language navigation (VLN) is the task of navigating an embodied agent to carry out natural language instructions inside real 3D environments. [...] Particularly, a matching critic is used to provide an intrinsic reward to encourage global matching between instructions and trajectories, and a reasoning navigator is employed to perform cross-modal grounding in the local visual scene. Evaluation on a VLN benchmark dataset shows that our RCM model significantly outperforms previous methods by 10… A minimal sketch of the matching critic's intrinsic reward is given below.
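The matching critic's intrinsic reward is, in essence, the probability of reconstructing the original instruction from the trajectory the agent actually executed (a cycle-reconstruction signal). The following Python/PyTorch sketch illustrates that idea only; it is not the authors' implementation, and the class name MatchingCritic, the GRU encoder/decoder choice, all dimensions, and the toy inputs are illustrative assumptions.

# Minimal sketch of a cycle-reconstruction matching critic (assumed names and sizes),
# not the RCM authors' code.
import torch
import torch.nn as nn

class MatchingCritic(nn.Module):
    """Scores how well a trajectory of visual features explains an instruction."""
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128, visual_dim=256):
        super().__init__()
        self.traj_encoder = nn.GRU(visual_dim, hidden_dim, batch_first=True)
        self.word_embed = nn.Embedding(vocab_size, embed_dim)
        self.instr_decoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def intrinsic_reward(self, traj_feats, instr_tokens):
        # traj_feats: (batch, steps, visual_dim); instr_tokens: (batch, length) of token ids
        _, h = self.traj_encoder(traj_feats)               # summarize the executed trajectory
        emb = self.word_embed(instr_tokens[:, :-1])        # teacher-forced decoder inputs
        dec_out, _ = self.instr_decoder(emb, h)            # decode conditioned on the trajectory
        logp = torch.log_softmax(self.out(dec_out), dim=-1)
        targets = instr_tokens[:, 1:].unsqueeze(-1)        # next-token targets
        token_logp = logp.gather(-1, targets).squeeze(-1)  # (batch, length-1)
        return token_logp.mean(dim=1)                      # length-normalized log p(instruction | trajectory)

# Toy usage with random features: 2 trajectories of 5 steps, instructions of 7 token ids.
critic = MatchingCritic(vocab_size=100)
traj = torch.randn(2, 5, 256)
instr = torch.randint(0, 100, (2, 7))
print(critic.intrinsic_reward(traj, instr))  # higher value = better instruction/trajectory match

In the full RCM training setup this intrinsic score is combined with an extrinsic navigation reward (e.g. progress toward the goal location), so the agent is rewarded both for reaching the target and for taking a path that matches the instruction.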
    124 Citations
    • Vision-Language Navigation Policy Learning and Adaptation
    • Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation (Juncheng Li, X. Wang, +4 authors William Yang Wang; 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR))
    • Language-guided Navigation via Cross-Modal Grounding and Alternate Adversarial Learning
    • Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-Training
    • Vision-Based Navigation With Language-Based Assistance via Imitation Learning With Indirect Intervention
    • Multi-View Learning for Vision-and-Language Navigation
    • Vision-Language Navigation With Self-Supervised Auxiliary Reasoning Tasks
    • Multi-modal Discriminative Model for Vision-and-Language Navigation
    • Transferable Representation Learning in Vision-and-Language Navigation
    • Soft Expert Reward Learning for Vision-and-Language Navigation

    References

    Showing 1-10 of 69 references
    • Vision-Based Navigation With Language-Based Assistance via Imitation Learning With Indirect Intervention
    • Target-driven visual navigation in indoor scenes using deep reinforcement learning
    • The Regretful Agent: Heuristic-Aided Navigation Through Progress Estimation
    • Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments
    • Visual Representations for Semantic Target Driven Navigation
    • Self-Monitoring Navigation Agent via Auxiliary Progress Estimation
    • Tactical Rewind: Self-Correction via Backtracking in Vision-And-Language Navigation
    • Speaker-Follower Models for Vision-and-Language Navigation
    • Learning models for following natural language directions in unknown environments