Corpus ID: 220280096

Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation

@article{Zhu2021MultimodalTS,
  title={Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation},
  author={Wanrong Zhu and Xin Wang and Tsu-Jui Fu and An Yan and Pradyumna Narayana and Kazoo Sone and Sugato Basu and William Yang Wang},
  journal={ArXiv},
  year={2021},
  volume={abs/2007.00229}
}
One of the most challenging topics in Natural Language Processing (NLP) is visually-grounded language understanding and reasoning. Outdoor vision-and-language navigation (VLN) is such a task, where an agent follows natural language instructions and navigates real-life urban environments. Due to the lack of human-annotated instructions that illustrate intricate urban scenes, outdoor VLN remains a challenging task to solve. In this paper, we introduce a Multimodal Text Style Transfer (MTST)…
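
The abstract frames outdoor VLN as an agent that follows a natural-language instruction through an urban scene, emitting one navigation action per step until it stops. Below is a minimal, self-contained sketch of that instruction-following loop. Every name here (ToyEnv, keyword_policy, ACTIONS) is an illustrative assumption, not the paper's MTST implementation; a learned navigator would replace the keyword matching with fused text and panoramic visual encoders.

# Toy sketch of the outdoor VLN loop: instruction in, actions out.
# All names are hypothetical; this is not the paper's code.
from typing import Callable, Dict, List

ACTIONS = ["forward", "turn_left", "turn_right", "stop"]

class ToyEnv:
    """Stand-in for a street-view environment: the gold route is a fixed
    action sequence, and an episode succeeds if the agent reproduces it
    and then stops."""
    def __init__(self, gold_route: List[str]):
        self.gold_route = gold_route
        self.pos = 0

    def observation(self) -> Dict:
        # A real outdoor VLN environment would return panoramic visual
        # features; here we expose only the current step index.
        return {"step": self.pos}

    def step(self, action: str) -> bool:
        """Advance if the action matches the gold route; True means 'stopped'."""
        if action == "stop":
            return True
        if self.pos < len(self.gold_route) and action == self.gold_route[self.pos]:
            self.pos += 1
        return False

def keyword_policy(instruction: str) -> Callable[[Dict], str]:
    """Trivially 'grounds' the instruction by turning direction keywords
    into an action plan; a trained navigator would score ACTIONS from
    the instruction encoding and the visual observation instead."""
    words = instruction.lower().replace(",", " ").split()
    plan = [w for w in words if w in ("forward", "left", "right")]
    plan = ["turn_" + w if w in ("left", "right") else w for w in plan]

    def act(obs: Dict) -> str:
        return plan[obs["step"]] if obs["step"] < len(plan) else "stop"
    return act

if __name__ == "__main__":
    env = ToyEnv(["forward", "turn_left", "forward"])
    policy = keyword_policy("Go forward, then left, then forward to the cafe.")
    done = False
    while not done:
        done = env.step(policy(env.observation()))
    print("reached goal:", env.pos == len(env.gold_route))

Running the script prints "reached goal: True": the keyword plan happens to match the gold route, which illustrates the episode structure (observe, act, stop, evaluate task completion) rather than any real grounding ability.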