Scene Text Visual Question Answering

@inproceedings{Biten2019SceneTV,
  title={Scene Text Visual Question Answering},
  author={Ali Furkan Biten and Ruben Tito and Andr{\'e}s Mafla and Llu{\'i}s G{\'o}mez and M. Rusi{\~n}ol and Ernest Valveny and C. V. Jawahar and Dimosthenis Karatzas},
  booktitle={2019 IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2019},
  pages={4290--4300}
}
Current visual question answering datasets do not consider the rich semantic information conveyed by text within an image. [...] We propose a new evaluation metric for these tasks that accounts for both reasoning errors and shortcomings of the text recognition module. In addition, we put forward a series of baseline methods, which provide further insight into the newly released dataset and set the scene for further research.
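The evaluation metric mentioned above is commonly referred to as Average Normalized Levenshtein Similarity (ANLS). The excerpt here does not spell out its definition, so the following is only a minimal Python sketch under stated assumptions: the 0.5 threshold, the lowercasing and whitespace stripping, and the function names `levenshtein`, `answer_score`, and `anls` are illustrative choices, not taken from this page. The idea it illustrates is that a near-miss caused by a small recognition error still earns partial credit, while an answer that is mostly wrong scores zero.

```python
from typing import List, Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def answer_score(prediction: str, ground_truths: Sequence[str],
                 threshold: float = 0.5) -> float:
    """Best similarity of one prediction against its accepted answers.

    Assumption: the similarity is 1 - NL when the normalized distance NL
    is below `threshold`, and 0 otherwise.
    """
    pred = prediction.strip().lower()
    best = 0.0
    for gt in ground_truths:
        ref = gt.strip().lower()
        longest = max(len(pred), len(ref))
        nl = levenshtein(pred, ref) / longest if longest else 0.0
        similarity = 1.0 - nl if nl < threshold else 0.0
        best = max(best, similarity)
    return best


def anls(predictions: Sequence[str], ground_truths: Sequence[List[str]],
         threshold: float = 0.5) -> float:
    """Average the per-question scores over the whole evaluation set."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must align")
    if not predictions:
        return 0.0
    total = sum(answer_score(p, g, threshold)
                for p, g in zip(predictions, ground_truths))
    return total / len(predictions)
```

For instance, `answer_score("stop", ["shop"])` differs from the reference by one character out of four, so it keeps most of its credit, whereas `answer_score("cat", ["dog"])` falls past the threshold and scores zero.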
31 Citations
Multimodal grid features and cell pointers for Scene Text Visual Question Answering
  • Citations: 1
Finding the Evidence: Localization-aware Answer Prediction for Text Visual Question Answering
  • Citations: 2
  • Highly Influenced
On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering
  • Xinyu Wang, Y. Liu, +6 authors L. Wang
  • Computer Science
  • 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2020
  • Citations: 10
  • Highly Influenced
ICDAR 2019 Competition on Scene Text Visual Question Answering
  • Citations: 16
Structured Multimodal Attentions for TextVQA
  • Citations: 3
  • Highly Influenced
Cascade Reasoning Network for Text-based Visual Question Answering
  • Citations: 1
  • Highly Influenced
Real-time Lexicon-free Scene Text Retrieval
  • Citations: 2
Document Visual Question Answering Challenge 2020
TextCaps: a Dataset for Image Captioning with Reading Comprehension
  • Citations: 12
  • Highly Influenced
RUArt: A Novel Text-Centered Solution for Text-Based Visual Question Answering
  • Highly Influenced
