Learning by Asking Questions

@article{Misra2018LearningBA,
  title={Learning by Asking Questions},
  author={Ishan Misra and Ross B. Girshick and Rob Fergus and Martial Hebert and Abhinav Gupta and Laurens van der Maaten},
  journal={2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2018},
  pages={11-20}
}
We introduce an interactive learning framework for the development and testing of intelligent visual systems, called learning-by-asking (LBA). We explore LBA in the context of the Visual Question Answering (VQA) task. LBA differs from standard VQA training in that most questions are not observed during training time, and the learner must ask questions it wants answers to. Thus, LBA more closely mimics natural learning and has the potential to be more data-efficient than the traditional VQA setting. …
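The LBA protocol the abstract describes is simple to state: at each step the learner looks at an image, decides which question it most wants answered, queries an oracle, and trains on the resulting question-answer pair. The Python sketch below is a minimal illustration of that loop under stated assumptions, not the paper's implementation: the names oracle_answer, informativeness, and lba_step are hypothetical stand-ins, and in the paper the oracle executes functional programs against CLEVR scene graphs while question selection is learned rather than a random proxy.

import random

def oracle_answer(image, question):
    # Stand-in oracle. In the paper's CLEVR setting the oracle executes a
    # program against the scene graph; here it just returns a canned string.
    return "answer(%s, %s)" % (image, question)

def informativeness(model_state, question):
    # Stand-in acquisition score. The paper learns which questions are
    # informative; a random proxy is used here purely for illustration.
    return random.random()

def lba_step(model_state, image, candidate_questions, budget=1):
    # One LBA iteration: rank the candidate questions by how much the
    # learner wants them answered, query the oracle for the top ones, and
    # return new (image, question, answer) triples to train on.
    ranked = sorted(candidate_questions,
                    key=lambda q: informativeness(model_state, q),
                    reverse=True)
    return [(image, q, oracle_answer(image, q)) for q in ranked[:budget]]

if __name__ == "__main__":
    triples = lba_step(
        model_state=None,
        image="img_001",
        candidate_questions=[
            "What color is the cube?",
            "How many spheres are there?",
            "Is the ball left of the cylinder?",
        ],
    )
    print(triples)  # self-generated supervision for the next model update

The key point the abstract emphasizes is that supervision is self-directed: the training triples come from queries the learner chose to make, not from a fixed, pre-collected question set.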
Citations

Learning to Caption Images Through a Lifetime by Asking Questions
Inspired by a student learning in a classroom, an agent is presented that can continuously learn by posing natural language questions to humans, and achieves better performance using less human supervision than the baselines on the challenging MSCOCO dataset.
Learning to Ask for Conversational Machine Learning
A reinforcement learning framework that allows learning classifiers from a blend of strategies, including learning from observations, explanations, and clarifications, and shows that learned question-asking strategies expedite classifier training by asking appropriate questions at different points in the learning process.
A Competence-aware Curriculum for Visual Concepts Learning via Question Answering
This work designs a neural-symbolic concept learner for learning the visual concepts and a multi-dimensional Item Response Theory (mIRT) model for guiding the learning process with an adaptive curriculum.
A Dataset and Baselines for Visual Question Answering on Art
This work introduces the first attempt towards building a new dataset, coined AQUA (Art QUestion Answering), where question-answer (QA) pairs are automatically generated using state-of-the-art question generation methods based on paintings and comments provided in an existing art understanding dataset.
Visual Curiosity: Learning to Ask Questions to Learn Visual Recognition
This work develops an agent empowered with visual curiosity, i.e. the ability to ask questions to an Oracle and build a visual recognition model based on the answers received, and proposes a novel framework that formulates the learning of visual curiosity as a reinforcement learning problem.
Deep Bayesian Active Learning for Multiple Correct Outputs
This paper proposes a new paradigm that estimates uncertainty in the model's internal hidden space instead of the model's output space, and builds a visual-semantic space that embeds paraphrases close together for any existing VQA model.
Knowledge-Routed Visual Question Reasoning: Challenges for Deep Representation Embedding
A novel dataset named Knowledge-Routed Visual Question Reasoning is proposed, which aims to cut off the shortcut learning exploited by current deep embedding models and push the research boundary of knowledge-based visual question reasoning.
Visual Question Answering: Datasets, Methods, Challenges and Opportunities
The most famous datasets, as well as the state-of-the-art methods for the VQA task, are explained, and promising approaches for future research in this area are identified.
Make Up Your Mind: Towards Consistent Answer Predictions in VQA Models
Visual Question Answering (VQA) involves answering natural language questions on images [1]. While state-of-the-art models can answer such questions satisfactorily on a standard VQA dataset [1], …
Dialog without Dialog Data: Learning Visual Dialog Agents from VQA Data
This work studies a setting called "Dialog without Dialog", which requires agents to develop visually grounded dialog models that can adapt to new tasks without language-level supervision, and develops a model that minimizes linguistic drift after fine-tuning for new tasks.

References

Showing 1-10 of 62 references.
Revisiting Visual Question Answering Baselines
The results suggest that a key problem of current VQA systems lies in the lack of visual grounding and localization of concepts that occur in the questions and answers, and a simple alternative model based on binary classification is developed.
iVQA: Inverse Visual Question Answering
This work poses question generation as a multi-modal dynamic inference process and proposes an iVQA model that can gradually adjust its focus of attention, guided by both a partially generated question and the answer.
Easy Questions First? A Case Study on Curriculum Learning for Question Answering
This work compares a number of curriculum learning proposals in the context of four non-convex models for QA and shows that they lead to real improvements in each of them.
Question Relevance in VQA: Identifying Non-Visual And False-Premise Questions
These approaches, based on LSTM-RNNs, VQA model uncertainty, and caption-question similarity, are able to outperform strong baselines on both relevance tasks and are shown to be more intelligent, reasonable, and human-like than previous approaches.
Generating Natural Questions About an Image
This paper introduces the novel task of Visual Question Generation, where the system is tasked with asking a natural and engaging question when shown an image, and provides three datasets which cover a variety of images from object-centric to event-centric.
Question Asking as Program Generation
A cognitive model capable of constructing human-like questions is introduced that predicts what questions people will ask, and can creatively produce novel questions that were not present in the training set.
VQA: Visual Question Answering
We propose the task of free-form and open-ended Visual Question Answering (VQA). Given an image and a natural language question about the image, the task is to provide an accurate natural language answer. …
Learning to Reason: End-to-End Module Networks for Visual Question Answering
End-to-End Module Networks are proposed, which learn to reason by directly predicting instance-specific network layouts without the aid of a parser, and achieve an error reduction of nearly 50% relative to state-of-the-art attentional approaches.
A Joint Model for Question Answering and Question Generation
A generative machine comprehension model that learns jointly to ask and answer questions based on documents, using a sequence-to-sequence framework that encodes the document and generates a question given an answer.
Yin and Yang: Balancing and Answering Binary Visual Questions
This paper addresses binary Visual Question Answering on abstract scenes as visual verification of concepts inquired in the questions, by converting the question to a tuple that concisely summarizes the visual concept to be detected in the image.