Visual Question Answering (VQA) is the task of answering natural-language questions about images. We introduce the novel problem of determining the relevance of questions to images in VQA. Current VQA models do not reason about whether a question is even related to the given image (e.g., "What is the capital of Argentina?") or if it requires information from ...
This paper presents the Virginia Tech system that participated in the CoNLL-2016 shared task on shallow discourse parsing. We describe our end-to-end discourse parser that builds on the methods shown to be successful in previous work. The system consists of several components, such that each module performs a specific sub-task, and the components are ...
Cover song identification is a trivial problem for human beings. A human can easily identify whether a song is a cover of another, without much effort. However, cover song identification is not an easy task for machines, as the numerical audio data differs significantly. Since a major goal of Machine Learning and Artificial Intelligence is to ...