Yin and Yang: Balancing and Answering Binary Visual Questions

The complex compositional structure of language makes problems at the intersection of vision and language challenging. But language also provides a strong prior that can result in good superficial performance without the underlying models truly understanding the visual content. This can hinder progress in pushing the state of the art in the computer vision aspects…

Semantic Scholar estimates that this publication has 69 citations based on the available data.
