Meta Module Network for Compositional Visual Reasoning

@inproceedings{Chen2021MetaMN,
  title={Meta Module Network for Compositional Visual Reasoning},
  author={Wenhu Chen and Zhe Gan and Linjie Li and Yu Cheng and William Wang and Jingjing Liu},
  booktitle={2021 IEEE Winter Conference on Applications of Computer Vision (WACV)},
  year={2021},
  pages={655--664}
}
Neural Module Network (NMN) exhibits strong interpretability and compositionality thanks to its handcrafted neural modules with explicit multi-hop reasoning capability. However, most NMNs suffer from two critical drawbacks: 1) scalability: building a customized module for each specific function becomes impractical when scaling up to a larger set of functions in complex tasks; 2) generalizability: a rigid, pre-defined module inventory makes it difficult to generalize to unseen functions in new tasks/domains…
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks
SemVLP: Vision-Language Pre-training by Aligning Semantics at Multiple Levels
Object-Centric Diagnosis of Visual Reasoning
How Transferable are Reasoning Patterns in VQA?
Supervising the Transfer of Reasoning Patterns in VQA
VinVL: Making Visual Representations Matter in Vision-Language Models

References

Compositional Attention Networks for Machine Reasoning
Explainable Neural Computation via Stack Neural Module Networks
Learning by Abstraction: The Neural State Machine
Learning to Reason: End-to-End Module Networks for Visual Question Answering
A simple neural network module for relational reasoning
MUREL: Multimodal Relational Reasoning for Visual Question Answering
Neural Module Networks
Inferring and Executing Programs for Visual Reasoning
GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering