MOCA: A Modular Object-Centric Approach for Interactive Instruction Following
TLDR
This work proposes MOCA, a Modular Object-Centric Approach, a modular architecture that decouples the task into visual perception and action policy, and empirically validates that it outperforms prior work by significant margins in all metrics with good generalization performance.
Agent with the Big Picture: Perceiving Surroundings for Interactive Instruction Following
TLDR
This work designs a model that factorizes interactive perception and action policy into separate streams within a unified end-to-end framework, and that outperforms the previous challenge-winning method.
A Fast, Scalable, and Reliable Deghosting Method for Extreme Exposure Fusion
TLDR
This work proposes a simple yet effective CNN-based multi-exposure image fusion method that produces artifact-free HDR images and offers a speed-up of around 54× over existing state-of-the-art HDR fusion methods.
Learning Architectures for Binary Networks
TLDR
This work proposes to search architectures for binary networks (BNAS) by defining a new search space for binary architectures and a novel search objective. It also designs a new cell template and proposes to actively use the Zeroise layer rather than treating it as a placeholder.
Factorizing Perception and Policy for Interactive Instruction Following
TLDR
This work proposes a model that factorizes the ‘interactive instruction following’ task into interactive perception and action policy streams with enhanced components, and empirically validates that MOCA outperforms prior work by significant margins on the ALFRED benchmark with improved generalization.
BNAS v2: Learning Architectures for Binary Networks with Empirical Improvements
TLDR
This work proposes to search architectures for binary networks (BNAS) by defining a new search space for binary architectures and a novel search objective. It also designs a new cell template and proposes to actively use the Zeroise layer rather than treating it as a placeholder.