Imagine This! Scripts to Compositions to Videos

  • Tanmay Gupta, D. Schwenk, A. Farhadi, Derek Hoiem, Aniruddha Kembhavi
  • Published in ECCV 2018
  • Computer Science
  • Imagining a scene described in natural language with realistic layout and appearance of entities is the ultimate test of spatial, visual, and semantic world knowledge. [...] Our contributions include sequential training of components of CRAFT while jointly modeling layout and appearances, and losses that encourage learning compositional representations for retrieval. We evaluate CRAFT on semantic fidelity to caption, composition consistency, and visual quality. CRAFT outperforms direct pixel…
