Learning Image Embeddings using Convolutional Neural Networks for Improved Multi-Modal Semantics

Abstract

We construct multi-modal concept representations by concatenating a skip-gram linguistic representation vector with a visual concept representation vector computed using the feature extraction layers of a deep convolutional neural network (CNN) trained on a large labeled object recognition dataset. This transfer learning approach brings a clear performance… (More)

6 Figures and Tables

Topics

Statistics

0204020142015201620172018
Citations per Year

106 Citations

Semantic Scholar estimates that this publication has 106 citations based on the available data.

See our FAQ for additional information.

  • Presentations referencing similar topics