• Publications
  • Influence
Image Captioning with Deep Bidirectional LSTMs
TLDR
This work presents an end-to-end trainable deep bidirectional LSTM (Long-Short Term Memory) model for image captioning. Expand
  • 144
  • 12
  • PDF
Punctuation Prediction for Unsegmented Transcript Based on Word Vector
TLDR
In this paper we propose an approach to predict punctuation marks for unsegmented speech transcript. Expand
  • 51
  • 12
  • PDF
Image Captioning with Deep Bidirectional LSTMs and Multi-Task Learning
TLDR
We propose an end-to-end trainable deep bidirectional LSTM (Bi-LSTM) model to address the problem. Expand
  • 47
  • 4
  • PDF
RecSys-DAN: Discriminative Adversarial Networks for Cross-Domain Recommender Systems
TLDR
Data sparsity and data imbalance are practical and challenging issues in cross-domain recommender systems (RSs). Expand
  • 16
  • 1
  • PDF
A deep semantic framework for multimodal representation learning
TLDR
Multimodal representation learning has gained increasing importance in real-world multimedia applications. Expand
  • 4
  • 1
An Improved System For Real-Time Scene Text Recognition
TLDR
In this paper we showcase a system for real-time text detection and recognition of scene text by using a standard PC with webcam. Expand
  • 3
  • 1
Deep Semantic Mapping for Cross-Modal Retrieval
TLDR
In this paper, we propose a regularized deep neural network(RE-DNN) for semantic mapping across modalities. Expand
  • 32
State-Regularized Recurrent Neural Networks
TLDR
We show that state-regularization (a) simplifies the extraction of finite state automata modeling an RNN's state transition dynamics; (b) forces RNNs to operate more like automata with external memory and less like finite state machines. Expand
  • 12
  • PDF
Exploring multimodal video representation for action recognition
TLDR
A video contains rich perceptual information, such as visual appearance, motion and audio, which can be used for understanding the activities in videos. Expand
  • 9
LRMM: Learning to Recommend with Missing Modalities
TLDR
We propose LRMM, a novel framework that mitigates not only the problem of missing modalities but also more generally the cold-start problem of recommender systems. Expand
  • 8
  • PDF