phi-LSTM: A Phrase-Based Hierarchical LSTM Model for Image Captioning

@article{Tan2016phiLSTMAP,
  title={phi-LSTM: A Phrase-Based Hierarchical LSTM Model for Image Captioning},
  author={Ying Hua Tan and Chee Seng Chan},
  journal={ArXiv},
  year={2016},
  volume={abs/1608.05813}
}
A picture is worth a thousand words. Only recently, however, have there been success stories in the understanding of visual scenes: models that can detect and name objects, describe their attributes, and recognize their relationships and interactions. In this paper, we propose a phrase-based hierarchical Long Short-Term Memory (phi-LSTM) model to generate image descriptions. The proposed model encodes a sentence as a sequence combining phrases and words, instead of a sequence of words…
25 Citations
Phrase-based Image Captioning with Hierarchical LSTM Model
  • 3
Phrase-based image caption generator with hierarchical LSTM network
  • 11
CNN+CNN: Convolutional Decoders for Image Captioning
  • 37
Gated Hierarchical Attention for Image Captioning
  • 9
Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition
  • 67
Topic sensitive image descriptions
COMIC: Toward A Compact Image Captioning Model With Attention
  • 11
SemStyle: Learning to Generate Stylised Image Captions Using Unaligned Text
  • 49
Deep Hierarchical Encoder–Decoder Network for Image Captioning
  • 13
