Image Captioning with Clause-Focused Metrics in a Multi-modal Setting for Marketing
@article{Harzig2019ImageCW,
  title={Image Captioning with Clause-Focused Metrics in a Multi-modal Setting for Marketing},
  author={Philipp Harzig and D. Zecha and R. Lienhart and C. Kaiser and Ren{\'e} Schallner},
  journal={2019 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR)},
  year={2019},
  pages={419--424}
}
Automatically generating descriptive captions for images is a well-researched area in computer vision. However, existing evaluation approaches focus on measuring the similarity between two sentences, disregarding fine-grained semantics of the captions. In our setting of images depicting persons interacting with branded products, the subject, predicate, object and the name of the branded product are important evaluation criteria of the generated captions. Generating image captions with these…