VisualTextRank: Unsupervised Graph-based Content Extraction for Automating Ad Text to Image Search

@article{Mishra2021VisualTextRankUG,
  title={VisualTextRank: Unsupervised Graph-based Content Extraction for Automating Ad Text to Image Search},
  author={Shaunak Mishra and Mikhail Kuznetsov and Gaurav Srivastava and Maxim Sviridenko},
  journal={Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery \& Data Mining},
  year={2021}
}
Numerous online stock image libraries offer high-quality yet copyright-free images for use in marketing campaigns. To assist advertisers in navigating such third-party libraries, we study the problem of automatically fetching relevant ad images given the ad text (via a short textual query for images). Motivated by our observations in logged data on ad image search queries (given ad text), we formulate a keyword extraction problem, where a keyword extracted from the ad text (or its augmented…
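
The keyword-extraction formulation above is in the TextRank family of unsupervised graph-based methods. As a rough, hypothetical illustration (not the authors' implementation), the sketch below ranks candidate keywords from an ad text by building a word co-occurrence graph and running a PageRank-style power iteration over it; the tokenizer, window size, and damping factor are arbitrary choices made for this example.

```python
# Hypothetical sketch of TextRank-style keyword extraction from ad text.
# Not the VisualTextRank implementation; tokenization, window size, and
# damping factor are illustrative assumptions.
import re

import numpy as np


def extract_keywords(ad_text, window=3, damping=0.85, iters=50, top_k=3):
    tokens = re.findall(r"[a-z]+", ad_text.lower())
    vocab = sorted(set(tokens))
    index = {w: i for i, w in enumerate(vocab)}

    # Build an undirected co-occurrence graph over a sliding window.
    adj = np.zeros((len(vocab), len(vocab)))
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window, len(tokens))):
            u, v = index[w], index[tokens[j]]
            if u != v:
                adj[u, v] += 1.0
                adj[v, u] += 1.0

    # Row-normalize into a transition matrix and run PageRank power iteration.
    row_sums = adj.sum(axis=1, keepdims=True)
    transition = np.divide(adj, row_sums, out=np.zeros_like(adj), where=row_sums > 0)
    scores = np.full(len(vocab), 1.0 / len(vocab))
    for _ in range(iters):
        scores = (1 - damping) / len(vocab) + damping * transition.T @ scores

    ranked = sorted(zip(vocab, scores), key=lambda x: -x[1])
    return [w for w, _ in ranked[:top_k]]


# Toy usage: the highest-scoring words become the image search query.
print(extract_keywords("Handmade leather wallet for men. Premium leather, a great gift idea."))
```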

Citations

TSI: An Ad Text Strength Indicator using Text-to-CTR and Semantic-Ad-Similarity

TLDR
Proposes an ad text strength indicator (TSI) that predicts the click-through rate (CTR) for an input ad text, fetches similar existing ads to create a neighborhood around the input ad, and compares the predicted CTRs within that neighborhood to declare whether the input ad is strong or weak.
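
As a rough illustration of the TSI idea in this summary (hypothetical code, not taken from the paper): the input ad's predicted CTR is compared against the predicted CTRs of its most similar existing ads. The `embed` and `predict_ctr` callables stand in for models the paper trains, and the median-based strong/weak rule is an assumption made here.

```python
# Hypothetical sketch of the TSI idea: compare an ad's predicted CTR against
# the predicted CTRs of similar existing ads. `embed` and `predict_ctr` are
# placeholder callables, not the paper's models.
import numpy as np


def ad_text_strength(input_ad, existing_ads, embed, predict_ctr, k=10):
    # Cosine similarity between the input ad and every existing ad text.
    query_vec = embed(input_ad)
    ad_vecs = np.stack([embed(ad) for ad in existing_ads])
    sims = ad_vecs @ query_vec / (
        np.linalg.norm(ad_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9
    )

    # Neighborhood = the k most semantically similar existing ads.
    neighbors = [existing_ads[i] for i in np.argsort(-sims)[:k]]

    # Declare the input ad "strong" if its predicted CTR beats the
    # neighborhood's median predicted CTR (the threshold rule is an assumption).
    input_ctr = predict_ctr(input_ad)
    neighbor_ctrs = [predict_ctr(ad) for ad in neighbors]
    return "strong" if input_ctr >= np.median(neighbor_ctrs) else "weak"
```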

Recommendation Systems for Ad Creation: A View from the Trenches

TLDR
This talk discusses how state-of-the-art approaches in text mining, ranking, generation, multimodal (visual-linguistic) representations, multilingual text understanding, and recommendation can help reduce the time spent on designing ads, and showcases their impact on real-world advertising systems and metrics.

U-BERT for Fast and Scalable Text-Image Retrieval

TLDR
Proposes a U-BERT model for effective and efficient cross-modal text-image retrieval, computing text-image similarity scores with two independent encoders at linear computational complexity.
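
The "two independent encoders" setup referred to here is commonly called a dual encoder. The following minimal sketch (with placeholder `text_encoder` and `image_encoder` functions, which are assumptions) shows why scoring is linear in the number of candidate images: image embeddings can be precomputed, so each query costs one matrix-vector product.

```python
# Hypothetical dual-encoder retrieval sketch: text and images are encoded
# independently, so image embeddings can be precomputed and each query
# reduces to a single matrix-vector product.
import numpy as np


def retrieve(query_text, image_ids, text_encoder, image_encoder, top_k=5):
    # Image embeddings can be computed offline and cached; inline here for brevity.
    img_embs = np.stack([image_encoder(i) for i in image_ids])
    img_embs /= np.linalg.norm(img_embs, axis=1, keepdims=True)

    q = text_encoder(query_text)
    q = q / np.linalg.norm(q)

    # Scoring is linear in the number of candidate images.
    scores = img_embs @ q
    order = np.argsort(-scores)[:top_k]
    return [(image_ids[i], float(scores[i])) for i in order]
```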

SuperCone: Modeling Heterogeneous Experts with Concept Meta-learning for Unified Predictive Segments System

TLDR
This work builds on top of a flat concept representation that summarizes each user's heterogeneous digital footprints, and uniformly models each prediction task using an approach called "super learning", that is, combining prediction models with diverse architectures or learning methods that are not compatible with each other or are even completely unknown.

References

Showing 1-10 of 25 references.

Biased TextRank: Unsupervised Graph-Based Content Extraction

TLDR
This work presents two applications of Biased TextRank, focused summarization and explanation extraction, and shows that the algorithm improves performance on two different datasets by significant ROUGE-N score margins.
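
Since VisualTextRank builds on Biased TextRank, a minimal sketch of the core idea may be useful: it is a personalized PageRank in which the restart distribution is weighted by each node's similarity to a bias (focus) text instead of being uniform. The graph construction and similarity inputs below are simplified assumptions, not the authors' exact implementation.

```python
# Minimal sketch of the Biased TextRank idea: a personalized PageRank where
# the restart distribution is weighted by each node's similarity to a bias
# text. Inputs are assumed to be precomputed similarities.
import numpy as np


def biased_textrank(similarity_matrix, bias_scores, damping=0.85, iters=100):
    """similarity_matrix[i, j]: pairwise similarity between text nodes i and j.
    bias_scores[i]: similarity of node i to the bias (focus) text."""
    sim = np.array(similarity_matrix, dtype=float)
    np.fill_diagonal(sim, 0.0)

    # Row-normalize pairwise similarities into a transition matrix.
    row_sums = sim.sum(axis=1, keepdims=True)
    transition = np.divide(sim, row_sums, out=np.zeros_like(sim), where=row_sums > 0)

    # The bias replaces the uniform restart distribution of vanilla TextRank.
    restart = np.array(bias_scores, dtype=float)
    restart = restart / restart.sum()

    scores = np.full(len(restart), 1.0 / len(restart))
    for _ in range(iters):
        scores = (1 - damping) * restart + damping * transition.T @ scores
    return scores  # higher score = more relevant to the bias text
```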

Recommending Themes for Ad Creative Design via Visual-Linguistic Representations

TLDR
Proposes a theme (keyphrase) recommender system that helps ad creative strategists by automatically inferring ad themes from multimodal sources of information in past ad campaigns, and shows that cross-modal representations lead to significantly better classification accuracy and ranking precision-recall metrics.

Learning to Create Better Ads: Generation and Ranking Approaches for Ad Creative Refinement

TLDR
For generating new ad text, this work demonstrates the efficacy of an encoder-decoder architecture with a copy mechanism, which allows some words from the (inferior) input text to be copied to the output while incorporating new words associated with higher click-through rates.
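
The copy mechanism mentioned here can be summarized as mixing the decoder's vocabulary distribution with an attention-derived distribution over the input tokens. The sketch below shows only that mixing step and assumes all distributions are produced elsewhere by the encoder-decoder model; it is not the paper's architecture.

```python
# Sketch of the copy-mechanism mixing step: the output distribution is a
# convex combination of the decoder's vocabulary distribution and a copy
# distribution over input tokens derived from attention weights. All inputs
# are assumed to come from an encoder-decoder model defined elsewhere.
import numpy as np


def mix_copy_distribution(p_gen, vocab_probs, attention_weights, input_token_ids):
    """p_gen: probability of generating from the vocabulary (vs. copying).
    vocab_probs: decoder softmax over the vocabulary, shape (V,).
    attention_weights: attention over input positions, shape (T,).
    input_token_ids: vocabulary id of each input token, shape (T,)."""
    copy_probs = np.zeros_like(vocab_probs)
    # Scatter-add attention mass onto the vocabulary ids of the input tokens,
    # so repeated input words accumulate probability.
    np.add.at(copy_probs, np.asarray(input_token_ids), attention_weights)
    return p_gen * vocab_probs + (1.0 - p_gen) * copy_probs
```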

Automatic Understanding of Image and Video Advertisements

TLDR
The novel problem of automatic advertisement understanding is proposed, and a dataset of 64,832 image ads and 3,477 video ads is created to enable research on this problem.

Conceptual Captions: A Cleaned, Hypernymed, Image Alt-text Dataset For Automatic Image Captioning

We present a new dataset of image caption annotations, Conceptual Captions, which contains an order of magnitude more images than the MS-COCO dataset (Lin et al., 2014) and represents a wider variety of both images and image caption styles.

Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks

TLDR
This paper proposes a new learning method, Oscar (Object-Semantics Aligned Pre-training), which uses object tags detected in images as anchor points to significantly ease the learning of alignments.

ADVISE: Symbolism and External Knowledge for Decoding Advertisements

TLDR
This work formulates the ad understanding task as matching the ad image to human-generated statements that describe the action that the ad prompts and the rationale it provides for taking this action, and proposes a method that outperforms the state of the art on this task.

Image Captioning: Transforming Objects into Words

TLDR
This work introduces the Object Relation Transformer, an approach to image captioning that explicitly incorporates information about the spatial relationships between detected input objects through geometric attention.
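
Geometric attention, as described in this summary, can be sketched as scaled dot-product attention whose logits are biased by a term derived from the relative geometry of detected object boxes. The `geometric_weights` input below is assumed to be precomputed from bounding boxes; this is a simplified illustration, not the paper's full model.

```python
# Simplified sketch of geometric attention: appearance-based attention logits
# are biased (in log space) by positive weights derived from the relative
# geometry of object bounding boxes, assumed precomputed here.
import numpy as np


def geometric_attention(queries, keys, values, geometric_weights):
    """queries, keys, values: (N, d) features of N detected objects.
    geometric_weights: (N, N) positive weights from relative box geometry."""
    d = queries.shape[-1]
    logits = queries @ keys.T / np.sqrt(d)
    # Adding log weights is equivalent to multiplying the softmax numerator.
    logits = logits + np.log(np.clip(geometric_weights, 1e-9, None))
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ values
```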

Understanding Consumer Journey using Attention based Recurrent Neural Networks

TLDR
Proposes an attention-based recurrent neural network (RNN) that ingests a user activity trail and predicts the user's conversion probability, along with attention weights for each activity (analogous to its position in the funnel).
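
A minimal sketch of the attention mechanism described here, assuming the RNN hidden states for each activity are already computed and the attention/output weight vectors are given (in practice they are learned):

```python
# Sketch of attention over RNN hidden states for conversion prediction.
# Hidden states and weight vectors are assumed inputs; the RNN itself and
# the training procedure are outside this illustration.
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def attention_conversion_prob(hidden_states, w_attn, w_out):
    """hidden_states: (T, d) RNN hidden state for each activity in the trail.
    w_attn: (d,) attention scoring vector; w_out: (d,) output weights."""
    # Attention weight per activity, normalized with a softmax.
    scores = np.tanh(hidden_states) @ w_attn
    attn = np.exp(scores - scores.max())
    attn = attn / attn.sum()

    # Context vector = attention-weighted sum of hidden states.
    context = attn @ hidden_states

    # Conversion probability plus the per-activity attention weights.
    return sigmoid(context @ w_out), attn
```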

Microsoft COCO: Common Objects in Context

We present a new dataset with the goal of advancing the state of the art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding.