Practical Lessons from Predicting Clicks on Ads at Facebook

Xinran He, Junfeng Pan, Ou Jin, Tianbing Xu, Bo Liu, Tao Xu, Yanxin Shi, Antoine Atallah, Ralf Herbrich, Stuart Bowers, and Joaquin Quiñonero Candela. International Workshop on Data Mining for Online Advertising.
Online advertising allows advertisers to bid and pay only for measurable user responses, such as clicks on ads. Key result: picking the optimal handling of data freshness, learning rate schema, and data sampling improves the model slightly, though much less than adding a high-value feature or picking the right model to begin with.

Modern Models for Learning Large-Scale Highly Skewed Online Advertising Data

A comprehensive summary of the state-of-the-art machine learning models (decision-tree-based models, regularized logistic regression, online learning, and factorization machines) that are often used in industry to estimate click-through rate and conversion rate.

User Response Learning for Directly Optimizing Campaign Performance in Display Advertising

This paper reformulates a common logistic regression CTR model by putting it back into its subsequent bidding context: rather than minimizing the prediction error, the model parameters are learned directly by optimizing campaign profit.

User Response Prediction in Online Advertising

A taxonomy is proposed to categorize state-of-the-art user response prediction methods, focusing primarily on current progress in machine learning methods used on different online platforms; applications of user response prediction, benchmark datasets, and open-source code in the field are also reviewed.

Click-through Prediction for Advertising in Twitter Timeline

A learning-to-rank method is proposed that not only addresses the sparsity of training signals but can also be trained and updated online; its superiority over the current production model adopted by Twitter is demonstrated.

Research on Click-Through Rate Prediction in Display Advertising Based on Machine Learning

This paper finds that the logistic regression model, the random forest model, and the gradient boosting decision tree model are the most suitable machine learning models for predicting advertising click-through rate.

Click Maximization in Online Social Networks Using Optimal Choice of Targeted Interests

A greedy algorithm and a genetic algorithm are proposed to find near-optimal combinations of conceptual nodes in polynomial time, with the genetic algorithm nearly matching the optimal solution.

Automatic learning explicit and implicit feature interactions for click-through rate prediction

AFN is proposed, which explicitly models feature combinations of different orders using multi-headed self-attentive networks with different levels of residual connectivity and introduces an adaptive factorization network to learn crossover features of arbitrary order.

An Embedded Model XG-FwFMs for Click-Through Rate

An embedded model named XG-FwFMs is proposed that requires less parameter computation and guards against over-fitting, and that has better prediction accuracy, parameter sensitivity, and effectiveness than traditional nonlinear models.

Attention Convolutional Neural Network for Advertiser-level Click-through Rate Forecasting

This work proposes a novel context-aware attention convolutional neural network (CACNN), which can capture the high non-linearity and local information of the time series, as well as the underlying correlation between the CTR time series and the contextual information.

Feature Engineering of Click-through-rate Prediction for Advertising

This paper proposes feature engineering methods based on gradient boosting decision trees (GBDT) and Bayesian smoothing to obtain a feature that carries more useful information and is less sparse.



Ad click prediction: a view from the trenches

The goal of this paper is to highlight the close relationship between theoretical advances and practical engineering in this industrial setting, and to show the depth of challenges that appear when applying traditional machine learning methods in a complex dynamic system.

Predicting clicks: estimating the click-through rate for new ads

This work shows that features of ads, terms, and advertisers can be used to learn a model that accurately predicts the click-through rate for new ads, and that using this model improves the convergence and performance of an advertising system.

Predictive model performance: offline and online evaluations

A new model evaluation paradigm is designed that simulates the online behavior of predictive models, and the results are highly promising for click prediction models in search advertising.

Web-Scale Bayesian Click-Through rate Prediction for Sponsored Search Advertising in Microsoft's Bing Search Engine

A new Bayesian click-through rate (CTR) prediction algorithm used for Sponsored Search in Microsoft's Bing search engine is described, based on a probit regression model that maps discrete or real-valued input features to probabilities.

Data warehousing and analytics infrastructure at facebook

This paper presents how Scribe, Hadoop and Hive together form the cornerstones of the log collection, storage and analytics infrastructure at Facebook and enabled us to implement a data warehouse that stores more than 15PB of data and loads more than 60TB of new data every day.

Greedy function approximation: A gradient boosting machine.

A general gradient descent boosting paradigm is developed for additive expansions based on any fitting criterion, and specific algorithms are presented for least-squares, least absolute deviation, and Huber-M loss functions for regression, and multiclass logistic likelihood for classification.

An Empirical Evaluation of Thompson Sampling

Empirical results using Thompson sampling on simulated and real data are presented, and it is shown that it is highly competitive and should be part of the standard baselines to compare against.

Photon: fault-tolerant and scalable joining of continuous data streams

The architecture of Photon is described, a geographically distributed system for joining multiple continuously flowing streams of data in real-time with high scalability and low latency, where the streams may be unordered or delayed.

Processing Sliding Window Multi-Joins in Continuous Queries over Data Streams

Adaptive Algorithms and Stochastic Approximations

The juxtaposition of these two expressions in the title reflects the ambition of the authors to produce a reference work, both for engineers who use adaptive algorithms and for probabilists or statisticians who would like to study stochastic approximations in terms of problems arising from real applications.