Scaling TensorFlow to 300 million predictions per second

@inproceedings{Hartman2021ScalingTT,
  title={Scaling TensorFlow to 300 million predictions per second},
  author={Jan Hartman and Davorin Kopic},
  booktitle={Proceedings of the 15th ACM Conference on Recommender Systems},
  year={2021}
}
  • Jan Hartman, Davorin Kopic
  • Published 13 September 2021
  • Computer Science
  • Proceedings of the 15th ACM Conference on Recommender Systems
We present the process of transitioning machine learning models to the TensorFlow framework at a large scale in an online advertising ecosystem. In this talk, we address the key challenges we faced and describe how we tackled them, notably implementing the models in TensorFlow (TF) and serving them efficiently with low latency using various optimization techniques.

Exploration with Model Uncertainty at Extreme Scale in Real-Time Bidding

TLDR
A scalable and efficient system for exploring the supply landscape in real-time bidding, based on the predictive uncertainty of the models used for click-through rate prediction, that works in a high-throughput, low-latency environment.

References

DeepFM: A Factorization-Machine based Neural Network for CTR Prediction

TLDR
This paper shows that it is possible to derive an end-to-end learning model that emphasizes both low- and high-order feature interactions, and combines the power of factorization machines for recommendation and deep learning for feature learning in a new neural network architecture.

Ad click prediction: a view from the trenches

TLDR
The goal of this paper is to highlight the close relationship between theoretical advances and practical engineering in this industrial setting, and to show the depth of challenges that appear when applying traditional machine learning methods in a complex dynamic system.

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

TLDR
The TensorFlow interface and an implementation of that interface built at Google are described. The system has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields.

Practical Lessons from Predicting Clicks on Ads at Facebook

TLDR
This paper introduces a model that combines decision trees with logistic regression, outperforming either of these methods on its own by over 3%, an improvement with significant impact on overall system performance.

Factorization Machines

  • Steffen Rendle
  • Published 2010
  • Computer Science
  • 2010 IEEE International Conference on Data Mining
TLDR
Factorization Machines (FM) are introduced as a new model class that combines the advantages of Support Vector Machines (SVM) with factorization models and can mimic these models just by specifying the input data (i.e. the feature vectors).

The Go Programming Language

TLDR
Andrew Gerrand explains how Go intends to simplify problems that have been recurring themes as Google has scaled, and covers how Go was conceived and how the open source community contributes to it.

Display Advertising with Real-Time Bidding (RTB) and Behavioural Targeting

TLDR
Topics covered include user response prediction, bid landscape forecasting, bidding algorithms, revenue optimization, statistical arbitrage, dynamic pricing, and ad fraud detection, making this an invaluable text for researchers and practitioners alike.