Online controlled experiments at large scale

@inproceedings{Kohavi2013OnlineCE,
  title={Online controlled experiments at large scale},
  author={Ron Kohavi and Alex Deng and Brian Frasca and Toby Walker and Ya Xu and Nils Pohlmann},
  booktitle={Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining},
  year={2013}
}
  • Ron Kohavi, Alex Deng, Brian Frasca, Toby Walker, Ya Xu, Nils Pohlmann
  • Published 11 August 2013
  • Computer Science
  • Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Web-facing companies, including Amazon, eBay, Etsy, Facebook, Google, Groupon, Intuit, LinkedIn, Microsoft, Netflix, Shop Direct, StumbleUpon, Yahoo, and Zynga, use online controlled experiments to guide product development and accelerate innovation. [...] Key result: The system has also identified many negative features that we avoided deploying, despite key stakeholders' early excitement, saving us similar large amounts.

Top Challenges from the first Practical Online Controlled Experiments Summit
TLDR
This is the first paper to present the top challenges faced across the industry in running OCEs at scale, along with some common solutions.
A/B Testing at Scale: Accelerating Software Innovation
TLDR
The goal in this tutorial is to teach attendees how to scale experimentation for their teams, products, and companies, leading to better data-driven decisions and to inspire more academic research in the relatively new and rapidly evolving field of online controlled experimentation.
Seven rules of thumb for web site experimenters
TLDR
Seven rules of thumb for experimenters are shared that have broad applicability in web optimization and analytics outside of controlled experiments, yet they are not provably correct, and in some cases exceptions are known.
From Infrastructure to Culture: A/B Testing Challenges in Large Scale Social Networks
TLDR
The experimentation platform at LinkedIn is described in depth, including how it is built to handle each step of the A/B testing process, from designing and deploying experiments to analyzing them.
Online Controlled Experiments and A/B Tests
TLDR
Online controlled experiments are now considered an indispensable tool, and their use is growing for startups and smaller websites, especially in combination with Agile software development.
A Dirty Dozen: Twelve Common Metric Interpretation Pitfalls in Online Controlled Experiments
TLDR
This paper shares twelve common metric interpretation pitfalls, illustrating each pitfall with a puzzling example from a real experiment, and describes processes, metric design principles, and guidelines that can be used to detect and avoid the pitfall.
APONE: Academic Platform for ONline Experiments
TLDR
APONE, an Academic Platform for ONline Experiments, is developed and open sourced; it uses PlanOut, a framework and high-level language for specifying online experiments, and offers Web services and a Web GUI to easily create, manage, and monitor them.
A/B Testing at Scale: Accelerating Software Innovation
TLDR
This tutorial will introduce the overall A/B testing methodology, walk through use cases using real examples, and then focus on practical and research challenges in scaling experimentation.
Online randomized controlled experiments at scale: lessons and extensions to medicine
TLDR
Key scaling lessons learned in the technology field are presented, including a focus on metrics, an overall evaluation criterion, and thousands of metrics for insights and debugging that are automatically computed for every experiment.
Trustworthy Online Controlled Experiments
TLDR
This practical guide by experimentation leaders at Google, LinkedIn, and Microsoft will teach you how to accelerate innovation using trustworthy online controlled experiments, or A/B tests, to improve the way you make data-driven decisions.

References

SHOWING 1-10 OF 85 REFERENCES
Improving the sensitivity of online controlled experiments by utilizing pre-experiment data
TLDR
This work proposes an approach (CUPED) that utilizes data from the pre-experiment period to reduce metric variability and hence achieve better sensitivity in experiments, applicable to a wide variety of key business metrics.
Online controlled experiments: introduction, learnings, and humbling statistics
The web provides an unprecedented opportunity to accelerate innovation by evaluating ideas quickly and accurately using controlled experiments (e.g., A/B tests and their generalizations). Whether for
Trustworthy online controlled experiments: five puzzling outcomes explained
TLDR
The topics covered include: the OEC (Overall Evaluation Criterion), click tracking, effect trends, experiment length and power, and carryover effects, which should help readers increase the trustworthiness of the results coming out of controlled experiments.
Online Experimentation at Microsoft
TLDR
The goal of this paper is to share lessons and challenges focused more on the cultural aspects and the value of controlled experiments.
Controlled experiments on the web: survey and practical guide
TLDR
This work provides a practical guide to conducting online experiments, and shares key lessons that will help practitioners in running trustworthy controlled experiments, including statistical power, sample size, and techniques for variance reduction.
Seven pitfalls to avoid when running controlled experiments on the web
TLDR
The pitfalls include a wide range of topics, such as assuming that common statistical formulas used to calculate standard deviation and statistical power can be applied and ignoring robots in analysis (a problem unique to online settings).
Overlapping experiment infrastructure: more, better, faster experimentation
TLDR
Google's overlapping experiment infrastructure is described, and the associated tools and educational processes required to use it effectively are discussed, which can be generalized and applied by any entity interested in using experimentation to improve search engines and other web applications.
Unexpected results in online controlled experiments
TLDR
This work shares several real examples of unexpected results and lessons learned from online controlled experiments, which are now used frequently, utilize software capabilities like ramp-up (exposure control), and run on large server farms with millions of users.
Do It Wrong Quickly: How the Web Changes the Old Marketing Rules
"What's the one thing companies care about? Conversion. Getting potential customers to convert into real, actual, customers. But how do you do that in a world of Facebook, Google, YouTube, blogs, and
SCOPE: parallel databases meet MapReduce
TLDR
A distributed computation system, Structured Computations Optimized for Parallel Execution (SCOPE), is presented; it targets massive data analysis and combines benefits from both traditional parallel databases and MapReduce execution engines to allow easy programmability and deliver massive scalability and high performance through advanced optimization.