Overlapping experiment infrastructure: more, better, faster experimentation

@article{Tang2010OverlappingEI,
  title={Overlapping experiment infrastructure: more, better, faster experimentation},
  author={Diane Tang and Ashish Agarwal and Deirdre O'Brien and Mike Meyer},
  journal={Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining},
  year={2010}
}
  • Diane Tang, Ashish Agarwal, Deirdre O'Brien, Mike Meyer
  • Published 25 July 2010
  • Computer Science
  • Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
At Google, experimentation is practically a mantra; we evaluate almost every change that potentially affects what our users experience. [...] While the paper specifically describes the experiment system and experimental processes we have in place at Google, we believe they can be generalized and applied by any entity interested in using experimentation to improve search engines and other web applications.
Cost-Aware Stage-Based Experimentation: Challenges and Emerging Results
TLDR
This paper aims to run experiments that optimize for profit while keeping the overall experimentation cost within given bounds, and describes the main concepts behind the method in a semi-formal notation.
Looking at Everything in Context
TLDR
This paper presents early work on the Habitat system, an extensible data hosting and management platform for evaluating integration techniques in situ, and describes an initial deployment in a neuroscience setting, including lessons learned in building the platform and community.
Experimentation in the Operating System: The Windows Experimentation Platform
  • P. Li, Pavel A. Dmitriev, +7 authors T. Thoresen
  • Computer Science
  • 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)
  • 2019
TLDR
This paper presents the Windows Experimentation Platform (WExp) and insights from implementing and running real-world experiments in the OS, and describes the architecture of WExp, focusing on unique considerations in its engineering.
Learning Sensitive Combinations of A/B Test Metrics
TLDR
The problem of finding a sensitive metric combination is formulated as a data-driven machine learning problem, two intuitive optimization approaches are proposed to address it, and considerable sensitivity improvements over the ground-truth metrics can be achieved.
Online controlled experiments at large scale
TLDR
This work discusses why negative experiments, which degrade the user experience short term, should be run, given the learning value and long-term benefits, and designs a highly scalable system able to handle data at massive scale: hundreds of concurrent experiments, each containing millions of users.
Effective Online Evaluation for Web Search
TLDR
A large part of this tutorial is devoted to modern, state-of-the-art techniques for conducting online experimentation efficiently; it invites software engineers, designers, analysts, and managers of web services and software products, as well as beginners, advanced specialists, and researchers, to learn how to make web service development effectively data-driven.
Automatic Detection and Diagnosis of Biased Online Experiments
We have seen a massive growth of online experiments at LinkedIn, and in industry at large. It is now more important than ever to create an intelligent A/B platform that can truly democratize A/B [...]
Optimised Scheduling of Online Experiments
TLDR
This paper formulates the novel problem of schedule optimisation for a queue of online experiments: given a limited number of user interactions available for experimentation, it proposes re-ordering the queue so that the number of successful experiments is maximised.
A Tool for Online Experiment-Driven Adaptation
TLDR
OEDA can be a useful vehicle for research in the area of automated experimentation, an emerging challenge where systems are capable of performing experiments on themselves in order to self-optimize.
Leaky Abstraction In Online Experimentation Platforms: A Conceptual Framework To Categorize Common Challenges
TLDR
A conceptual framework is put forward to explicitly categorize experimentation pitfalls in terms of which specific abstraction is leaking, thereby aiding implementers and users of these platforms to better understand and tackle the challenges they face.

References

Controlled experiments on the web: survey and practical guide
TLDR
This work provides a practical guide to conducting online experiments, and shares key lessons that will help practitioners in running trustworthy controlled experiments, including statistical power, sample size, and techniques for variance reduction.
Seven pitfalls to avoid when running controlled experiments on the web
TLDR
The pitfalls cover a wide range of topics, such as assuming that common statistical formulas used to calculate standard deviation and statistical power can be applied, and ignoring robots in analysis (a problem unique to online settings).
Optimizing search engines using clickthrough data
TLDR
The goal of this paper is to develop a method that utilizes clickthrough data for training, namely the query-log of the search engine in connection with the log of links the users clicked on in the presented ranking.
Predicting clicks: estimating the click-through rate for new ads
TLDR
This work shows that features of ads, terms, and advertisers can be used to learn a model that accurately predicts the click-through rate for new ads, and that using this model improves the convergence and performance of an advertising system.
Estimating rates of rare events at multiple resolutions
TLDR
On a real-world dataset consisting of half a billion impressions, it is demonstrated that even with 95% negative events in the training set, the method can effectively discriminate extremely rare events in terms of their click propensity.
All of Statistics: A Concise Course in Statistical Inference
TLDR
This book covers a much wider range of topics than a typical introductory text on mathematical statistics, and includes modern topics like nonparametric curve estimation, bootstrapping and classification, topics that are usually relegated to follow-up courses.
The Theory of the Design of Experiments
TLDR
This well-organized book can serve as a cornerstone in a graduate student’s exploration in the theoretical aspects of experimental design and is a valuable reference for statisticians working in medicine, agriculture, the physical sciences, and other areas of biometry and industry.
Wrap up & experimentation
  • Cs147l lecture
  • 2009
Sampling techniques.