Estimating Position Bias without Intrusive Interventions

@inproceedings{Agarwal2019EstimatingPB,
  title={Estimating Position Bias without Intrusive Interventions},
  author={Aman Agarwal and Ivan Zaitsev and Xuanhui Wang and Cheng Li and Marc Najork and Thorsten Joachims},
  booktitle={Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining},
  year={2019}
}
  • Published 12 December 2018
  • Computer Science, Economics
Presentation bias is one of the key challenges when learning from implicit feedback in search engines, as it confounds the relevance signal. First, we show how to harvest a specific type of intervention data from historic feedback logs of multiple different ranking functions, and show that this data is sufficient for consistent propensity estimation in the position-based model. Second, we propose a new extremum estimator that makes effective use of this data. In an empirical evaluation, we find…
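The intervention-harvesting idea in the abstract can be sketched in a few lines. Under the position-based model, the click rate on a (query, document) pair shown at rank k factors as p_k times relevance, so when two historic rankers placed the same pair at different ranks, the relevance term cancels in the ratio of their click-through rates. The logs, query and document ids, and numbers below are purely hypothetical:

```python
from collections import defaultdict

# Hypothetical click logs from two rankers: (query, doc, rank, clicked).
# Under the position-based model, P(click) = p_rank * relevance(query, doc).
logs_a = [("q1", "d1", 1, 1), ("q1", "d1", 1, 1), ("q1", "d1", 1, 0)]
logs_b = [("q1", "d1", 2, 1), ("q1", "d1", 2, 0), ("q1", "d1", 2, 0)]

def ctr(logs):
    """Click-through rate for each (query, doc, rank) triple."""
    clicks, views = defaultdict(int), defaultdict(int)
    for q, d, rank, c in logs:
        clicks[(q, d, rank)] += c
        views[(q, d, rank)] += 1
    return {key: clicks[key] / views[key] for key in views}

ctr_a, ctr_b = ctr(logs_a), ctr(logs_b)
# The same (query, doc) pair shown at two different ranks is a harvested
# intervention: relevance cancels in the ratio, leaving p_1 / p_2.
propensity_ratio = ctr_a[("q1", "d1", 1)] / ctr_b[("q1", "d1", 2)]
```

With enough such naturally occurring rank swaps across many rankers, all relative propensities p_k / p_1 can be estimated without running a live randomization experiment.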

Counterfactual Learning to Rank using Heterogeneous Treatment Effect Estimation

This work employs heterogeneous treatment effect estimation techniques to estimate position bias when intervention click data is limited and uses such estimations to debias the observed click distribution and re-draw a new de-biased data set, which can be used for any LTR algorithms.

Doubly-Robust Estimation for Correcting Position-Bias in Click Feedback for Unbiased Learning to Rank

This paper introduces a novel DR estimator that is the first DR approach specifically designed for position-bias, and contributes both increases in state-of-the-art performance and the most robust theoretical guarantees of all known LTR estimators.

Intervention Harvesting for Context-Dependent Examination-Bias Estimation

A Contextual Position-Based Model (CPBM) where the examination bias may also depend on a context vector describing the query and the user is proposed, and an effective estimator for the CPBM based on intervention harvesting is proposed.

Doubly-Robust Estimation for Unbiased Learning-to-Rank from Position-Biased Click Feedback

Clicks on rankings suffer from position bias: generally, items at lower ranks are less likely to be examined, and thus clicked, by users, in spite of their actual preferences between items.

Ranker-agnostic Contextual Position Bias Estimation

This paper introduces a method for modeling the probability of an item being seen in different contexts, e.g., for different users, with a single estimator, and indicates that the method introduced outperforms other existing position bias estimators in terms of relative error when the examination probability varies across queries.

Position Bias Estimation for Unbiased Learning-to-Rank in eCommerce Search

A novel method to directly estimate propensities that does not use any intervention in live search or rely on modeling relevance, and that can be applied to any search engine for which the rank of the same document may naturally change over time for the same query.

Direct Estimation of Position Bias for Unbiased Learning-to-Rank without Intervention

A novel method to directly estimate propensities that does not use any intervention in live search or rely on modeling relevance, and that can be applied to any search engine for which the rank of the same document may naturally change over time for the same query.

Reaching the End of Unbiasedness: Uncovering Implicit Limitations of Click-Based Learning to Rank

The inverted approach reveals that there are indeed implicit limitations to the counterfactual LTR approach: it is impossible for existing approaches to provide unbiasedness guarantees for all plausible click behavior models.

Adapting Interactional Observation Embedding for Counterfactual Learning to Rank

This work uses the embedding method to develop an Interactional Observation-Based Model (IOBM) and argues that, while there exist complex observed and unobserved confounders for observation/click interactions, it is sufficient to use the embedding as a proxy confounder to uncover the relevant information for the prediction of the observation propensity.

Propensity-Independent Bias Recovery in Offline Learning-to-Rank Systems

A new counterfactual method is proposed that uses a two-stage correction approach and jointly addresses selection and position bias in learning-to-rank systems without relying on propensity scores; it is better than state-of-the-art propensity-independent methods and either better than or comparable to methods that make the strong assumption that the propensity model is known.

Effective Evaluation Using Logged Bandit Feedback from Multiple Loggers

This paper finds that the standard Inverse Propensity Score (IPS) estimator suffers especially when logging and target policies diverge -- to a point where throwing away data improves the variance of the estimator.

Position Bias Estimation for Unbiased Learning to Rank in Personal Search

This paper proposes a regression-based Expectation-Maximization (EM) algorithm that is based on a position bias click model and that can handle highly sparse clicks in personal search and compares the pointwise and pairwise learning-to-rank models.
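As a concrete illustration of EM-based propensity estimation, here is a minimal sketch of the plain (non-regression) EM procedure for the position-based model, where a click is modeled as examination times relevance; the log entries, initial values, and iteration count are all hypothetical:

```python
from collections import defaultdict

# Minimal EM for the position-based model (PBM): a click happens iff the
# result is examined (prob. p[k] at rank k) AND relevant (prob. r[(q, d)]).
# This is the plain, non-regression variant; all data here is hypothetical.
def pbm_em(logs, n_ranks, n_iters=50):
    p = [0.5] * n_ranks                   # examination propensity per rank
    r = defaultdict(lambda: 0.5)          # relevance per (query, doc)
    for _ in range(n_iters):
        e_sum, e_cnt = [0.0] * n_ranks, [0] * n_ranks
        r_sum, r_cnt = defaultdict(float), defaultdict(int)
        for q, d, k, c in logs:
            if c:  # a click implies examined and relevant
                ee = rr = 1.0
            else:  # E step: posterior probabilities given no click
                denom = 1.0 - p[k] * r[(q, d)]
                ee = p[k] * (1.0 - r[(q, d)]) / denom
                rr = r[(q, d)] * (1.0 - p[k]) / denom
            e_sum[k] += ee; e_cnt[k] += 1
            r_sum[(q, d)] += rr; r_cnt[(q, d)] += 1
        # M step: each parameter becomes the mean of its posteriors
        p = [e_sum[k] / e_cnt[k] for k in range(n_ranks)]
        for key in r_sum:
            r[key] = r_sum[key] / r_cnt[key]
    return p, dict(r)

# One doc at rank 0, clicked on one of two impressions: by symmetry the
# estimates satisfy p = r with p * r = 0.5, so both approach 1/sqrt(2).
p_hat, r_hat = pbm_em([("q", "d", 0, 1), ("q", "d", 0, 0)], n_ranks=1)
```

The regression-based variant in the paper replaces the per-(query, doc) relevance table with a learned regression function, which is what makes the approach workable under the extreme click sparsity of personal search.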

Unbiased Learning-to-Rank with Biased Feedback

A counterfactual inference framework is presented that provides the theoretical basis for unbiased LTR via Empirical Risk Minimization despite biased data, and a Propensity-Weighted Ranking SVM is derived for discriminative learning from implicit feedback, where click models take the role of the propensity estimator.
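The propensity-weighting idea behind this framework can be sketched with a toy risk computation; the document ids, ranking, and propensity values below are hypothetical, and a simple rank-sum loss stands in for the full Ranking SVM objective:

```python
# Toy sketch of an inverse-propensity-scored (IPS) training signal for LTR.
# Each clicked document contributes its rank under the candidate ranking,
# weighted by 1 / (examination propensity at the position it was logged at).

def ips_rank_risk(ranking, clicked_with_propensity):
    """Propensity-weighted sum of ranks of clicked documents.

    ranking: doc ids in the order the candidate ranker would show them.
    clicked_with_propensity: (doc_id, propensity) pairs for logged clicks.
    """
    rank_of = {doc: i + 1 for i, doc in enumerate(ranking)}
    return sum(rank_of[doc] / p for doc, p in clicked_with_propensity)

# A click logged at a low position (small propensity) is up-weighted,
# correcting for the fact that such documents are rarely examined.
risk = ips_rank_risk(["d2", "d1", "d3"], [("d1", 1.0), ("d3", 0.25)])
```

Minimizing this weighted risk over rankings is, in expectation, equivalent to minimizing the risk that would be computed from fully observed relevance labels, which is the unbiasedness guarantee of the framework.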

The Self-Normalized Estimator for Counterfactual Learning

This paper identifies a severe problem of the counterfactual risk estimator typically used in batch learning from logged bandit feedback (BLBF) and proposes the self-normalized estimator as an alternative.
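A minimal sketch of the contrast between the vanilla IPS estimator and its self-normalized variant, on hypothetical rewards and importance weights:

```python
# Vanilla IPS vs. self-normalized IPS (SNIPS) on logged bandit feedback.
# The rewards and importance weights below are hypothetical.

def ips(rewards, weights):
    # weights[i] = pi_target(a_i | x_i) / pi_logging(a_i | x_i)
    return sum(r * w for r, w in zip(rewards, weights)) / len(rewards)

def snips(rewards, weights):
    # Normalizing by the realized weight mass keeps the estimate inside
    # the observed reward range and typically reduces variance.
    return sum(r * w for r, w in zip(rewards, weights)) / sum(weights)

rewards = [1.0, 0.0, 1.0]
weights = [2.0, 0.5, 1.5]
vanilla = ips(rewards, weights)     # 3.5 / 3: exceeds the max reward of 1.0
selfnorm = snips(rewards, weights)  # 3.5 / 4: stays within the reward range
```

The vanilla estimate here exceeds the largest observed reward, which illustrates the propensity-overfitting failure mode the self-normalized estimator is designed to avoid.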

Unbiased Learning to Rank with Unbiased Propensity Estimation

The Dual Learning Algorithm (DLA) is an automatic unbiased learning-to-rank framework: it directly learns unbiased ranking models from biased click data without any preprocessing, can adapt to changes in the bias distribution, and is applicable to online learning.

Counterfactual Learning-to-Rank for Additive Metrics and Deep Models

This work generalizes the counterfactual learning-to-rank approach to a broad class of additive rank metrics -- like Discounted Cumulative Gain (DCG) and Precision@k -- as well as non-linear deep network models, and develops two new learning methods that both directly optimize an unbiased estimate of DCG despite the bias in the implicit feedback data.

An experimental comparison of click position-bias models

A cascade model, where users view results from top to bottom and leave as soon as they see a worthwhile document, is the best explanation for position bias in early ranks.
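The cascade model's implicit position bias can be made concrete in a few lines: the probability of examining a rank is simply the probability that no earlier result was clicked. The click probabilities below are hypothetical:

```python
# Cascade model: the user scans top to bottom and stops at the first click,
# so a result is examined only if every earlier result went unclicked.

def cascade_examination(click_probs):
    probs, p_reach = [], 1.0
    for p_click in click_probs:
        probs.append(p_reach)          # probability the user reaches this rank
        p_reach *= 1.0 - p_click       # probability the scan continues past it
    return probs

# Examination decays with rank even though the model has no explicit
# per-position bias parameter.
exam = cascade_examination([0.5, 0.5, 0.5])
```

This is the sense in which the cascade model "explains" position bias at early ranks: the bias emerges from the sequential browsing behavior rather than from a per-rank examination parameter as in the position-based model.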

Batch learning from logged bandit feedback through counterfactual risk minimization

The empirical results show that the Counterfactual Risk Minimization (CRM) objective implemented in POEM (Policy Optimizer for Exponential Models) provides improved robustness and generalization performance compared to the state of the art; a decomposition of the POEM objective that enables efficient stochastic gradient optimization is also presented.

Offline Comparative Evaluation with Incremental, Minimally-Invasive Online Feedback

This work investigates the use of logged user interaction data---queries and clicks---for offline evaluation of new search systems in the context of counterfactual analysis and presents a methodology for incrementally logging interactions on previously-unseen documents for use in computation of an unbiased estimator of a new ranker's effectiveness.

Beyond position bias: examining result attractiveness as a source of presentation bias in clickthrough data

This study distinguishes itself from prior work by aiming to detect systematic biases in click behavior due to attractive summaries inflating perceived relevance, and shows substantial evidence of presentation bias in clicks towards results with more attractive titles.