An experimental comparison of click position-bias models

Nick Craswell, Onno Zoeter, Michael J. Taylor, Bill Ramsey. WSDM '08.
Search engine click logs provide an invaluable source of relevance information, but this information is biased. A key source of bias is presentation order: the probability of click is influenced by a document's position in the results page. This paper focuses on explaining that bias, modelling how probability of click depends on position. We propose four simple hypotheses about how position bias might arise. We carry out a large data-gathering effort, where we perturb the ranking of a major… 
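The dependence of click probability on position can be illustrated with a small sketch. The four hypotheses themselves are not listed in the truncated abstract above, so the code below does not reproduce them. It assumes one common factorization, in which a click requires that the user both examines the rank and finds the result attractive. The names EXAMINATION_PROB, ATTRACTIVENESS, click_prob and debiased are hypothetical, and every number is made up for illustration.

```python
# All numbers below are made up for illustration; none come from the paper.
EXAMINATION_PROB = [1.0, 0.6, 0.4, 0.3]  # assumed P(user examines rank r), r = 1..4
ATTRACTIVENESS = {"doc_a": 0.5, "doc_b": 0.5}  # assumed P(click | examined)


def click_prob(doc: str, rank: int) -> float:
    """P(click) = P(examined at rank) * P(click | examined); rank is 1-based."""
    return EXAMINATION_PROB[rank - 1] * ATTRACTIVENESS[doc]


def debiased(ctr: float, rank: int) -> float:
    """Divide a click-through rate by the assumed examination probability."""
    return ctr / EXAMINATION_PROB[rank - 1]


if __name__ == "__main__":
    # Two equally attractive documents shown at different ranks.
    for doc, rank in [("doc_a", 1), ("doc_b", 2)]:
        ctr = click_prob(doc, rank)
        print(doc, rank, round(ctr, 3), round(debiased(ctr, rank), 3))
```

Under these assumptions the two documents are equally attractive, yet the one shown at rank 2 receives fewer clicks. Dividing by the assumed examination probability recovers the same value for both, which is the kind of correction a position-bias model is meant to support.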

Revisiting the Examination Hypothesis with Query Specific Position Bias

A model for analyzing query-specific position bias from click data is presented, and this model consistently outperforms both EH and UBM on widely used measures such as relative error and cross entropy.

A user browsing model to predict search engine click data from past observations.

It is confirmed that a user almost always sees the document directly after a clicked document, and an explanation is given for why documents situated just after a very relevant document are clicked more often.

Temporal click model for sponsored search

This is the first attempt in the literature to estimate positional bias, externalities and unbiased user-perceived ad quality from user click logs in a combined model, and it is shown that TCM outperforms two other competitive methods at click prediction.

Beyond position bias: examining result attractiveness as a source of presentation bias in clickthrough data

This study distinguishes itself from prior work by aiming to detect systematic biases in click behavior due to attractive summaries inflating perceived relevance, and shows substantial evidence of presentation bias in clicks towards results with more attractive titles.

Estimating Clickthrough Bias in the Cascade Model

This work shows that the existing counterfactual estimators fail to capture one type of bias, specifically, the effect on click-through rates due to the relevance of documents ranked above, and proposes a modification to the existing estimator that takes into account this bias.

Characterizing search intent diversity into click models

A new intent hypothesis is proposed as a complement to the examination hypothesis and is used to characterize the bias between the user search intent and the query in each search session.

A noise-aware click model for web search

A Noise-aware Click Model (NCM) is proposed that characterizes the noise degree of a click, which indicates the quality of the click for inferring relevance, and it is shown that the lower the click noise is, the more important the click is for relevance inference.

Constructing click models for search users

To understand if and how much a user click on a result document implies true relevance, one has to take into account different factors (usually named behavior biases), in addition to the factor of relevance, that may affect user click behaviors.

Position Bias Estimation for Unbiased Learning to Rank in Personal Search

This paper proposes a regression-based Expectation-Maximization (EM) algorithm that is based on a position bias click model and that can handle highly sparse clicks in personal search and compares the pointwise and pairwise learning-to-rank models.

Efficient multiple-click models in web search

This paper presents two multiple-click models: the independent click model which is reformulated from previous work, and the dependent click model (DCM) which takes into consideration dependencies between multiple clicks.



Predicting clicks: estimating the click-through rate for new ads

This work shows that features of ads, terms, and advertisers can be used to learn a model that accurately predicts the click-through rate for new ads, and that using this model improves the convergence and performance of an advertising system.

Learning user interaction models for predicting web search result preferences

This work presents a real-world study of modeling the behavior of web search users to predict web search result preferences and generalizes the approach to model user behavior beyond clickthrough, which results in higher preference prediction accuracy than models based on clickthrough information alone.

A Statistical Model of Query Log Generation

It is shown that it is possible to quantify this influence of a user's click and consequently estimate the “un-biased” popularity of documents within a search engine's list of results.

Web Search Engine Evaluation Using Clickthrough Data and a User Model

It is illustrated with a toy model that, once the user behavior is agreed upon, the human assessment can be eliminated and the engine performance can be evaluated based on the clickthrough data of past users.

Improving search engines by query clustering

A framework for clustering Web search engine queries whose aim is to identify groups of queries used to search for similar information on the Web is presented, based on a novel term vector model of queries that integrates user selections and the content of selected documents extracted from the logs of a search engine.

Accurately interpreting clickthrough data as implicit feedback

It is concluded that clicks are informative but biased, and while this makes the interpretation of clicks as absolute relevance judgments difficult, it is shown that relative preferences derived from clicks are reasonably accurate on average.

Optimizing search engines using clickthrough data

The goal of this paper is to develop a method that utilizes clickthrough data for training, namely the query-log of the search engine in connection with the log of links the users clicked on in the presented ranking.

Minimally Invasive Randomization for Collecting Unbiased Preferences from Clickthrough Logs

A simple method is introduced to modify the presentation of search results that provably gives relevance judgments that are unaffected by presentation bias under reasonable assumptions and can be guaranteed to converge to an ideal ranking given sufficient data.

Shuffling a Stacked Deck: The Case for Partially Randomized Ranking of Search Engine Results

It is shown that a modest amount of randomness leads to improved search results, in the context of an economic objective function based on aggregate result quality amortized over time.