Evaluating Stochastic Rankings with Expected Exposure

@inproceedings{Diaz2020EvaluatingSR,
  title={Evaluating Stochastic Rankings with Expected Exposure},
  author={Fernando Diaz and Bhaskar Mitra and Michael D. Ekstrand and Asia J. Biega and Ben Carterette},
  booktitle={Proceedings of the 29th ACM International Conference on Information \& Knowledge Management},
  year={2020}
}
We introduce the concept of expected exposure as the average attention ranked items receive from users over repeated samples of the same query. Furthermore, we advocate for the adoption of the principle of equal expected exposure: given a fixed information need, no item should receive more or less expected exposure than any other item of the same relevance grade. We argue that this principle is desirable for many retrieval objectives and scenarios, including topical diversity and fair ranking…
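The definition above can be illustrated with a minimal sketch. It assumes an RBP-style browsing model in which the item at 0-based rank k is examined with probability `patience ** k`; the function name `expected_exposure`, the `patience` value, and the toy rankings are hypothetical choices for illustration, not the authors' implementation.

```python
from collections import defaultdict


def expected_exposure(rankings, patience=0.5):
    """Average per-item exposure over a list of sampled rankings.

    Assumption: an RBP-style user model, where the item at 0-based
    rank k is examined with probability patience ** k.
    """
    totals = defaultdict(float)
    for ranking in rankings:
        for k, item in enumerate(ranking):
            totals[item] += patience ** k
    # Divide by the number of samples to get the expectation estimate.
    return {item: total / len(rankings) for item, total in totals.items()}


# Toy setup: items "a" and "b" share a relevance grade; "c" is less relevant.
deterministic = [["a", "b", "c"]] * 4                 # always the same order
stochastic = [["a", "b", "c"], ["b", "a", "c"]] * 2   # a and b swap half the time

det = expected_exposure(deterministic)
sto = expected_exposure(stochastic)
print(det)  # "a" receives more exposure than "b"
print(sto)  # "a" and "b" receive equal exposure
```

In this toy case the deterministic policy always favors "a" over the equally relevant "b", whereas the stochastic policy that swaps them half the time gives both the same expected exposure, which is what the principle of equal expected exposure asks for.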
Comparing Fair Ranking Metrics
TLDR
This work provides a direct comparative analysis that identifies similarities and differences among a selected set of fair ranking metrics, and empirically compares them on the same experimental setup and data set.
Neural methods for effective, efficient, and exposure-aware information retrieval
TLDR
This thesis presents novel neural architectures and methods motivated by the specific needs and challenges of IR tasks, and develops a framework to incorporate query term independence into an arbitrary deep model, enabling large-scale precomputation and the use of an inverted index for fast retrieval.
Top-K Contextual Bandits with Equity of Exposure
The contextual bandit paradigm provides a general framework for decision-making under uncertainty. It is theoretically well-defined and well-studied, and many personalisation use-cases can be cast as…
FAIR: Fairness-Aware Information Retrieval Evaluation
TLDR
This work proposes a new metric called Fairness-Aware IR (FAIR), develops an effective ranking algorithm that jointly optimizes user utility and fairness, shows how FAIR relates to existing metrics, and demonstrates the effectiveness of the FAIR-based algorithm.
Fairness in Ranking under Uncertainty
TLDR
It is shown how to compute rankings that optimally trade off approximate fairness against utility to the principal and an empirical analysis of the potential impact of the approach in simulation studies is presented.
Incentives for Item Duplication Under Fair Ranking Policies
TLDR
This work studies the behaviour of different fair ranking policies in the presence of duplicates, finding that fairness-aware ranking policies may conflict with diversity, due to their potential to incentivize duplication more than policies solely focused on relevance.
Measuring Group Advantage: A Comparative Study of Fair Ranking Metrics
TLDR
It is proved that under reasonable assumptions, popular metrics in the literature exhibit the same behavior and that optimizing for one optimizes for all, and a practical statistical test is designed to identify whether observed data is likely to exhibit predictable group bias.
Naver Labs Europe at TREC 2020 Fair Ranking Track
TLDR
This paper describes a two-step approach to minimizing unfairness over time with minimal loss of utility: a relevance probability estimator, followed by a controller that aims to bring the actual exposure close to the target exposure.
Fairness and Discrimination in Information Access Systems
TLDR
This monograph presents a taxonomy of the various dimensions of fair information access and surveys the literature to date on this new and rapidly growing topic.
Estimation of Fair Ranking Metrics with Incomplete Judgments
TLDR
This work proposes a robust and unbiased estimator that can operate even with a very limited number of labeled items, providing a reliable alternative to exhaustive or random data annotation.

References

Expected reciprocal rank for graded relevance
TLDR
This work presents a new editorial metric for graded relevance, called Expected Reciprocal Rank (ERR), which implicitly discounts documents that are shown below very relevant documents.
Quantifying the Impact of User Attention on Fair Group Representation in Ranked Lists
TLDR
This work introduces a novel metric for auditing group fairness in ranked lists, and shows that determining fairness of a ranked output necessitates knowledge (or a model) of the end-users of the particular service.
Evaluating diversified search results using per-intent graded relevance
TLDR
This work compares a wide range of traditional and diversified IR metrics after adding graded relevance assessments to the TREC 2009 Web track diversity task test collection, and shows that a family of metrics called D#-measures have several advantages over other metrics such as α-nDCG and Intent-Aware metrics.
Learning to Rank with Selection Bias in Personal Search
TLDR
It is empirically demonstrated that learning-to-rank that accounts for query-dependent selection bias yields significant improvements in search effectiveness through online experiments with one of the world's largest personal search engines.
Risky business: modeling and exploiting uncertainty in information retrieval
TLDR
A general framework for modeling uncertainty is presented and an asymmetric loss function with a single parameter that can model the level of risk the system is willing to accept is introduced, which can effectively adapt to users' different retrieval strategies.
Rank-biased precision for measurement of retrieval effectiveness
TLDR
A new effectiveness metric, rank-biased precision, is introduced that is derived from a simple model of user behavior, is robust if answer rankings are extended to greater depths, and allows accurate quantification of experimental uncertainty, even when only partial relevance judgments are available.
Ranking with Fairness Constraints
TLDR
This work studies a variant of the traditional ranking problem when the objective satisfies properties that appear in common ranking metrics such as Discounted Cumulative Gain, Spearman's rho or Bradley-Terry.
Shuffling a Stacked Deck: The Case for Partially Randomized Ranking of Search Engine Results
TLDR
It is shown that a modest amount of randomness leads to improved search results, in the context of an economic objective function based on aggregate result quality amortized over time.
BPR: Bayesian Personalized Ranking from Implicit Feedback
TLDR
This paper presents a generic optimization criterion BPR-Opt for personalized ranking that is the maximum posterior estimator derived from a Bayesian analysis of the problem and provides a generic learning algorithm for optimizing models with respect to BPR-Opt.
A Stochastic Treatment of Learning to Rank Scoring Functions
TLDR
This work analytically studies the proposed sampling method and demonstrates when and why it leads to model robustness, and empirically shows that the application of the proposed method to a class of ranking loss functions leads to significant model quality improvements.