Estimation of Fair Ranking Metrics with Incomplete Judgments

@article{Kirnap2021EstimationOF,
  title={Estimation of Fair Ranking Metrics with Incomplete Judgments},
  author={{\"O}mer Kirnap and Fernando Diaz and Asia J. Biega and Michael D. Ekstrand and Ben Carterette and Emine Yilmaz},
  journal={Proceedings of the Web Conference 2021},
  year={2021}
}
There is increasing attention to evaluating the fairness of search system ranking decisions. These metrics often consider the membership of items to particular groups, often identified using protected attributes such as gender or ethnicity. To date, these metrics typically assume the availability and completeness of protected attribute labels of items. However, the protected attributes of individuals are rarely present, limiting the application of fair ranking metrics in large scale systems. In… 

Comparing Fair Ranking Metrics

TLDR
This work provides a direct comparative analysis of selected fair ranking metrics, identifying their similarities and differences, and empirically compares them on the same experimental setup and data set.

Measuring Fairness of Rankings under Noisy Sensitive Information

TLDR
This work investigates the problem of measuring group fairness in ranking for a suite of divergence-based metrics in the presence of proxy labels and shows that under certain assumptions, fairness of a ranking can be measured from the proxy labels.

Measuring Fairness in Ranked Results: An Analytical and Empirical Comparison

TLDR
This paper describes several fair ranking metrics from the existing literature in a common notation, enabling direct comparison of their approaches and assumptions, and empirically compares them on the same experimental setup and data sets in the context of three information access tasks.

A Versatile Framework for Evaluating Ranked Lists in terms of Group Fairness and Relevance

TLDR
This work presents a simple and versatile framework for evaluating ranked lists in terms of group fairness and relevance, where the groups can be either nominal or ordinal in nature, and which can quantify intersectional group fairness based on multiple attribute sets.

CPFair: Personalized Consumer and Producer Fairness Re-ranking for Recommender Systems

TLDR
This work presents an optimization-based re-ranking approach that seamlessly integrates fairness constraints from both the consumer and producer-side in a joint objective framework, and demonstrates the role algorithms may play in minimizing data biases.

Measuring Fairness under Unawareness of Sensitive Attributes: A Quantification-Based Approach

TLDR
This work tackles the problem of measuring group fairness under unawareness of sensitive attributes, by using techniques from quantification, a supervised learning task concerned with directly providing group-level prevalence estimates (rather than individual-level class labels), and shows that quantification approaches are particularly suited to tackle the fairness-under-unawareness problem.

A Survey on the Fairness of Recommender Systems

TLDR
This survey reviews over 60 papers published in top conferences and journals, provides an elaborate taxonomy of fairness methods in recommendation, and outlines some promising future directions on fairness in recommendation.

A Survey of Research on Fair Recommender Systems

TLDR
It is found that in many research works in computer science very abstract problem operationalizations are prevalent, which circumvent the fundamental and important question of what represents a fair recommendation in the context of a given application.

FairRoad: Achieving Fairness for Recommender Systems with Optimized Antidote Data

TLDR
This paper proposes a new approach called fair recommendation with optimized antidote data (FairRoad), which aims to improve the fairness performance of recommender systems through the construction of a small and carefully crafted antidote dataset.

Probabilistic Permutation Graph Search: Black-Box Optimization for Fairness in Ranking

TLDR
A novel way of representing permutation distributions, based on the notion of permutation graphs, is presented, which improves over PL for optimizing fairness metrics for queries with one session and is suitable for both deterministic and stochastic rankings.

References

FARE: Diagnostics for Fair Ranking using Pairwise Error Metrics

TLDR
This work designs a fair auditing mechanism which captures group treatment throughout the entire ranking, generating in-depth yet nuanced diagnostics, and demonstrates the efficacy of the error metrics using real-world scenarios, exposing trade-offs among fairness criteria and providing guidance in the selection of fair-ranking algorithms.

Ranking with Fairness Constraints

TLDR
This work studies a variant of the traditional ranking problem in which the objective satisfies properties that appear in common ranking metrics such as Discounted Cumulative Gain, Spearman's rho, or Bradley-Terry.

Evaluating Stochastic Rankings with Expected Exposure

TLDR
A general evaluation methodology based on expected exposure is proposed, allowing a system, in response to a query, to produce a distribution over rankings instead of a single fixed ranking.

Fairness of Exposure in Rankings

TLDR
This work proposes a conceptual and computational framework that allows the formulation of fairness constraints on rankings in terms of exposure allocation, and develops efficient algorithms for finding rankings that maximize the utility for the user while provably satisfying a specifiable notion of fairness.

Equity of Attention: Amortizing Individual Fairness in Rankings

TLDR
This work formulates the challenge of achieving amortized individual fairness subject to constraints on ranking quality as an online optimization problem, solves it as an integer linear program, and demonstrates that the method can improve individual fairness while retaining high ranking quality.

FA*IR: A Fair Top-k Ranking Algorithm

TLDR
This work defines and solves the Fair Top-k Ranking problem, and presents an efficient algorithm, which is the first algorithm grounded in statistical tests that can mitigate biases in the representation of an under-represented group along a ranked list.

Fairness in Recommendation Ranking through Pairwise Comparisons

TLDR
This paper offers a set of novel metrics for evaluating algorithmic fairness concerns in recommender systems and shows how measuring fairness based on pairwise comparisons from randomized experiments provides a tractable means to reason about fairness in rankings from recommender systems.

Overview of the TREC 2019 Fair Ranking Track

TLDR
An overview of the TREC Fair Ranking track is presented, including the task definition, descriptions of the data and the annotation process, as well as a comparison of the performance of submitted systems.

Rank-biased precision for measurement of retrieval effectiveness

TLDR
A new effectiveness metric, rank-biased precision, is introduced that is derived from a simple model of user behavior, is robust if answer rankings are extended to greater depths, and allows accurate quantification of experimental uncertainty, even when only partial relevance judgments are available.

A statistical method for system evaluation using incomplete judgments

TLDR
This work considers the problem of large-scale retrieval evaluation, and proposes a statistical method for evaluating retrieval systems using incomplete judgments based on random sampling, which produces unbiased estimates of the standard measures themselves.