• Corpus ID: 235659007

Improved Approximation Algorithms for Individually Fair Clustering

Ali Vakilian, Mustafa Yalçiner
We consider the k-clustering problem with ℓ_p-norm cost, which includes k-median, k-means and k-center, under an individual notion of fairness proposed by Jung et al. [2020]: given a set of points P of size n, a set of k centers induces a fair clustering if every point in P has a center among its n/k closest neighbors. Mahabadi and Vakilian [2020] presented a (p^{O(p)}, 7)-bicriteria approximation for fair clustering with ℓ_p-norm cost: every point finds a center within…
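The fairness condition in the abstract can be checked directly on a small instance. The following is a minimal sketch, not the paper's algorithm: it only verifies a given set of centers and evaluates the ℓ_p-norm cost. The function names (`fair_radius`, `is_alpha_fair`, `lp_cost`), the Euclidean metric, the choice to count a point as its own closest neighbor, and the relaxation factor `alpha` are all assumptions made for illustration.

```python
import math


def fair_radius(points, i, k):
    """Distance from points[i] to its ceil(n/k)-th closest point.

    Assumption: the point itself counts as its own closest neighbor.
    """
    n = len(points)
    m = math.ceil(n / k)
    dists = sorted(math.dist(points[i], q) for q in points)
    return dists[m - 1]


def is_alpha_fair(points, centers, k, alpha=1.0):
    """True if every point has a center within alpha times its fair radius.

    alpha = 1 corresponds to the exact condition in the abstract;
    alpha > 1 is a hypothetical relaxation of it.
    """
    for i, x in enumerate(points):
        d_center = min(math.dist(x, c) for c in centers)
        if d_center > alpha * fair_radius(points, i, k) + 1e-12:
            return False
    return True


def lp_cost(points, centers, p):
    """l_p-norm of the vector of point-to-nearest-center distances.

    p = 1 gives the k-median cost, p = 2 the square root of the
    k-means cost, and p = math.inf the k-center cost.
    """
    d = [min(math.dist(x, c) for c in centers) for x in points]
    if math.isinf(p):
        return max(d)
    return sum(v ** p for v in d) ** (1.0 / p)


if __name__ == "__main__":
    # Two well-separated groups of three points on a line, k = 2.
    pts = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
    print(is_alpha_fair(pts, [(1.0,), (11.0,)], k=2))  # one center per group
    print(is_alpha_fair(pts, [(0.0,), (1.0,)], k=2))   # both centers in one group
    print(lp_cost(pts, [(1.0,), (11.0,)], p=1))
```

Placing both centers in the same group leaves the other group without a center among its n/k closest neighbors, which is exactly what the check rejects.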

An Overview of Fairness in Clustering

This survey aims to give researchers an organized overview of the field, to motivate new and unexplored lines of research on fairness in clustering, and to bridge the gap by categorizing existing research on fair clustering.

Taxonomy of Fairness Concepts in Clustering

  • Computer Science
  • 2022
All different concepts of demographic fairness that have been studied in the context of clustering are presented.

Bicriteria Approximation Algorithms for Priority Matroid Median

Fairness considerations have motivated new clustering problems and algorithms in recent years. In this paper we consider the Priority Matroid Median problem which generalizes the Priority k-Median

FAL-CUR: Fair Active Learning using Uncertainty and Representativeness on Fair Clustering

A novel active learning strategy called Fair Active Learning using fair Clustering, Uncertainty, and Representativeness (FAL-CUR) that provides a high accuracy while maintaining fairness during the sample acquisition phase and outperforms state-of-the-art methods on well-known fair active learning problems.

Modification-Fair Cluster Editing

A modification fairness constraint is proposed which ensures that the number of edits incident to each subgroup is proportional to its size, and results show that the problem is NP-hard even if one may only insert edges within a subgroup.

Individual Preference Stability for Clustering

It is shown that deciding whether a given data set allows for an IP-stable clustering is NP-hard in general, and the design of efficient algorithms for IP-stable clusterings in some restricted metric spaces is explored.

Approximation Algorithms for Continuous Clustering and Facility Location Problems

It is shown that, for the continuous versions of some clustering problems, one can design approximation algorithms attaining a better factor than the β-factor blow-up mentioned above, and a technique based on the round-or-cut framework is described.

Constant-Factor Approximation Algorithms for Socially Fair k-Clustering

The performance of these algorithms is compared with existing bicriteria algorithms as well as exactly k center approximation algorithms on benchmark datasets, and it is found that these algorithms outperform existing methods in practice.

Measuring and mitigating voting access disparities: a study of race and polling locations in Florida and North Carolina

Voter suppression and associated racial disparities in access to voting are long-standing civil rights concerns in the United States. Barriers to voting have taken many forms over the decades. A

On Coresets for Fair Regression and Individually Fair Clustering

This paper defines coresets for Fair Regression with Statistical Parity (SP) constraints and for Individually Fair Clustering, and shows that to obtain such coresets it is sufficient to sample points with probabilities that depend on a combination of the sensitivity score and a carefully chosen term determined by the fairness constraints.



Fair k-Centers via Maximum Matching

This paper combines the best of each algorithm by presenting a linear-time algorithm with a guaranteed 3-approximation factor and provides empirical evidence of both the algorithm’s runtime and effectiveness.

Constant approximation for k-median and k-means with outliers via iterative rounding

A new iterative rounding framework for many clustering problems is presented, and α_1- and (α_1 + ε)-approximation algorithms for the matroid and knapsack median problems, respectively, are given, improving upon the previous best approximation ratios.

Fair Clustering via Equitable Group Representations

It is demonstrated how group representative k-median clustering notions are distinct from, and cannot be captured by, balance-based notions of fairness, and an empirical evaluation on various real-world data sets is provided.

(Individual) Fairness for k-Clustering

The $k$-median ($k$-means) cost of the solution is within a constant factor of the cost of an optimal fair $k$-clustering, and the solution approximately satisfies the fairness condition.

A Center in Your Neighborhood: Fairness in Facility Location

A fairness concept is formulated that takes local population densities into account, and an approximation algorithm is given that guarantees a factor of at most 2 in all metric spaces; the algorithm is applied to real-world address data, and matching lower bounds are proved in some metric spaces.

Fair k-Center Clustering for Data Summarization

This paper provides a simple approximation algorithm for the $k$-center problem under the fairness constraint with running time linear in the size of the data set and $k$.

The matroid median problem

A constant factor approximation algorithm is given for the matroid median problem and it is shown that this second phase LP is in fact integral; the integrality proof is based on a connection to matroid intersection.

Better Algorithms for Individually Fair k-Clustering

It is proved that modifying known LP rounding techniques yields a worst-case guarantee on the objective that is much better than the one in MV20 (Mahabadi and Vakilian [2020]); empirically, this objective is extremely close to the optimal.

Auditing for Discrimination in Algorithms Delivering Job Ads

A new methodology is developed for black-box auditing of algorithms for discrimination in the delivery of job advertisements; it distinguishes skew explainable by differences in qualifications from skew due to other factors, such as the ad platform’s optimization for engagement or the training of its algorithms on biased data.

Socially Fair k-Means Clustering

It is found that on benchmark datasets, Fair-Lloyd exhibits unbiased performance by ensuring that all groups have equal costs in the output k-clustering, while incurring a negligible increase in running time, thus making it a viable fair option wherever k-means is currently used.