• Corpus ID: 231918028

Fairness, Semi-Supervised Learning, and More: A General Framework for Clustering with Stochastic Pairwise Constraints

@article{Brubach2021FairnessSL,
  title={Fairness, Semi-Supervised Learning, and More: A General Framework for Clustering with Stochastic Pairwise Constraints},
  author={Brian Brubach and Darshan Chakrabarti and John P. Dickerson and Aravind Srinivasan and Leonidas Tsepenekas},
  journal={ArXiv},
  year={2021},
  volume={abs/2103.02013}
}
Metric clustering is fundamental in areas ranging from Combinatorial Optimization and Data Mining, to Machine Learning and Operations Research. However, in a variety of situations we may have additional requirements or knowledge, distinct from the underlying metric, regarding which pairs of points should be clustered together. To capture and analyze such scenarios, we introduce a novel family of stochastic pairwise constraints, which we incorporate into several essential clustering…

A New Notion of Individually Fair Clustering: $\alpha$-Equitable $k$-Center
TLDR
This work introduces a novel definition of fairness for clustering problems, and provides efficient and easily implementable approximation algorithms for the k-center objective, which in certain cases also enjoy bounded PoF guarantees.
A New Notion of Individually Fair Clustering: α-Equitable k-Center
TLDR
This work introduces a novel definition of individual fairness for clustering problems, and provides efficient and easily implementable approximation algorithms for the k-center objective, which in certain cases also enjoy bounded PoF guarantees.
Fair Clustering Under a Bounded Cost
TLDR
This paper considers two fairness objectives, the group utilitarian objective and the group egalitarian objective, as well as the group leximin objective, which generalizes the group egalitarian objective. It derives fundamental lower bounds on the approximation of the utilitarian and egalitarian objectives and introduces algorithms with provable guarantees for them.
Semi-FairVAE: Semi-supervised Fair Representation Learning with Adversarial Variational Autoencoder
TLDR
A bias-aware model to capture inherent bias information on sensitive attributes by accurately predicting sensitive attributes from input data, and a bias-free model to learn debiased fair representations by using adversarial learning to remove bias information from them are used.

References

A Pairwise Fair and Community-preserving Approach to k-Center Clustering
TLDR
This work formally defines two new types of fairness in the clustering setting, pairwise fairness and community preservation, and devises an approach for extending existing $k$-center algorithms to satisfy these fairness constraints.
Coresets for Clustering with Fairness Constraints
TLDR
An approach to clustering with fairness constraints that involve multiple, non-disjoint types, that is also scalable and achieves a speed-up over recent fair clustering algorithms by incorporating the first known coreset construction for the fair clustering problem with the k-median objective.
Fair Algorithms for Clustering
TLDR
This work significantly generalizes the seminal work of Chierichetti et al. and transforms any vanilla clustering solution into a fair one incurring only a slight loss in quality.
A SAT-based Framework for Efficient Constrained Clustering
TLDR
This paper shows how both instance and cluster-level constraints can be expressed as instances of the 2SAT problem and how multiple calls to a 2SAT solver can be used to construct algorithms that are guaranteed to satisfy all the constraints and converge to a global optimum for a number of intuitive objective functions.
Probabilistic Fair Clustering
TLDR
This paper presents clustering algorithms in this more general setting with approximation ratio guarantees and addresses the problem of "metric membership", where different groups have a notion of order and distance.
Distributional Individual Fairness in Clustering
TLDR
This paper adopts the individual fairness notion, which mandates that similar individuals should be treated similarly for clustering problems, and introduces a framework for assigning individuals, embedded in a metric space, to probability distributions over a bounded number of cluster centers.
On the cost of essentially fair clusterings
TLDR
A relaxed fairness notion under which bicriteria constant-factor approximations for all of the classical clustering objectives are given, which can be established belatedly, in a situation where the centers are already fixed.
Fair Clustering Through Fairlets
TLDR
It is shown that any fair clustering problem can be decomposed into first finding good fairlets, and then using existing machinery for traditional clustering algorithms, and while finding good fairlets can be NP-hard, they can be obtained by efficient approximation algorithms based on minimum cost flow.
An Improved Approximation for k-Median and Positive Correlation in Budgeted Optimization
TLDR
This work improves upon Li-Svensson’s approximation ratio for k-median by refining various aspects of their approach, and develops algorithms that guarantee the known properties of dependent rounding but also have nearly best-possible behavior (near-independence, which generalizes positive correlation) on “small” subsets of the variables.
Clustering without Over-Representation
TLDR
This paper obtains an algorithm that has provable guarantees of performance and a simpler combinatorial algorithm for the special case of the problem where no color has an absolute majority in any cluster.