Constant approximation for k-median and k-means with outliers via iterative rounding

@article{Krishnaswamy2018ConstantAF,
  title={Constant approximation for k-median and k-means with outliers via iterative rounding},
  author={Ravishankar Krishnaswamy and Shi Li and Sai Sandeep},
  journal={Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing},
  year={2018}
}
In this paper, we present a new iterative rounding framework for many clustering problems. Using this, we obtain an (α1 + ε ≤ 7.081 + ε)-approximation algorithm for k-median with outliers, greatly improving upon the large implicit constant approximation ratio of Chen. For k-means with outliers, we give an (α2 + ε ≤ 53.002 + ε)-approximation, which is the first O(1)-approximation for this problem. The iterative algorithm framework is very versatile; we show how it can be used to give α1- and (α1…
Structural Iterative Rounding for Generalized k-Median Problems
TLDR
Improved approximation algorithms for generalized k-median with outliers and knapsack median are given, allowing richer constraint sets in the iterative rounding and taking advantage of the structure of the resulting extreme points.
Greedy Sampling for Approximate Clustering in the Presence of Outliers
TLDR
This work shows that for k-means and k-center clustering, simple modifications to the well-studied greedy algorithms result in nearly identical guarantees, while additionally being robust to outliers.
Improved Algorithms for Clustering with Outliers
TLDR
This paper gave the first PTAS for the k-median problem with outliers in Euclidean space R^d for possibly high m and d, and introduced a (6+ε)-approximation algorithm for general metric spaces with running time O(n(β(1/ε)(k+m))^k) for some constant β > 1.
Outliers Detection Is Not So Hard: Approximation Algorithms for Robust Clustering Problems Using Local Search Techniques
TLDR
A new technique to analyze the approximation ratio of local search algorithms for k-median/k-means problems by introducing an adapted cluster that can capture useful information about outliers in the local and the global optimal solution.
Robust k-means++
TLDR
This work shows that using a mixture of D² sampling and uniform sampling, one can pick O(k/δ) candidate centers with the following guarantee: they contain some k centers that give an O(1)-approximation to the optimal robust k-means solution while discarding at most δn more points than the outliers discarded by the optimal solution.
An Improved Approximation Algorithm for the k-Means Problem with Penalties
The clustering problem has received much attention in various fields of computer science. However, in many applications, the existence of noisy data poses a big challenge for the clustering…
Improved Approximation Algorithms for Individually Fair Clustering
TLDR
This work extends the framework of [Charikar et al., 2002, Swamy, 2016] and devises a 16-approximation algorithm for facility location with ℓp-norm cost under a matroid constraint, which might be of independent interest, and suggests a reduction from individually fair clustering to clustering with a group fairness requirement proposed by Kleindessner et al.
On Sampling Based Algorithms for k-Means
TLDR
Making use of a constant-factor solution for the (classical or unconstrained) k-means problem, the results of Bhattacharya et al. are generalised and a constant-pass, polylog-space streaming PTAS for either of the two problems is designed.
Fault Tolerant Clustering with Outliers
TLDR
This work essentially reduces the Fault Tolerant Clustering with Outliers problem to the corresponding (non-fault-tolerant) Clustering with Outliers problem, for which constant approximations are known, and shows that it is bounded by O(1) for the k-center objective, whereas it is O(f) for k-median and sum of radii objectives.
Is Simple Uniform Sampling Efficient for Center-Based Clustering With Outliers: When and Why?
TLDR
This is the first work that systematically studies the effectiveness of uniform sampling from both theoretical and experimental aspects, and introduces a “significance” criterion and proves that the performance of the framework depends on the significance degree of the given instance.

References

Approximating k-median via pseudo-approximation
We present a novel approximation algorithm for k-median that achieves an approximation guarantee of 1+√3+ε, improving upon the decade-old ratio of 3+ε. Our approach is based on two components, each…
Better Guarantees for k-Means and Euclidean k-Median by Primal-Dual Algorithms
TLDR
A new primal-dual approach is presented that makes it possible to exploit the geometric structure of k-means and to satisfy the hard constraint that at most k clusters are selected without deteriorating the approximation guarantee.
A Dependent LP-Rounding Approach for the k-Median Problem
TLDR
This paper revisits the classical k-median problem and gives an efficient algorithm to construct a probability distribution on sets of k centers that matches the marginals specified by the optimal LP solution.
A Constant-Factor Approximation Algorithm for the k-Median Problem
TLDR
This work presents the first constant-factor approximation algorithm for the metric k-median problem, and improves upon the best previously known result of O(log k log log k), which was obtained by refining and derandomizing a randomized O(log n log log n)-approximation algorithm of Bartal.
A constant-factor approximation algorithm for the k-median problem (extended abstract)
TLDR
This work presents the first constant-factor approximation algorithm for the metric k-median problem, a polynomial-time algorithm that finds a feasible solution of objective function value within a factor of 6 of the optimum, and gives constant factor approximation algorithms for several natural extensions of the problem.
An Improved Approximation for k-Median and Positive Correlation in Budgeted Optimization
TLDR
This work improves upon Li-Svensson’s approximation ratio for k-median by developing an algorithm that improves upon various aspects of their work and develops algorithms that guarantee the known properties of dependent rounding but also have nearly best-possible behavior (near-independence, which generalizes positive correlation) on “small” subsets of the variables.
Local Search Methods for k-Means with Outliers
TLDR
This work proposes a simple local search-based algorithm for k-means clustering with outliers and proves that this algorithm achieves constant-factor approximate solutions and can be combined with known sketching techniques to scale to large data sets.
Approximation schemes for Euclidean k-medians and related problems
TLDR
An approximation scheme for the plane that for any c > 0 produces a solution of cost at most 1 + 1/c times the optimum and runs in time O(n) and generalizes to some problems related to k-median.
Improved Approximation Algorithms for Matroid and Knapsack Median Problems and Applications
TLDR
A variety of seemingly disparate facility-location problems considered in the literature (data placement problem, mobile facility location, k-median forest, metric uniform minimum-latency Uncapacitated Facility Location (UFL)) in fact reduce to the matroid median or two-matroid median problems, and thus obtain improved approximation guarantees for all these problems.
Clustering under approximation stability
TLDR
It is shown that for any constant c > 1, (c,ε)-approximation-stability of k-median or k-means objectives can be used to efficiently produce a clustering of error O(ε) with respect to the target clustering, as can stability of the min-sum objective if the target clusters are sufficiently large.