Fully-Dynamic Coresets

@inproceedings{Henzinger2020FullyDynamicC,
  title={Fully-Dynamic Coresets},
  author={Monika Henzinger and Sagar Kale},
  booktitle={ESA},
  year={2020}
}
With input sizes becoming massive, coresets -- small yet representative summaries of the input -- are more relevant than ever. A weighted set $C_w$ that is a subset of the input is an $\varepsilon$-coreset if the cost of any feasible solution $S$ with respect to $C_w$ is within a factor of $1 \pm \varepsilon$ of the cost of $S$ with respect to the original input. We give a very general technique to compute coresets in the fully-dynamic setting where input points can be added or deleted. Given a static…
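The $\varepsilon$-coreset condition in the abstract can be illustrated with a minimal sketch. It assumes a k-median cost (sum of weighted distances to the nearest center), which is only one of the objectives such a definition can cover; the function names `kmedian_cost` and `within_coreset_bound` are hypothetical and not taken from the paper. The check covers a single candidate solution $S$, whereas the definition requires the bound to hold for every feasible solution.

```python
import math

def kmedian_cost(points, weights, centers):
    """Weighted k-median cost: sum of weight * distance to the nearest center.

    Assumption: k-median is used here purely as an example objective.
    """
    return sum(
        w * min(math.dist(p, c) for c in centers)
        for p, w in zip(points, weights)
    )

def within_coreset_bound(points, coreset, coreset_weights, centers, eps):
    """Check the (1 +/- eps) cost condition for ONE candidate solution.

    A true eps-coreset must satisfy this for every feasible solution;
    this helper (a hypothetical name) only tests the given centers.
    """
    full = kmedian_cost(points, [1.0] * len(points), centers)
    approx = kmedian_cost(coreset, coreset_weights, centers)
    return (1 - eps) * full <= approx <= (1 + eps) * full

# Toy input: two duplicated points, summarized by one copy each with weight 2.
points = [(0.0, 0.0), (0.0, 0.0), (4.0, 0.0), (4.0, 0.0)]
coreset = [(0.0, 0.0), (4.0, 0.0)]
centers = [(1.0, 0.0)]  # the candidate solution S

ok = within_coreset_bound(points, coreset, [2.0, 2.0], centers, eps=0.1)
bad = within_coreset_bound(points, coreset, [1.0, 1.0], centers, eps=0.1)
```

With weights of 2 the summary reproduces the full cost exactly, so the bound holds; with weights of 1 the summary reports half the cost and the bound fails.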
2 Citations
Coresets for Clustering with Missing Values
TLDR
The first coreset for clustering points in $\mathbb{R}^d$ that have multiple missing values (coordinates) is provided, which exhibits a flexible tradeoff between coreset size and accuracy, and generally outperforms the uniform-sampling baseline.
Robust Coreset for Continuous-and-Bounded Learning (with Outliers)
TLDR
This paper proposes a novel robust coreset method for the continuous-and-bounded learning problem (with outliers), which includes a broad range of popular optimization objectives in machine learning, such as logistic regression and k-means clustering, and which can be efficiently maintained in a fully-dynamic environment.

References

New Frameworks for Offline and Streaming Coreset Constructions
TLDR
This work introduces a new technique for converting an offline coreset construction to the streaming setting, and provides the first generalizations of such coresets for handling outliers.
Clustering High Dimensional Dynamic Data Streams
TLDR
Data streaming algorithms for the k-median problem in high-dimensional dynamic geometric data streams that guarantee only positive weights in the coreset with additional logarithmic factors in the space and time complexities are presented.
On Coresets for k-Median and k-Means Clustering in Metric and Euclidean Spaces and Their Applications
  • K. Chen
  • Mathematics, Computer Science
  • SIAM J. Comput.
  • 2009
TLDR
These are the first streaming algorithms, for those problems, that have space complexity with polynomial dependency on the dimension, using $O(d^2k^2\varepsilon^{-2}\log^8n)$ space.
Approximating k-Median via Pseudo-Approximation
TLDR
A novel approximation algorithm for $k$-median is presented that achieves an approximation guarantee of $1+\sqrt{3}+\epsilon$, improving upon the decade-old ratio of $3+\epsilon$ by exploiting the power of pseudo-approximation.
Coresets in dynamic geometric data streams
TLDR
This work develops streaming (1 + ε)-approximation algorithms for k-median, k-means, MaxCut, maximum weighted matching (MaxWM), maximum travelling salesperson, maximum spanning tree, and average distance over dynamic geometric data streams.
On coresets for k-means and k-median clustering
TLDR
This paper shows the existence of small coresets for the problems of computing k-median/means clustering for points in low dimension, and improves the fastest known algorithms for (1+ε)-approximate k-means and k-median.
Improved Combinatorial Algorithms for Facility Location Problems
TLDR
Improved combinatorial approximation algorithms are presented for the uncapacitated facility location problem, and a variant of the capacitated facility location problem is considered, for which improved approximation algorithms are also presented.
Fully Dynamic Consistent Facility Location
TLDR
The cost of the solution maintained by the algorithm at any time $t$ is very close to the cost of a solution obtained by quickly recomputing a solution from scratch at time $t$, while the algorithm has a much better running time.
An Improved Approximation for k-Median and Positive Correlation in Budgeted Optimization
TLDR
This work improves upon Li-Svensson’s approximation ratio for k-median by developing an algorithm that improves upon various aspects of their work, and develops algorithms that guarantee the known properties of dependent rounding but also have nearly best-possible behavior (near-independence, which generalizes positive correlation) on “small” subsets of the variables.
Unifying and Strengthening Hardness for Dynamic Problems via the Online Matrix-Vector Multiplication Conjecture
TLDR
It is shown that a conjecture that there is no truly subcubic ($O(n^{3-\varepsilon})$ time) algorithm for this problem can be used to exhibit the underlying polynomial time hardness shared by many dynamic problems.