
# Learning-Theoretic Foundations of Algorithm Configuration for Combinatorial Partitioning Problems

@inproceedings{Balcan2016LearningTheoreticFO,
title={Learning-Theoretic Foundations of Algorithm Configuration for Combinatorial Partitioning Problems},
author={Maria-Florina Balcan and Vaishnavh Nagarajan and Ellen Vitercik and Colin White},
booktitle={Annual Conference on Computational Learning Theory},
year={2016}
}
• Published in Annual Conference on Computational Learning Theory, 14 November 2016
Max-cut, clustering, and many other partitioning problems that are of significant importance to machine learning and other scientific fields are NP-hard, a reality that has motivated researchers to develop a wealth of approximation algorithms and heuristics. Our algorithms learn over common integer quadratic programming and clustering algorithm families: SDP rounding algorithms and agglomerative clustering algorithms with dynamic programming. For our sample complexity analysis, we provide tight…
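One of the configurable families named in the abstract, agglomerative clustering, can be parameterized by its merge criterion. As a hedged sketch (not necessarily the paper's exact family: the `alpha` interpolation between single and complete linkage below is an assumption made for illustration), a single real parameter then selects an algorithm from a continuum:

```python
import itertools
import math

def alpha_linkage(points, k, alpha):
    """Agglomerative clustering whose merge score interpolates between
    single linkage (alpha=0) and complete linkage (alpha=1)."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i, j in itertools.combinations(range(len(clusters)), 2):
            pair = [math.dist(p, q) for p in clusters[i] for q in clusters[j]]
            # Interpolated linkage: alpha * complete + (1 - alpha) * single.
            score = alpha * max(pair) + (1 - alpha) * min(pair)
            if best is None or score < best[0]:
                best = (score, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i stays valid
    return clusters

# Two well-separated groups; any alpha recovers them.
pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
out = alpha_linkage(pts, k=2, alpha=0.5)
```

Learning a good `alpha` from sample instances, rather than fixing single or complete linkage a priori, is the kind of data-driven configuration the paper analyzes.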

## Citations

• M. Balcan. Beyond the Worst-Case Analysis of Algorithms, 2020. This chapter surveys recent work that helps put data-driven combinatorial algorithm design on firm foundations and provides strong computational and statistical performance guarantees, both for the batch and online scenarios, where a collection of typical problem instances from the given application is presented either all at once or in an online fashion.
• NeurIPS, 2018. An infinite family of algorithms generalizing Lloyd's algorithm is defined, which includes the celebrated k-means++ algorithm as well as the classic farthest-first traversal algorithm.
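The k-means++ seeding mentioned in this family picks each successive center by D² sampling: a point's chance of becoming the next center is proportional to its squared distance from the nearest center chosen so far. A minimal sketch (the data and parameter choices are illustrative, not from any cited paper):

```python
import math
import random

def kmeanspp_seed(points, k, rng=None):
    """Pick k initial centers by D^2 sampling (the k-means++ seeding rule)."""
    rng = rng or random.Random(0)
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from each point to its nearest chosen center.
        d2 = [min(math.dist(p, c) ** 2 for c in centers) for p in points]
        r = rng.uniform(0.0, sum(d2))
        acc = 0.0
        for p, w in zip(points, d2):
            acc += w
            if acc >= r:
                centers.append(p)
                break
    return centers

pts = [(0.0, 0.0), (0.2, 0.1), (10.0, 10.0), (10.1, 9.9)]
centers = kmeanspp_seed(pts, k=2)
```

Because far-away points carry quadratically more weight, the seeds tend to spread across well-separated groups, which is what gives k-means++ its approximation guarantee.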
• ICML, 2018. It is shown how to use machine learning to determine an optimal weighting of any set of partitioning procedures for the instance distribution at hand, using samples from the distribution, and it is proved that this reduction can even be exponential.
• ArXiv, 2022. This work provides algorithms for efficient (output-polynomial) multidimensional parameter tuning, i.e., for families with a small constant number of parameters, for three very different combinatorial problems: linkage-based clustering, dynamic programming for sequence alignment, and auction design for two-part tariff schemes.
• A. Blum. Commun. ACM, 2020. This work identifies a new notion called dispersion that enables positive results in principled data-driven algorithm design, including hyperparameter tuning for many popular clustering methods.
• 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS). This work provides upper and lower bounds on regret for algorithm selection in online settings, and presents general techniques for optimizing the sum or average of piecewise Lipschitz functions when the underlying functions satisfy a sufficient and general condition called dispersion.
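In the online setting described here, the learner repeatedly picks a parameter and then observes a piecewise Lipschitz utility function. A common baseline for this full-information problem (and only a hedged sketch, not the cited paper's algorithm) is exponential weights run over a fixed discretization of the parameter interval; the grid resolution and learning rate below are illustrative assumptions:

```python
import math
import random

def exp_weights_select(reward_fns, grid, eta=1.0, rng=None):
    """Full-information exponential weights over a discretized parameter grid.
    Each round: sample a grid point with probability proportional to
    exp(eta * cumulative reward), then observe the whole reward function."""
    rng = rng or random.Random(0)
    cum = [0.0] * len(grid)
    picks = []
    for f in reward_fns:
        m = max(cum)  # subtract max before exponentiating, for stability
        w = [math.exp(eta * (c - m)) for c in cum]
        r = rng.uniform(0.0, sum(w))
        acc, idx = 0.0, 0
        for i, wi in enumerate(w):
            acc += wi
            if acc >= r:
                idx = i
                break
        picks.append(grid[idx])
        cum = [c + f(x) for c, x in zip(cum, grid)]  # full-information update
    return picks

# Every round's reward peaks at parameter 0.75, so weight concentrates there.
grid = [i / 10 for i in range(11)]
rounds = [lambda x: 1.0 - abs(x - 0.75) for _ in range(50)]
picks = exp_weights_select(rounds, grid, eta=2.0)
```

Dispersion is what makes a fixed grid safe: when discontinuities of the utility functions do not concentrate, some grid point is guaranteed to be near-optimal.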
• ICLR, 2020. This work designs efficient learning algorithms which receive samples from an application-specific distribution over clustering instances and simultaneously learn both a near-optimal distance and clustering algorithm from these classes, and carries out a comprehensive empirical evaluation of these techniques.
• ArXiv, 2021. This paper proves sample complexity guarantees for this procedure, which bound how large the training set should be to ensure that, for any configuration, its average performance over the training set is close to its expected future performance.
• NeurIPS, 2019. A new algorithm is introduced that preserves the near-optimality and anytime properties of Structured Procrastination while adding adaptivity, and will perform dramatically faster in settings where many algorithm configurations perform poorly.
• This thesis designs efficient algorithms that output optimal or near-optimal clusterings for the canonical k-center objective under perturbation resilience, and proposes data-dependent dispatching algorithms which cast the problem as clustering with important balance and fault-tolerance conditions.

## References

Showing 1–10 of 45 references.

• 2011. This book shows how to design approximation algorithms: efficient algorithms that find provably near-optimal solutions to discrete optimization problems.
• JACM, 2009. The use of supervised machine learning is proposed to build models that predict an algorithm's runtime given a problem instance, and techniques for interpreting them are described to gain understanding of the characteristics that cause instances to be hard or easy.
• JACM, 1995. This algorithm gives the first substantial progress in approximating MAX CUT in nearly twenty years, and represents the first use of semidefinite programming in the design of approximation algorithms.
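The semidefinite-programming approach referenced here (Goemans–Williamson) solves an SDP relaxation of MAX CUT to get one unit vector per vertex, then rounds by a random hyperplane through the origin. The sketch below assumes the SDP has already been solved and illustrates only the rounding step; the toy embedding is hand-picked for illustration, not solver output:

```python
import random

def hyperplane_round(vectors, rng=None):
    """Assign each vertex a side of the cut by the sign of <r, v_i> for a
    random Gaussian vector r (the Goemans-Williamson rounding step)."""
    rng = rng or random.Random(0)
    dim = len(vectors[0])
    r = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    return [1 if sum(ri * vi for ri, vi in zip(r, v)) >= 0 else -1
            for v in vectors]

def cut_value(edges, sides):
    """Count edges whose endpoints land on opposite sides of the cut."""
    return sum(1 for u, v in edges if sides[u] != sides[v])

# 4-cycle: a perfect embedding puts adjacent vertices at antipodal unit
# vectors, so (almost) any rounding hyperplane cuts all four edges.
vecs = [(1.0, 0.0), (-1.0, 0.0), (1.0, 0.0), (-1.0, 0.0)]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
sides = hyperplane_round(vecs)
```

Families of SDP rounding schemes that vary this rounding function are exactly the kind of parameterized algorithm class whose configuration the surveyed paper studies.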
• SIAM J. Comput., 2016. This paper presents an algorithm that can optimally cluster instances resilient to $(1 + \sqrt{2})$-factor perturbations, solving an open problem of Awasthi et al.
• CP, 1999. A combination of meta-heuristics that yields new best-known results on the Solomon benchmarks is demonstrated, and a method to automatically adjust this combination to handle problems with different sizes, complexity, and optimization objectives is provided.
• Algorithmica, 2017. It is proved that the complete-linkage method computes an O(1)-approximation for this problem for any metric that is induced by a norm, assuming that the dimension d is a constant.
• ArXiv, 2008. A proof of the k-median result which avoids the "coupling" argument and can be used in other settings where the Arya et al. arguments have been used.
• ICALP, 2016. This work provides strong positive results both for the asymmetric and symmetric k-center problems under a natural input stability (promise) condition called α-perturbation resilience, and provides algorithms that give strong guarantees simultaneously for stable and non-stable instances.
• STOC '97, 1997. This work considers the problem of clustering dynamic point sets in a metric space and proposes a model called incremental clustering, which is based on a careful analysis of the requirements of the information retrieval application and should also be useful in other applications.