Publications
Agnostic active learning
TLDR: We state and analyze the first active learning algorithm that finds an ε-optimal hypothesis in any hypothesis class when the underlying distribution has arbitrary forms of noise, for several settings previously considered only in the realizable case.
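One way to build intuition for agnostic active learning is the disagreement-based idea: labels are requested only for points on which the surviving candidate hypotheses disagree, and hypotheses that are statistically dominated on the queried points are pruned. The toy sketch below, over an assumed finite class of 1-D thresholds with a crude elimination rule, illustrates only that idea; it is not the paper's algorithm or analysis.

```python
# Toy sketch of disagreement-based active learning with a finite class of
# 1-D threshold classifiers.  Illustration only: query labels where the
# surviving hypotheses disagree, prune hypotheses that look clearly worse.
# NOT the paper's exact procedure or guarantees.
import random

random.seed(0)
thresholds = [i / 20 for i in range(21)]          # hypothesis class: predict 1 iff x >= t
def predict(t, x): return 1 if x >= t else 0

def draw():                                       # noisy source: true threshold 0.55, 10% label flips
    x = random.random()
    y = predict(0.55, x)
    if random.random() < 0.1:
        y = 1 - y
    return x, y

active = set(thresholds)                          # surviving hypotheses
errors = {t: 0 for t in thresholds}               # errors counted only on queried points
queried = 0

for _ in range(2000):
    x, y = draw()
    preds = {predict(t, x) for t in active}
    if len(preds) == 1:                           # all survivors agree: no label needed
        continue
    queried += 1                                  # disagreement: pay for a label
    for t in active:
        errors[t] += int(predict(t, x) != y)
    best = min(errors[t] for t in active)
    slack = 3 + queried ** 0.5                    # crude stand-in for confidence-bound elimination
    active = {t for t in active if errors[t] <= best + slack}

print(f"labels queried: {queried}, surviving thresholds: {sorted(active)}")
```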
Approximate clustering without the approximation
TLDR: The design of approximation algorithms for clustering points in metric spaces is a flourishing area of research, with much effort devoted to better understanding the approximation guarantees achievable for objective functions such as k-median, k-means, and min-sum clustering.
Scalable Kernel Methods via Doubly Stochastic Gradients
TLDR: The general perception is that kernel methods are not scalable, so neural nets have become the method of choice for large-scale nonlinear learning problems.
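The "doubly stochastic" name refers to combining two sources of randomness: sampling data points and sampling random kernel features (e.g. random Fourier features) on the fly. The sketch below illustrates that idea for an assumed RBF-kernel regression task; the step sizes, regularization, and toy data stream are illustrative choices, not the paper's exact procedure.

```python
# Sketch of the doubly stochastic gradient idea for kernel regression with an
# RBF kernel: each step samples BOTH a data point and a fresh random Fourier
# feature, and the learned function is a growing list of feature coefficients.
# Illustrative parameters and toy data; not the paper's algorithm verbatim.
import math, random

random.seed(0)
sigma = 0.5                        # RBF kernel bandwidth (assumed)
lam = 1e-4                         # regularization strength (assumed)
ws, bs, alphas = [], [], []        # random features sampled so far

def feature(w, b, x):              # random Fourier feature for exp(-(x-x')^2 / (2*sigma^2))
    return math.sqrt(2.0) * math.cos(w * x + b)

def predict(x):
    return sum(a * feature(w, b, x) for a, w, b in zip(alphas, ws, bs))

def draw():                        # toy stream: y = sin(2*pi*x) + noise
    x = random.random()
    return x, math.sin(2 * math.pi * x) + 0.1 * random.gauss(0, 1)

for t in range(1, 3001):
    x, y = draw()
    w = random.gauss(0, 1) / sigma            # sample a fresh random feature direction
    b = random.uniform(0, 2 * math.pi)
    eta = 1.0 / math.sqrt(t)                  # decaying step size (assumed schedule)
    err = predict(x) - y                      # gradient of 1/2 * squared loss
    alphas = [(1 - eta * lam) * a for a in alphas]   # shrink old coefficients (regularization)
    ws.append(w); bs.append(b); alphas.append(-eta * err * feature(w, b, x))

print("prediction at x=0.25:", round(predict(0.25), 3),
      "target:", round(math.sin(math.pi / 2), 3))
```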
Improved Guarantees for Learning via Similarity Functions
TLDR: We provide a new notion of a “good similarity function” that builds upon the previous definition of Balcan and Blum (2006) but improves on it in two important ways.
Margin Based Active Learning
TLDR: We present a framework for margin-based active learning of linear separators, both in the realizable case and in a noisy setting related to the Tsybakov small-noise condition.
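The margin-based template can be pictured as follows: in each round, labels are requested only for points falling inside a band around the current separator, and the band shrinks from round to round. The sketch below is a toy realization of that template; the band schedule, sample sizes, and perceptron-style update are illustrative assumptions, not the paper's parameters or analysis.

```python
# Toy sketch of margin-based active learning for homogeneous linear separators:
# query labels only inside a shrinking band around the current hypothesis.
# Band schedule, sample sizes, and the perceptron update are illustrative only.
import math, random

random.seed(0)
TRUE_W = (1.0, 0.0)                                # unknown target separator

def label(x):                                      # label oracle the learner queries
    return 1 if TRUE_W[0] * x[0] + TRUE_W[1] * x[1] >= 0 else -1

def draw():                                        # unlabeled points on the unit circle
    a = random.uniform(0, 2 * math.pi)
    return (math.cos(a), math.sin(a))

def normalize(w):
    n = math.hypot(w[0], w[1])
    return (w[0] / n, w[1] / n)

w = normalize((random.gauss(0, 1), random.gauss(0, 1)))   # initial guess
band, queries = 1.0, 0

for k in range(8):                                 # rounds with a shrinking band
    labelled = []
    while len(labelled) < 100:
        x = draw()
        if abs(w[0] * x[0] + w[1] * x[1]) <= band: # only points near the current boundary
            labelled.append((x, label(x)))
            queries += 1
    for x, y in labelled * 5:                      # crude ERM stand-in: perceptron passes
        if y * (w[0] * x[0] + w[1] * x[1]) <= 0:
            w = normalize((w[0] + 0.2 * y * x[0], w[1] + 0.2 * y * x[1]))
    band /= 2                                      # shrink the sampling region

angle_err = math.acos(max(-1.0, min(1.0, w[0] * TRUE_W[0] + w[1] * TRUE_W[1])))
print(f"labels queried: {queries}, angle to target: {angle_err:.3f} rad")
```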
The Power of Localization for Efficiently Learning Linear Separators with Noise
TLDR: We introduce a new approach for designing computationally efficient learning algorithms that are tolerant to noise, and we demonstrate its effectiveness by designing algorithms with improved noise-tolerance guarantees for learning linear separators in the presence of malicious noise or adversarial label noise.
On a theory of learning with similarity functions
TLDR: We develop an alternative, more general theory of learning with similarity functions (i.e., sufficient conditions for a similarity function to allow one to learn well) that does not require reference to implicit spaces and does not require the function to be positive semi-definite (or even symmetric).
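A construction commonly used to make this kind of theory concrete maps each example to its similarities with a few randomly drawn "landmark" examples and then learns an ordinary linear separator over that explicit representation. The sketch below illustrates that idea with an assumed toy task, similarity function, and perceptron learner; it is not the paper's definitions or guarantees.

```python
# Sketch of the landmark construction for similarity-based learning: map each
# example x to the vector of its similarities to a few random landmarks, then
# learn a linear separator in that space.  Data, similarity function, and the
# perceptron learner are illustrative assumptions, not the paper's.
import math, random

random.seed(0)

def similarity(a, b):                       # an arbitrary similarity (not required to be PSD)
    return math.exp(-abs(a - b))

def sample():                               # toy 1-D task: label +1 iff x in [0.3, 0.7]
    x = random.random()
    return x, 1 if 0.3 <= x <= 0.7 else -1

landmarks = [random.random() for _ in range(20)]

def phi(x):                                 # explicit feature map via similarities to landmarks
    return [similarity(x, l) for l in landmarks] + [1.0]   # trailing 1.0 = bias term

train = [sample() for _ in range(400)]
w = [0.0] * (len(landmarks) + 1)
for _ in range(20):                         # perceptron passes in the landmark space
    for x, y in train:
        f = phi(x)
        if y * sum(wi * fi for wi, fi in zip(w, f)) <= 0:
            w = [wi + y * fi for wi, fi in zip(w, f)]

test = [sample() for _ in range(400)]
acc = sum(1 for x, y in test
          if y * sum(wi * fi for wi, fi in zip(w, phi(x))) > 0) / len(test)
print(f"test accuracy in landmark space: {acc:.2f}")
```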
Clustering under Perturbation Resilience
TLDR: We present an algorithm that can optimally cluster instances resilient to $(1 + \sqrt{2})$-factor perturbations, solving an open problem of Awasthi et al.
Approximation algorithms and online mechanisms for item pricing
TLDR: We present approximation and online algorithms for a number of problems of pricing items for sale so as to maximize the seller's revenue in an unlimited-supply setting.
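As context for the unlimited-supply setting, a common benchmark is the best single posted price, which can be computed by trying each distinct buyer valuation. The sketch below shows that baseline for assumed unit-demand buyers of a single item; it is only a benchmark illustration, not one of the paper's approximation or online mechanisms.

```python
# Sketch of the single-price benchmark for unlimited-supply pricing: each buyer
# purchases iff the posted price is at most their valuation, so the best single
# price can be found by checking each distinct valuation.  Buyers are assumed
# to want one copy of one item; this is NOT one of the paper's mechanisms.

def best_single_price(valuations):
    """Return (price, revenue) maximizing price * #{buyers with value >= price}."""
    best = (0.0, 0.0)
    for p in sorted(set(valuations)):
        revenue = p * sum(1 for v in valuations if v >= p)
        if revenue > best[1]:
            best = (p, revenue)
    return best

valuations = [1, 1, 2, 3, 5, 8, 20]                 # hypothetical buyer valuations
price, revenue = best_single_price(valuations)
print(f"best single price: {price}, revenue: {revenue}")
# For comparison, perfect price discrimination would extract sum(valuations) = 40.
```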
Clustering with Interactive Feedback
TLDR: We introduce a query-based model in which users can provide feedback to a clustering algorithm in a natural way via split and merge requests.