An approach that scales up kernel methods using a novel concept called "doubly stochastic functional gradients", based on the fact that many kernel methods can be expressed as convex optimization problems. This readily scales kernel methods up to regimes that have been dominated by neural nets.
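The idea can be illustrated with a minimal sketch: run stochastic gradient descent in function space, where each iteration uses two sources of randomness, a random data point and a random (Fourier) feature approximating the kernel. All hyperparameters below (`gamma`, `eta`, `lam`, the toy data) are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = sin(3x) + noise (illustrative, not from the paper)
n, d = 200, 1
X = rng.uniform(-1, 1, (n, d))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(n)

T = 500        # iterations
gamma = 1.0    # RBF bandwidth parameter (assumed)
eta = 0.5      # step-size scale (assumed)
lam = 1e-3     # regularization (assumed)

# One random Fourier feature per iteration: the second source of randomness
omegas = rng.normal(0, np.sqrt(2 * gamma), (T, d))
phases = rng.uniform(0, 2 * np.pi, T)
alphas = np.zeros(T)

def predict(x, t):
    # f_t(x) = sum_{s<t} alpha_s * sqrt(2) * cos(omega_s . x + b_s)
    if t == 0:
        return 0.0
    feats = np.sqrt(2) * np.cos(omegas[:t] @ x + phases[:t])
    return alphas[:t] @ feats

for t in range(T):
    i = rng.integers(n)                # first source: a random data point
    err = predict(X[i], t) - y[i]      # functional gradient of the squared loss
    step = eta / (t + 1)
    alphas[:t] *= (1 - step * lam)     # shrink past coefficients (regularizer)
    alphas[t] = -step * err * np.sqrt(2) * np.cos(omegas[t] @ X[i] + phases[t])

# In-sample RMSE after training
preds = np.array([predict(x, T) for x in X])
rmse = np.sqrt(np.mean((preds - y) ** 2))
```

Because only the sampled feature's coefficient is stored per step, memory grows with the iteration count rather than with an explicit kernel matrix, which is what makes the approach scale.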

The effectiveness of the framework for margin-based active learning of linear separators is analyzed, both in the realizable case and in a specific noisy setting related to the Tsybakov small-noise condition.

This paper shows that if any c-approximation to the given clustering objective Φ is ε-close to the target, then this guarantee can be achieved for any constant c > 1; for the min-sum objective, the authors can do this for any constant c > 2.

This work provides the first polynomial-time active learning algorithm for learning linear separators in the presence of malicious noise or adversarial label noise, and achieves a label complexity whose dependence on the error parameter ϵ is polylogarithmic (and thus exponentially better than that of any passive algorithm).

A new notion of a “good similarity function” is provided that builds upon the previous definition of Balcan and Blum (2006) but improves on it in two important ways, and it is proved that for distribution-specific PAC learning, the new notion is strictly more powerful than the traditional notion of a large-margin kernel.

This work develops an alternative, more general theory of learning with similarity functions (i.e., sufficient conditions for a similarity function to allow one to learn well) that does not require reference to implicit spaces, and does not require the function to be positive semi-definite (or even symmetric).

A much weaker "expansion" assumption on the underlying data distribution is proposed, which is proved to be sufficient for iterative co-training to succeed given appropriately strong PAC-learning algorithms on each feature set, and which is shown to be necessary as well, to some extent.

This paper presents an algorithm that can optimally cluster instances resilient to $(1 + \sqrt{2})$-factor perturbations, solving an open problem of Awasthi et al.