Corpus ID: 13617337

Active and passive learning of linear separators under log-concave distributions

@article{Balcan2013ActiveAP,
  title={Active and passive learning of linear separators under log-concave distributions},
  author={Maria-Florina Balcan and Philip M. Long},
  journal={ArXiv},
  year={2013},
  volume={abs/1211.1082}
}
We provide new results concerning label efficient, polynomial time, passive and active learning of linear separators. We prove that active learning provides an exponential improvement over PAC (passive) learning of homogeneous linear separators under nearly log-concave distributions. Building on this, we provide a computationally efficient PAC algorithm with optimal (up to a constant factor) sample complexity for such problems. This resolves an open question concerning the sample complexity of… 
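The line of work summarized here analyzes margin-based active learning: repeatedly request labels only for points falling in a band around the current hypothesis's decision boundary, retrain, and shrink the band; this is the mechanism behind the exponential label savings under log-concave distributions. The sketch below is a simplified illustration of that loop under a synthetic isotropic Gaussian (one log-concave distribution) with a hypothetical noiseless label oracle; the band schedule, sample sizes, and least-squares refit are placeholder choices, not the paper's exact procedure or guarantees.

```python
# Simplified margin-based active learning sketch (illustrative only, not the
# paper's algorithm): label points only inside a shrinking band around the
# current separator's boundary, then refit.
import numpy as np

rng = np.random.default_rng(0)
d = 10
w_true = rng.standard_normal(d)
w_true /= np.linalg.norm(w_true)

def oracle(X):
    """Hypothetical noiseless label oracle: sign under the true separator."""
    return np.sign(X @ w_true)

def margin_based_active_learner(rounds=8, labels_per_round=200):
    # Warm start: label a small sample drawn from the whole distribution.
    X = rng.standard_normal((labels_per_round, d))
    w = np.linalg.lstsq(X, oracle(X), rcond=None)[0]  # crude surrogate for ERM
    w /= np.linalg.norm(w)
    band = 1.0
    for _ in range(rounds):
        # Draw unlabeled points; keep only those near the current boundary.
        U = rng.standard_normal((50 * labels_per_round, d))
        near = U[np.abs(U @ w) <= band][:labels_per_round]
        if len(near) == 0:
            break
        # Labels are queried only inside the band.
        w = np.linalg.lstsq(near, oracle(near), rcond=None)[0]
        w /= np.linalg.norm(w)
        band /= 2  # shrink the band geometrically
    return w

w_hat = margin_based_active_learner()
print("angle to target:", np.arccos(np.clip(w_hat @ w_true, -1.0, 1.0)))
```

The label savings in the sketch come from the same source the analysis exploits: after the warm start, every labeled example lies in a band whose width halves each round.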
Minimax Analysis of Active Learning
TLDR
This work establishes distribution-free upper and lower bounds on the minimax label complexity of active learning with general hypothesis classes, under various noise models, and proposes new active learning strategies that nearly achieve these minimax label complexities.
The Power of Comparisons for Actively Learning Linear Classifiers
TLDR
While previous negative results showed this model to have intractably large sample complexity for label queries, it is shown that comparison queries make RPU-learning at worst logarithmically more expensive in both the passive and active regimes.
Sample and Computationally Efficient Learning Algorithms under S-Concave Distributions
TLDR
New convex geometry tools are introduced to study the properties of $s$-concave distributions, and these properties are used to provide bounds on quantities of interest to learning, including the probability of disagreement between two halfspaces, disagreement outside a band, and the disagreement coefficient.
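As a concrete illustration of the first quantity mentioned above, the Monte Carlo check below estimates the disagreement probability between two homogeneous halfspaces under an isotropic Gaussian (one log-concave, hence $s$-concave, distribution); for rotationally invariant distributions this probability equals angle/π, and the cited bounds show the same linear-in-angle scaling, up to constants, for broader distribution families. The script is illustrative only and is not taken from the cited paper.

```python
# Monte Carlo check: under an isotropic Gaussian, the probability that two
# homogeneous halfspaces disagree on a random point scales linearly with the
# angle between their normal vectors (exactly angle / pi in this case).
import numpy as np

rng = np.random.default_rng(1)
d, n = 20, 200_000
X = rng.standard_normal((n, d))

w1 = np.zeros(d)
w1[0] = 1.0
for angle in (0.05, 0.1, 0.2, 0.4):
    w2 = np.zeros(d)
    w2[0], w2[1] = np.cos(angle), np.sin(angle)   # normal rotated by `angle`
    disagree = np.mean(np.sign(X @ w1) != np.sign(X @ w2))
    print(f"angle={angle:.2f}  empirical={disagree:.4f}  angle/pi={angle/np.pi:.4f}")
```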
Optimal learning via local entropies and sample compression
TLDR
This paper provides several novel upper bounds on the excess risk, with a primary focus on classification problems, and develops techniques that allow one to replace empirical covering numbers, or covering numbers with bracketing, by coverings with respect to the distribution of the data.
Sample-Optimal PAC Learning of Halfspaces with Malicious Noise
TLDR
A novel incorporation of a matrix Chernoff-type inequality to bound the spectrum of an empirical covariance matrix for well-behaved distributions, in conjunction with a careful exploration of the localization schemes of Awasthi et al. (2017), essentially achieves the near-optimal sample complexity bound of Õ(d).
Active Learning Polynomial Threshold Functions
TLDR
It is proved that access to derivatives is insufficient for active learning of multivariate PTFs, even those of just two variables.
S-Concave Distributions: Towards Broader Distributions for Noise-Tolerant and Sample-Efficient Learning Algorithms
TLDR
New convex geometry tools to study the properties of $s$-concave distributions are introduced, and these properties are used to provide bounds on quantities of interest to learning, including the probability of disagreement between two halfspaces, disagreement outside a band, and the disagreement coefficient.
Noise-Adaptive Margin-Based Active Learning and Lower Bounds under Tsybakov Noise Condition
TLDR
It is shown that the sample complexity cannot be improved even if the underlying data distribution is as simple as the uniform distribution on the unit ball, and lower bounds for margin-based active learning algorithms under the Tsybakov noise condition (TNC) are derived.
Convergence Rates of Active Learning for Maximum Likelihood Estimation
TLDR
This paper provides an upper bound on the label requirement of the algorithm, and a lower bound that matches it up to lower order terms, and shows that unlike binary classification in the realizable case, just a single extra round of interaction is sufficient to achieve near-optimal performance in maximum likelihood estimation.
Beating the Minimax Rate of Active Learning with Prior Knowledge
TLDR
This is the first work that improves the minimax rate of active learning by utilizing certain prior knowledge, and it shows that introducing a convex surrogate loss yields an exponential reduction in the label complexity even when the parameter $\kappa$ of the Tsybakov noise is larger than $1$.
...

References

SHOWING 1-10 OF 71 REFERENCES
A General Agnostic Active Learning Algorithm
TLDR
This work presents an agnostic active learning algorithm for any hypothesis class of bounded VC dimension under arbitrary data distributions, using reductions to supervised learning that harness generalization bounds in a simple but subtle manner, and it provides a fall-back guarantee that bounds the algorithm's label complexity by the agnostic PAC sample complexity.
The true sample complexity of active learning
TLDR
It is proved that it is always possible to learn an ε-good classifier with a number of label requests asymptotically smaller than the passive sample complexity, which contrasts with the traditional analysis of active learning problems such as non-homogeneous linear separators or depth-limited decision trees, in which Ω(1/ε) lower bounds are common.
Active Learning for Smooth Problems
TLDR
It is shown that exponential improvements arise when the underlying learning problem is “smooth,” i.e., the hypothesis class, the instance space and the distribution can all be described by smooth functions.
Lower Bounds for Passive and Active Learning
TLDR
Unified information-theoretic machinery for deriving lower bounds for passive and active learning schemes is developed, and the first known lower bounds based on the capacity function rather than the disagreement coefficient are provided.
Agnostic active learning
TLDR
The first active learning algorithm that works in the presence of arbitrary forms of noise is stated and analyzed, and it is shown that A2 achieves an exponential improvement over the usual sample complexity of supervised learning.
Minimax Bounds for Active Learning
TLDR
The achievable rates of classification error convergence for broad classes of distributions characterized by decision boundary regularity and noise conditions are studied using minimax analysis techniques to indicate the conditions under which one can expect significant gains through active learning.
Sampling and integration of near log-concave functions
TLDR
This work provides the first polynomial-time algorithm to generate samples from a given log-concave distribution; it proves a general isoperimetric inequality for convex sets and uses this together with recent developments in the theory of rapidly mixing Markov chains.
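The sampling idea can be illustrated with a toy Markov chain Monte Carlo sketch: a random-walk Metropolis sampler targeting a one-dimensional log-concave density (a Laplace density here). The cited work analyzes a different, provably rapidly mixing walk over convex bodies, so the code below is only a hedged illustration of the general approach.

```python
# Toy random-walk Metropolis sampler for a log-concave density (Laplace(0, 1)).
# Illustrative only; the cited paper analyzes a different, provably rapidly
# mixing Markov chain.
import numpy as np

rng = np.random.default_rng(2)

def log_density(x):
    # Laplace(0, 1) up to normalization; its log-density is concave.
    return -abs(x)

def metropolis(n_steps=50_000, step=1.0):
    xs = np.empty(n_steps)
    x = 0.0
    for t in range(n_steps):
        prop = x + step * rng.standard_normal()
        # Accept with probability min(1, f(prop) / f(x)).
        if np.log(rng.random()) < log_density(prop) - log_density(x):
            x = prop
        xs[t] = x
    return xs

samples = metropolis()
print("sample mean (should be ~0):", samples.mean())
print("sample variance (should be ~2):", samples.var())
```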
An Inequality for Nearly Log-Concave Distributions With Applications to Learning
We prove that, given a nearly log-concave distribution, in any partition of the space into two well-separated sets, the measure of the points that do not belong to these sets is large. We apply this…
Analysis of Perceptron-Based Active Learning
TLDR
A simple selective sampling algorithm is presented, which combines a modification of the perceptron update with an adaptive filtering rule for deciding which points to query, and reaches generalization error ε after asking for just O(d log(1/ε)) labels.
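A hedged sketch of that kind of procedure appears below: labels are requested only when an example falls close to the current boundary, and mistakes trigger a reflection-style ("modified perceptron") update that preserves the weight norm. The threshold schedule and constants are placeholder simplifications rather than the exact algorithm analyzed in the cited paper.

```python
# Margin-filtered perceptron sketch (simplified, not the cited algorithm):
# query a label only when the point lies near the current boundary, apply a
# reflection-style update on mistakes, and shrink the query region over time.
import numpy as np

rng = np.random.default_rng(3)
d = 10
w_true = rng.standard_normal(d)
w_true /= np.linalg.norm(w_true)

w = rng.standard_normal(d)
w /= np.linalg.norm(w)
threshold, labels_used = 0.5, 0

for t in range(20_000):
    x = rng.standard_normal(d)
    x /= np.linalg.norm(x)                  # points on the unit sphere
    if abs(w @ x) > threshold:
        continue                            # filter: too far from the boundary
    y = np.sign(w_true @ x)                 # query the (noiseless) label oracle
    labels_used += 1
    if np.sign(w @ x) != y:
        w = w - 2 * (w @ x) * x             # reflection update, keeps ||w|| = 1
    else:
        threshold *= 0.99                   # adaptively shrink the query region

angle = np.arccos(np.clip(w @ w_true, -1.0, 1.0))
print(f"labels queried: {labels_used}, angle to target: {angle:.4f}")
```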
Learning noisy linear classifiers via adaptive and selective sampling
TLDR
Efficient margin-based algorithms for selective sampling and filtering in binary classification tasks are presented, and for α→∞ (the hard-margin condition) the gap between the semi- and fully-supervised rates becomes exponential.
...