Efficient Active Learning with Boosting

@inproceedings{Wang2009EfficientAL,
  title={Efficient Active Learning with Boosting},
  author={Zheng Wang and Y. Song and Changshui Zhang},
  booktitle={SDM},
  year={2009}
}
This paper presents an active learning strategy for boosting. We construct a novel objective function that unifies semi-supervised learning and active learning for boosting. The objective is minimized by alternating optimization, iterating between updates to the classifier ensemble and to the queried data set. Previous boosting-based semi-supervised or active learning methods can be viewed as special cases of this framework. More important, we derive an…
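
The abstract is truncated above, but the alternating scheme it describes is easy to picture. Below is a minimal sketch, assuming scikit-learn's AdaBoostClassifier as the boosting learner, a hypothetical labeling `oracle`, and a generic least-margin query rule; these are illustrative stand-ins, not the paper's actual unified objective.

```python
# Minimal sketch of the alternating scheme described in the abstract:
# alternate between refitting a boosted ensemble and growing the queried set.
# AdaBoostClassifier and the margin-based query rule are stand-ins, NOT the
# paper's actual objective function.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def active_boosting(X_lab, y_lab, X_pool, oracle, rounds=10, batch=5):
    # oracle(indices) -> labels for the chosen pool points (hypothetical).
    X_lab, y_lab = X_lab.copy(), y_lab.copy()
    pool_idx = np.arange(len(X_pool))
    clf = AdaBoostClassifier(n_estimators=50)
    for _ in range(rounds):
        # Step 1: optimize the ensemble on the current labeled set.
        clf.fit(X_lab, y_lab)
        if len(pool_idx) == 0:
            break
        # Step 2: optimize the queried set -- pick the pool points with the
        # smallest prediction margin (binary case; highest uncertainty).
        proba = clf.predict_proba(X_pool[pool_idx])
        margin = np.abs(proba[:, 1] - proba[:, 0])
        picked = pool_idx[np.argsort(margin)[:batch]]
        X_lab = np.vstack([X_lab, X_pool[picked]])
        y_lab = np.concatenate([y_lab, oracle(picked)])
        pool_idx = np.setdiff1d(pool_idx, picked)
    return clf
```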
Citations

A survey on instance selection for active learning
This survey intends to provide a high-level summarization of active learning and motivates interested readers to consider instance-selection approaches for designing effective active learning solutions.
Design and Analysis of the Nomao Challenge: Active Learning in the Real-World
Active Learning is an active area of research in the Machine Learning and Data Mining communities. In parallel, needs for efficient active learning methods are raised in real-world applications. As…
Active Learning in the Real-World: Design and Analysis of the Nomao Challenge
This paper presents an active learning challenge applied to a real-world application named Nomao, a search engine of places that aggregates information coming from multiple sources on the web to propose complete information related to a place.
An incremental online semi-supervised active learning algorithm based on self-organizing incremental neural network
An incremental online semi-supervised active learning algorithm based on a self-organizing incremental neural network (SOINN) is proposed; it can learn from both labeled and unlabeled samples and realizes online incremental learning.
Online Active Learning: Label Complexity vs. Classification Errors
The proposed algorithm is shown to outperform extensions of representative offline algorithms developed under the PAC setting as well as online algorithms specialized for learning homogeneous linear separators.
Learning better while sending less: Communication-efficient online semi-supervised learning in client-server settings
Experimental results on real-world data sets show that this particular combination of techniques outperforms other approaches, and in particular often outperforms (communication-expensive) approaches that send all the data to the server.
Search Improves Label for Active Learning
We investigate active learning with access to two distinct oracles: Label (which is standard) and Search (which is not). The Search oracle models the situation where a human searches a database to…
Sample reusability in importance-weighted active learning
It is argued that universal reusability is impossible: because every active learning strategy must undersample some areas of the sample space, classifiers that depend on the samples in those areas will learn more from a random sample selection.
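
The importance-weighting idea this reusability argument concerns can be shown in a toy sketch: query each pool point with some probability p and, if queried, weight it by 1/p so the labeled sample stays unbiased. The uncertainty-based choice of p below is a placeholder, not a specific rule from the paper.

```python
# Toy sketch of importance-weighted active learning (IWAL-style):
# query with probability p, weight queried points by 1/p to keep the
# weighted labeled sample an unbiased estimate of the distribution.
import numpy as np

rng = np.random.default_rng(0)

def iwal_weights(uncertainty, p_min=0.1):
    """uncertainty: array in [0, 1]; more uncertain -> more likely queried."""
    p = np.clip(uncertainty, p_min, 1.0)        # query probabilities
    queried = rng.random(len(p)) < p            # query coin flips
    weights = np.where(queried, 1.0 / p, 0.0)   # importance weights
    return queried, weights
```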
From Adversarial Learning to Reliable and Scalable Learning
This dissertation provides insight into the threats by which adversaries can mislead the decisions of learning algorithms, and develops robust learning algorithms for security-sensitive applications.
Towards Real-Time Polyp Detection in Colonoscopy Videos: Adapting Still Frame-Based Methodologies for Video Sequences Analysis
A strategy is proposed to adapt real-time polyp detection methods to video analysis by adding a spatio-temporal stability module and studying a combination of features to capture polyp appearance variability.

References

Showing 1-10 of 27 references.
Online Choice of Active Learning Algorithms
Taking an ensemble containing two of the best-known active learning algorithms and a new algorithm, the resulting active learning master algorithm is empirically shown to consistently perform almost as well as, and sometimes outperform, the best algorithm in the ensemble on a range of classification problems.
Discriminative Batch Mode Active Learning
A discriminative batch mode active learning approach that formulates the instance selection task as a continuous optimization problem over auxiliary instance selection variables, maximizing the discriminative classification performance of the target classifier while also taking the unlabeled data into account.
Efficient Multiclass Boosting Classification with Active Learning
The GAMBLE algorithm is formally derived with the quasi-Newton method, and the structural equivalence of the two regression trees in each boosting step is proved, making it highly competitive with state-of-the-art multiclass classification algorithms.
Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions
Active and semi-supervised learning are important techniques when labeled data are scarce. We combine the two under a Gaussian random field model. Labeled and unlabeled data are represented as…
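
The harmonic-function solution at the heart of this reference has a compact closed form; a minimal numpy sketch, assuming binary labels in {0, 1} and a precomputed similarity matrix W with the labeled points listed first (the active learning part, which queries by expected risk, is omitted):

```python
# Sketch of the harmonic-function solution for label propagation on a graph:
# with similarity matrix W ordered so the first l points are labeled, the
# unlabeled scores solve  f_u = (D_uu - W_uu)^{-1} W_ul f_l.
import numpy as np

def harmonic_solution(W, f_l):
    l = len(f_l)                      # number of labeled points (listed first)
    D = np.diag(W.sum(axis=1))        # degree matrix
    L = D - W                         # graph Laplacian
    L_uu = L[l:, l:]                  # unlabeled-unlabeled block
    W_ul = W[l:, :l]                  # unlabeled-labeled block
    f_u = np.linalg.solve(L_uu, W_ul @ f_l)
    return f_u                        # soft labels for the unlabeled points
```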
Support Vector Machine Active Learning with Applications to Text Classification
Experimental results showing that employing the active learning method can significantly reduce the need for labeled training instances in both the standard inductive and transductive settings are presented.
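
A minimal sketch of the "simple margin" query heuristic this line of work popularized, assuming a fitted binary scikit-learn SVC: query the unlabeled point closest to the current decision hyperplane.

```python
# Sketch of margin-based SVM active learning: the pool point with the
# smallest absolute decision value approximately halves the version space.
import numpy as np
from sklearn.svm import SVC

def simple_margin_query(clf: SVC, X_pool):
    dist = np.abs(clf.decision_function(X_pool))  # distance to hyperplane
    return int(np.argmin(dist))                   # index of the point to label
```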
Employing EM and Pool-Based Active Learning for Text Classification
This paper shows how a text classifier’s need for labeled training documents can be reduced by taking advantage of a large pool of unlabeled documents. We modify the Query-by-Committee (QBC) method…
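
A minimal sketch of pool-based QBC with vote-entropy disagreement; the bootstrap committee below is a generic stand-in, since this reference builds its committee via EM over labeled and unlabeled documents.

```python
# Sketch of pool-based Query-by-Committee: train a small committee and query
# the pool point on which its votes disagree most (vote entropy).
import numpy as np
from sklearn.base import clone

def qbc_query(base_clf, X_lab, y_lab, X_pool, n_members=5, seed=0):
    rng = np.random.default_rng(seed)
    votes = []
    for _ in range(n_members):
        idx = rng.integers(0, len(X_lab), len(X_lab))   # bootstrap resample
        votes.append(clone(base_clf).fit(X_lab[idx], y_lab[idx]).predict(X_pool))
    votes = np.array(votes)                              # (members, pool_size)
    labels = np.unique(y_lab)
    counts = np.stack([(votes == c).mean(axis=0) for c in labels])
    entropy = -(counts * np.log(counts + 1e-12)).sum(axis=0)  # vote entropy
    return int(np.argmax(entropy))                       # most-disputed point
```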
A General Agnostic Active Learning Algorithm
This work presents an agnostic active learning algorithm for any hypothesis class of bounded VC dimension under arbitrary data distributions, using reductions to supervised learning that harness generalization bounds in a simple but subtle manner; it provides a fall-back guarantee that bounds the algorithm's label complexity by the agnostic PAC sample complexity.
The true sample complexity of active learning
It is proved that it is always possible to learn an ε-good classifier with a number of label requests asymptotically smaller than 1/ε; this contrasts with the traditional analysis of active learning problems such as non-homogeneous linear separators or depth-limited decision trees, in which Ω(1/ε) lower bounds are common.
Semi-supervised MarginBoost
Boosting is generalized to the semi-supervised task within the optimization framework of MarginBoost: the margin definition is extended to unlabeled data, and a gradient descent algorithm corresponding to the resulting margin cost function is developed.
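
The key move, extending the margin to unlabeled points, can be written compactly; the sketch below uses |f(x)| as the unlabeled margin, with the exact cost function c and the weighting λ being assumptions that may differ from the paper's.

```latex
% Sketch: margin cost over labeled set L and unlabeled set U. Labeled points
% use the usual margin y_i f(x_i); unlabeled points use |f(x_j)|, the
% ensemble's confidence in its own prediction. c is a decreasing margin cost,
% e.g. c(m) = e^{-m}; the weighting \lambda is an illustrative assumption.
\[
  C(f) = \sum_{i \in L} c\bigl(y_i f(x_i)\bigr)
       + \lambda \sum_{j \in U} c\bigl(\lvert f(x_j) \rvert\bigr)
\]
```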
A decision-theoretic generalization of on-line learning and an application to boosting
The model studied can be interpreted as a broad, abstract extension of the well-studied on-line prediction model to a general decision-theoretic setting, and the multiplicative weight-update rule of Littlestone and Warmuth can be adapted to this model, yielding bounds that are slightly weaker in some cases but applicable to a considerably more general class of learning problems.
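
A minimal sketch of the multiplicative weight-update (Hedge) rule this paper analyzes, following the common textbook presentation with losses in [0, 1] and a discount factor beta in (0, 1):

```python
# Sketch of the Hedge rule: scale each expert's weight by beta**loss each
# round; the returned rows are the distributions over experts per round.
import numpy as np

def hedge(losses, beta=0.9):
    """losses: (T, N) array of per-round expert losses in [0, 1]."""
    T, N = losses.shape
    w = np.ones(N) / N
    picks = []
    for t in range(T):
        picks.append(w / w.sum())    # distribution over experts this round
        w = w * beta ** losses[t]    # multiplicative weight update
    return np.array(picks)
```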