Improved Adaptive Algorithm for Scalable Active Learning with Weak Labeler

Yifang Chen, Karthik Abinav Sankararaman, Alessandro Lazaric, Matteo Pirotta, Dmytro Karamshuk, Qifan Wang, Karishma Mandyam, Sinong Wang, Han Fang
Active learning with strong and weak labelers considers a practical setting where we have access to both costly but accurate strong labelers and inaccurate but cheap predictions provided by weak labelers. We study this problem in the streaming setting, where decisions must be made online. We design a novel algorithmic template, Weak Labeler Active Cover (WL-AC), that is able to robustly leverage the lower-quality weak labelers to reduce the query complexity while retaining the desired level…
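The abstract above can be illustrated with a toy sketch of the strong/weak labeler setting: points whose score falls near the decision boundary get the costly strong label, while the rest take the cheap weak label. The margin rule, the fixed threshold classifier, and all names below are illustrative assumptions, not the actual WL-AC algorithm.

```python
def stream_active_learning(stream, weak_label, strong_label, margin=0.15):
    """Toy sketch: label a stream of scalar points in [0, 1] using a cheap
    weak labeler away from the decision boundary and a costly strong
    labeler near it. Illustrative only, not WL-AC itself."""
    theta = 0.5                      # fixed toy threshold classifier
    labeled, strong_queries = [], 0
    for x in stream:
        if abs(x - theta) < margin:  # uncertain region: pay for accuracy
            y = strong_label(x)
            strong_queries += 1
        else:                        # confident region: trust the weak labeler
            y = weak_label(x)
        labeled.append((x, y))
    return labeled, strong_queries
```

On a stream of 11 evenly spaced points, only the three points within the margin of the boundary trigger a strong-label query; the query savings grow as the confident region grows.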

Cost-Effective Active Learning from Diverse Labelers

This paper proposes a novel active selection criterion to evaluate the cost-effectiveness of instance-labeler pairs, which ensures that the selected instance is helpful for improving the classification model while the selected labeler can provide an accurate label for the instance at a relatively low cost.

Learning from Weak Teachers

A formal framework for learning scenarios with label sources of varying quality is proposed, along with a parametric model for such label sources (“weak teachers”), reflecting the intuition that their labeling is likely to be correct in label-homogeneous regions but may deteriorate near classification boundaries.

Active Learning from Weak and Strong Labelers

An active learning algorithm is provided that is able to learn a classifier with low error on data labeled by the oracle, while using the weak labeler to reduce the number of label queries made to the oracle.

Active Learning for Crowdsourcing Using Knowledge Transfer

This paper proposes a new probabilistic model that transfers knowledge from abundant unlabeled data in auxiliary domains to help estimate labelers' expertise and presents a novel active learning algorithm that simultaneously selects the most informative example and queries its label from the labeler with the best expertise.

Active Learning from Multiple Knowledge Sources

This work focuses on maximizing the information that an annotator label provides about the true (but unknown) label of the data point and proposes an AL approach for this new scenario motivated by information theoretic principles.

Learning Loss for Active Learning

  • Donggeun Yoo, In-So Kweon
  • 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019
A novel active learning method that is simple but task-agnostic and works efficiently with deep networks, by attaching a small parametric module, named a “loss prediction module,” to a target network and learning it to predict target losses of unlabeled inputs.
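The selection rule the summary above describes can be sketched in a few lines: a small auxiliary module scores each unlabeled input with its predicted loss, and the labeling budget goes to the highest-loss inputs. Here `predict_loss` is a plain callable standing in for the paper's attached loss prediction module; the function name and interface are assumptions for illustration.

```python
def select_by_predicted_loss(unlabeled, predict_loss, budget):
    """Rank unlabeled inputs by predicted loss (highest first) and
    return the top `budget` inputs to send for labeling. The real
    method trains a loss prediction module jointly with the target
    network; here the predictor is just a given callable."""
    ranked = sorted(unlabeled, key=predict_loss, reverse=True)
    return ranked[:budget]
```

With a perfect loss predictor this reduces to picking the hardest examples; the paper's contribution is learning that predictor without task-specific machinery.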

Active Learning Literature Survey

This report provides a general introduction to active learning and a survey of the literature, including a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date.

Theory of Disagreement-Based Active Learning

Recent advances in the understanding of the theoretical benefits of active learning are described, along with their implications for the design of effective active learning algorithms.

A General Agnostic Active Learning Algorithm

This work presents an agnostic active learning algorithm for any hypothesis class of bounded VC dimension under arbitrary data distributions, using reductions to supervised learning that harness generalization bounds in a simple but subtle manner, and provides a fall-back guarantee that bounds the algorithm's label complexity by the agnostic PAC sample complexity.

Active Learning for Convolutional Neural Networks: A Core-Set Approach

This work defines the problem of active learning as core-set selection, i.e., choosing a set of points such that a model learned over the selected subset is competitive for the remaining data points, and presents a theoretical result characterizing the performance of any selected subset using the geometry of the data points.
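The core-set selection described above is typically approximated with greedy k-center: repeatedly pick the point farthest from the current selection, so every point ends up close to some selected point. The pure-Python sketch below works on coordinate tuples; the actual method runs this on the network's feature embeddings, which are omitted here as an assumption.

```python
def greedy_k_center(points, k):
    """Greedy k-center selection, the heuristic behind core-set active
    learning. Returns indices of k selected points, starting from an
    arbitrary seed point (index 0)."""
    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5

    selected = [0]
    # distance from each point to its nearest selected center so far
    min_dist = [dist(points[0], p) for p in points]
    while len(selected) < k:
        far = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(far)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], dist(points[far], p))
    return selected
```

The greedy rule gives a 2-approximation to the optimal k-center cover, which is what underlies the subset-competitiveness guarantee the summary mentions.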