• Corpus ID: 5878784

Learning Active Learning from Data

Ksenia Konyushkova, Raphael Sznitman, Pascal V. Fua

In this paper, we suggest a novel data-driven approach to active learning (AL). The key idea is to train a regressor that predicts the expected error reduction for a candidate sample in a particular learning state. By formulating the query selection procedure as a regression problem, we are not restricted to working with existing AL heuristics; instead, we learn strategies based on experience from previous AL outcomes. We show that a strategy can be learnt either from simple synthetic 2D…
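The query selection step described in the abstract can be illustrated with a minimal sketch. It assumes a regressor has already been trained and is exposed as a plain callable; the names `select_query`, `observed_error_reduction`, and `toy_regressor`, as well as the feature layout, are hypothetical and are not taken from the paper.

```python
from typing import Callable, Sequence

Features = Sequence[float]


def observed_error_reduction(error_before: float, error_after: float) -> float:
    """Regression target: drop in test error after labelling one more sample."""
    return error_before - error_after


def select_query(
    predict_reduction: Callable[[Features, Features], float],
    state: Features,
    candidates: Sequence[Features],
) -> int:
    """Return the index of the candidate with the largest predicted error reduction."""
    scores = [predict_reduction(state, c) for c in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)


def toy_regressor(state: Features, candidate: Features) -> float:
    """Hypothetical stand-in for a trained regressor.

    It favours candidates whose first feature (assumed here to be a predicted
    positive-class probability) is close to 0.5, and ignores the state.
    """
    return -abs(candidate[0] - 0.5)


# Assumed learning-state features: an accuracy estimate and the number of labels.
state = [0.8, 10.0]
candidates = [[0.95], [0.52], [0.10]]
best = select_query(toy_regressor, state, candidates)
```

In a real setup the callable would be replaced by a regressor fitted on pairs of (learning state and candidate features, observed error reduction) collected from earlier AL runs.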

ImitAL: Learning Active Learning Strategies from Synthetic Data

This work proposes IMITAL, a novel query strategy that encodes AL as a learning-to-rank problem, and shows that the approach has better runtime performance than most other strategies, especially on very large datasets.

Active Learning: Problem Settings and Recent Developments

The basic problem settings of active learning and recent research trends are explained, and research on learning acquisition functions to select samples from the data for labeling, theoretical work on active learning algorithms, and stopping criteria for sequential data acquisition are highlighted.

ALdataset: a benchmark for pool-based active learning

To conduct easier comparative evaluation among AL methods, a benchmark task for pool-based active learning is presented, which consists of benchmarking datasets and quantitative metrics that summarize overall performance.

Learning to Actively Learn: A Robust Approach

This work proposes a procedure for designing algorithms for specific adaptive data collection tasks like active learning and pure-exploration multi-armed bandits, and performs synthetic experiments to justify the stability and effectiveness of the training procedure, and then evaluates the method on tasks derived from real data.

Learning Active Learning at the Crossroads? Evaluation and Discussion

A benchmark on 20 datasets is presented that compares a strategy learned using a recent meta-learning algorithm with margin sampling, which recent comparative studies report as a very competitive heuristic.

Distribution Aware Active Learning

This work proposes a query criterion for active learning that is aware of the data distribution and is more robust against outliers, and suggests a probabilistic generative model which acts as a teacher in this model.

Towards robust episodic meta-learning

This work proposes to compose episodes to robustify meta-learning in the few-shot setting in order to learn more efficiently and to generalize better to new tasks, and makes use of active learning scoring rules to select the data to be included in the episodes.

Bias-Aware Heapified Policy for Active Learning

This paper proposes a bias-aware policy network called Heapified Active Learning (HAL), which prevents overconfidence and improves the sample efficiency of policy learning through a heapified structure without ignoring global information (an overview of the whole unlabeled set).

Learning about the learning process: from active querying to fine-tuning

This doctoral thesis upgrades the pre- and post-processing steps of the machine learning pipeline with the learning-to-learn paradigm, and develops a new learning-to-learn technique to improve the effectiveness and efficiency of fine-tuning-based transfer learning.

Learning How to Active Learn by Dreaming

Experimental results show that the dream-based AL policy training strategy is more effective than applying the pretrained policy without further fine-tuning and better than the existing strong baseline methods that use heuristics or reinforcement learning.



Active Learning Literature Survey

This report provides a general introduction to active learning and a survey of the literature, including a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date.

Query by Committee Made Real

A new algorithm, KQBC, is introduced, capable of actively learning large-scale problems by using selective sampling, which overcomes the costly sampling step of the well-known Query By Committee (QBC) algorithm by projecting onto a low-dimensional space.

An Analysis of Active Learning Strategies for Sequence Labeling Tasks

This paper surveys previously used query selection strategies for sequence models, and proposes several novel algorithms to address their shortcomings, and conducts a large-scale empirical comparison.

Can Active Learning Experience Be Transferred?

A novel active learning model that linearly aggregates existing strategies is proposed, and the learned experience is not only competitive with existing strategies on most single datasets but can also be transferred across datasets to improve performance on future learning tasks.

Active Learning by Learning

A learning algorithm that connects active learning with the well-known multi-armed bandit problem is designed, and it is postulated that, given an appropriate choice for the multi-armed bandit learner, it is possible to estimate the performance of different strategies on the fly.
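To make the bandit view concrete, the sketch below treats each AL strategy as an arm and picks one per round with the standard UCB1 rule. UCB1 is swapped in here purely for illustration and is not claimed to be the learner used in that paper; the function name `ucb1_pick` and the reward definition (some measured benefit of the strategy's last query) are assumptions.

```python
import math
from typing import Sequence


def ucb1_pick(counts: Sequence[int], rewards: Sequence[float], t: int) -> int:
    """Choose which AL strategy (arm) to use in round t with UCB1.

    counts[a]  -- how many times strategy a has been used so far
    rewards[a] -- cumulative reward collected by strategy a
    t          -- total number of rounds played so far (t >= 1)
    """
    # Try every strategy once before comparing estimates.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm

    def upper_bound(arm: int) -> float:
        mean = rewards[arm] / counts[arm]
        bonus = math.sqrt(2.0 * math.log(t) / counts[arm])
        return mean + bonus

    return max(range(len(counts)), key=upper_bound)


# Two hypothetical strategies, e.g. uncertainty sampling and random sampling.
untried_choice = ucb1_pick(counts=[0, 3], rewards=[0.0, 1.5], t=3)
greedy_choice = ucb1_pick(counts=[1, 1], rewards=[1.0, 0.0], t=2)
```

The running mean reward of each arm serves as an on-the-fly estimate of how well that strategy is doing on the current dataset.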

Active Learning by Querying Informative and Representative Examples

The proposed QUIRE approach provides a systematic way for measuring and combining the informativeness and representativeness of an unlabeled instance by incorporating the correlation among labels and is extended to multi-label learning by actively querying instance-label pairs.

A literature survey of active machine learning in the context of natural language processing

Active learning has been successfully applied to a number of natural language processing tasks, such as information extraction, named entity recognition, text categorization, part-of-speech tagging, parsing, and word sense disambiguation.

Multi-Class Active Learning by Uncertainty Sampling with Diversity Maximization

This paper proposes a semi-supervised batch mode multi-class active learning algorithm for visual concept recognition that exploits the whole active pool to evaluate the uncertainty of the data, and proposes to make the selected data as diverse as possible.

RALF: A reinforced active learning formulation for object class recognition

This paper analyzes different sampling criteria, including a novel density-based criterion, demonstrates the importance of combining exploration and exploitation sampling criteria, and proposes a novel feedback-driven framework based on reinforcement learning.

Multi-class active learning for image classification

An uncertainty measure is proposed that generalizes margin-based uncertainty to the multi-class case and is easy to compute, so that active learning can handle a large number of classes and large data sizes efficiently.
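One common way to extend margin-based uncertainty to several classes is to take the gap between the two largest predicted class probabilities. The sketch below shows that generic idea; whether it matches the paper's exact measure is an assumption, and the names `margin` and `most_uncertain` are hypothetical.

```python
from typing import Sequence


def margin(probs: Sequence[float]) -> float:
    """Gap between the largest and second-largest class probabilities."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def most_uncertain(pool_probs: Sequence[Sequence[float]]) -> int:
    """Index of the pool sample with the smallest margin (most ambiguous)."""
    return min(range(len(pool_probs)), key=lambda i: margin(pool_probs[i]))


# Predicted class probabilities for three unlabeled samples (made-up values).
pool = [
    [0.70, 0.20, 0.10],
    [0.40, 0.35, 0.25],
    [0.50, 0.10, 0.40],
]
query_index = most_uncertain(pool)
```

Only the two largest probabilities matter for each sample, so the score stays cheap to compute as the number of classes grows.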