Robustness of Bayesian Pool-Based Active Learning Against Prior Misspecification

Cuong V Nguyen, Nan Ye, Wee Sun Lee
We study the robustness of active learning (AL) algorithms against prior misspecification: whether an algorithm achieves similar performance using a perturbed prior as compared to using the true prior. In both the average and worst cases of the maximum coverage setting, we prove that all alpha-approximate algorithms are robust (i.e., near alpha-approximate) if the utility is Lipschitz continuous in the prior. We further show that robustness may not be achieved if the utility is non… 
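The question in the abstract can be made concrete with a small numerical sketch. The setup below is an illustrative assumption, not the paper's construction: a pool of 8 points, 9 threshold hypotheses, a greedy learner using the maximum Gibbs error criterion (borrowed from the related work listed further down), and expected version-space reduction under the true prior as the utility. The names `greedy_queries`, `expected_utility`, and the mixing weight `eps` are hypothetical.

```python
import numpy as np

def greedy_queries(prior, labels, h_true, budget):
    """Greedily query pool points by the maximum Gibbs error criterion.

    prior  : (H,) prior the learner believes in (possibly perturbed)
    labels : (H, N) 0/1 matrix, labels[h, x] is hypothesis h's label for x
    h_true : index of the hypothesis that generates the observed labels
    Returns a boolean mask of hypotheses consistent with all answers.
    """
    n_hyp, n_pool = labels.shape
    consistent = np.ones(n_hyp, dtype=bool)
    asked = np.zeros(n_pool, dtype=bool)
    for _ in range(budget):
        post = prior * consistent
        post = post / post.sum()          # posterior under the learner's prior
        p1 = post @ labels                # P(label = 1) for each pool point
        gibbs = 1.0 - p1**2 - (1.0 - p1)**2
        gibbs[asked] = -1.0               # never repeat a query
        x = int(np.argmax(gibbs))
        asked[x] = True
        consistent &= labels[:, x] == labels[h_true, x]
    return consistent

def expected_utility(true_prior, learner_prior, labels, budget):
    """Expected version-space reduction, measured under the true prior."""
    total = 0.0
    for h in range(len(true_prior)):
        consistent = greedy_queries(learner_prior, labels, h, budget)
        total += true_prior[h] * (1.0 - true_prior[consistent].sum())
    return total

# Assumed toy problem: threshold classifiers on a pool of 8 points.
N, H = 8, 9
labels = (np.arange(N)[None, :] >= np.arange(H)[:, None]).astype(float)

rng = np.random.default_rng(0)
true_prior = rng.dirichlet(np.ones(H))
eps = 0.2                                 # assumed perturbation strength
perturbed_prior = (1 - eps) * true_prior + eps * np.ones(H) / H

u_true = expected_utility(true_prior, true_prior, labels, budget=3)
u_pert = expected_utility(true_prior, perturbed_prior, labels, budget=3)
print(f"utility with true prior:      {u_true:.4f}")
print(f"utility with perturbed prior: {u_pert:.4f}")
```

Comparing the two printed utilities for several values of `eps` gives a rough feel for how much a greedy learner loses when its prior is off; it says nothing about the paper's formal bounds.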


Bayesian Pool-based Active Learning with Abstention Feedbacks

A Bayesian approach is taken to the problem and two new greedy algorithms are developed that learn both the classification problem and the unknown abstention rate at the same time, and it is proved that both have near-optimality guarantees.

Maximize Pointwise Cost-sensitively Submodular Functions With Budget Constraint

It is proved that two simple greedy policies for the worst-case adaptive optimization problem with budget constraint are not near-optimal, but the better of the two is near-optimal, and a combined policy is proposed that is near-optimal with respect to the optimal worst-case policy that uses half of the budget.

Bayesian Active Learning With Abstention Feedbacks

Adaptive Maximization of Pointwise Submodular Functions With Budget Constraint

This paper investigates the near-optimality of greedy algorithms for this problem with both modular and non-modular cost functions and proves that two simple greedy algorithms are not near-optimal, but the better of the two is near-optimal if the utility function satisfies pointwise submodularity and pointwise cost-sensitive submodularity, respectively.

A Comparative Survey: Benchmarking for Pool-based Active Learning

This paper surveys and compares various AL strategies used in both recently proposed and classic highly-cited methods, proposes to benchmark pool-based AL methods with a variety of datasets and quantitative metrics, and draws insights from the comparative empirical results.

ALdataset: a benchmark for pool-based active learning

To conduct easier comparative evaluation among AL methods, a benchmark task for pool-based active learning is presented, which consists of benchmarking datasets and quantitative metrics that summarize overall performance.

A Structured Perspective of Volumes on Active Learning



Near-optimal Adaptive Pool-based Active Learning with General Loss

A third greedy active learning criterion is considered, the Gibbs error criterion, and it is shown that it is able to achieve a constant factor approximation to the optimal version space reduction in a worst-case setting, where the probability of labelings that have not been eliminated is considered as the version space.

Adaptive Submodularity: Theory and Applications in Active Learning and Stochastic Optimization

It is proved that if a problem satisfies adaptive submodularity, a simple adaptive greedy algorithm is guaranteed to be competitive with the optimal policy, providing performance guarantees for both stochastic maximization and coverage.

Near-optimal Batch Mode Active Learning and Adaptive Submodular Optimization

It is proved that for batch mode active learning and more general information-parallel stochastic optimization problems that exhibit adaptive submodularity, a natural diminishing returns condition, a simple greedy strategy is competitive with the optimal batch-mode policy.

Active Learning for Probabilistic Hypotheses Using the Maximum Gibbs Error Criterion

The experimental results on a named entity recognition task and a text classification task show that the maximum Gibbs error criterion is an effective active learning criterion for noisy models.

Submodularity in Data Subset Selection and Active Learning

The connection of submodularity to the data likelihood functions for Naive Bayes and Nearest Neighbor classifiers is shown, and the data subset selection problems for these classifiers are formulated as constrained submodular maximization.

Active Learning Literature Survey

This report provides a general introduction to active learning and a survey of the literature, including a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date.

Batch mode active learning and its application to medical image classification

A framework for "batch mode active learning" is presented that applies the Fisher information matrix to select a number of informative examples simultaneously, and it is found to be more effective than the state-of-the-art algorithms for active learning.

Analysis of a greedy active learning strategy

The core search problem of active learning schemes is abstracted out, and it is proved that a popular greedy active learning rule is approximately as good as any other strategy for minimizing the number of labels queried.

Adaptive informative path planning in metric spaces

Recursive Adaptive Identification (RAId), a new polynomial-time approximation algorithm for adaptive IPP, is presented, and a polylogarithmic approximation bound is proved for the case where the robot travels in a metric space.

Scaling Up the Accuracy of Naive-Bayes Classifiers: A Decision-Tree Hybrid

A new algorithm, NBTree, is proposed, which induces a hybrid of decision-tree classifiers and Naive-Bayes classifiers: the decision-tree nodes contain univariate splits as in regular decision trees, but the leaves contain Naive-Bayesian classifiers.