A Model of Inductive Bias Learning

Jonathan Baxter
A major problem in machine learning is that of inductive bias: how to choose a learner's hypothesis space so that it is large enough to contain a solution to the problem being learnt, yet small enough to ensure reliable generalization from reasonably-sized training sets. Typically such bias is supplied by hand through the skill and insights of experts. In this paper a model for automatically learning bias is investigated. The central assumption of the model is that the learner is embedded… 

Towards a theory of out-of-distribution learning

Introduces learning efficiency, which quantifies how much a learner is able to leverage data for a given problem, regardless of whether it is an in- or out-of-distribution problem.

Learning-to-Learn Stochastic Gradient Descent with Biased Regularization

A key feature of the results is that, when the number of tasks grows and their variance is relatively small, the learning-to-learn approach has a significant advantage over learning each task in isolation by Stochastic Gradient Descent without a bias term.
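The bias term in the entry above can be illustrated with a minimal sketch: instead of regularizing the weights toward zero, each task is trained by SGD on an objective that penalizes distance from a shared bias vector w0 (e.g., learned from previous tasks). The function name, the quadratic loss, and all parameter values below are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

def sgd_biased_reg(X, y, w0, lam=0.01, lr=0.05, epochs=500, seed=0):
    """SGD for least squares with biased regularization:
    minimize (1/n) * sum_i 0.5*(x_i . w - y_i)**2 + (lam/2)*||w - w0||**2.
    w0 is a bias vector shared across tasks; lam controls how strongly
    the solution is pulled toward it."""
    rng = np.random.default_rng(seed)
    w = w0.copy()
    n = len(y)
    for _ in range(epochs):
        for i in rng.permutation(n):
            # gradient of the i-th sample's loss plus the bias-regularizer term
            grad = (X[i] @ w - y[i]) * X[i] + lam * (w - w0)
            w -= lr * grad
    return w
```

Setting w0 = 0 recovers ordinary ridge-style regularization; the learning-to-learn advantage described above comes from choosing a w0 close to the tasks' common structure, so each new task needs less data.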

Lifelong learning and inductive bias

  • Ron Amit, R. Meir
  • Computer Science
    Current Opinion in Behavioral Sciences
  • 2019

Transfer learning in a heterogeneous environment

  • Andreas Maurer, M. Pontil
  • Computer Science
    2012 3rd International Workshop on Cognitive Information Processing (CIP)
  • 2012
A method for transfer learning, in which tasks encountered in the past are used to choose a representation which is expected to work well on future tasks, and the expected error of this method is shown to be uniformly bounded by the empirical error criterion.

A theory of transfer learning with applications to active learning

This work explores a transfer learning setting in which a finite sequence of target concepts is sampled independently from an unknown distribution over a known family, and finds that the number of labeled examples required for learning with transfer is often significantly smaller than that required for learning each target independently.

Transfer Meta-Learning: Information-Theoretic Bounds and Information Meta-Risk Minimization

Novel information-theoretic upper bounds are obtained on the transfer meta-generalization gap, which measures the difference between the meta-training loss and the average loss on meta-test data from a new, randomly selected, task in the target task environment.

Boosting a Model Zoo for Multi-Task and Continual Learning

This work uses tools from statistical learning theory to show how tasks can compete for capacity, i.e., including a particular task can deteriorate the accuracy on another, and that the ideal set of tasks to train together in order to perform well on a given task differs from task to task.

A picture of the space of typical learnable tasks

We develop a technique to analyze representations learned by deep networks when they are trained on different tasks using supervised, meta-, and contrastive learning.

Multi-task Learning

  • M. Pontil
  • Computer Science
    Transfer Learning
  • 2020
This talk will describe techniques to solve the underlying optimization problems and present an analysis of the generalization performance of these learning methods which provides a proof of the superiority of multi-task learning under specific conditions.

PAC-Bayesian Meta-Learning: From Theory to Practice

A theoretical analysis using the PAC-Bayesian framework is provided and the first bound for meta-learners with unbounded loss functions is derived, thereby avoiding the reliance on nested optimization and giving rise to an optimization problem amenable to standard variational methods that scale well.



Learning internal representations

It is proved that the number of examples required to ensure good generalisation from a representation learner obeys explicit bounds, that gradient descent can be used to train neural network representations, and experimental results are reported that provide strong qualitative support for the theoretical results.

Shift of bias for inductive concept learning

It is shown that search for appropriate bias is itself a major part of the learning task, and that mechanical procedures for conducting a well-directed search for an appropriate bias can be created.

Decision Theoretic Generalizations of the PAC Model for Neural Net and Other Learning Applications

  • D. Haussler
  • Mathematics, Computer Science
    Inf. Comput.
  • 1992

Layered Concept-Learning and Dynamically Variable Bias Management

A model of concept formation is presented that views learning as a simultaneous optimization problem at three different levels, with dynamically chosen biases guiding the search for satisfactory hypotheses.

Solving a Huge Number of Similar Tasks: A Combination of Multi-Task Learning and a Hierarchical Bayesian Approach

In this paper, we propose a machine-learning solution to problems consisting of many similar prediction tasks, each of which carries a high risk of overfitting. We combine two types of approach: multi-task learning and a hierarchical Bayesian method.

Learning One More Thing

Results on learning to recognize objects from color images demonstrate superior generalization capabilities if invariances are learned and used to bias subsequent learning.

Discovering Structure in Multiple Learning Tasks: The TC Algorithm

The task-clustering algorithm TC clusters learning tasks into classes of mutually related tasks, and outperforms its non-selective counterpart in situations where only a small number of tasks is relevant.

Learning to Learn

This chapter discusses "Reinforcement Learning with Self-Modifying Policies" by J. Schmidhuber et al. and "Theoretical Models of Learning to Learn" by J. Baxter, a first step towards continual learning.

Repeat Learning Using Predicate Invention

A new predicate invention mechanism implemented in Progol4.4 is used in repeat learning experiments within a chess domain and the results indicate that significant performance increases can be achieved.

Adapting Bias by Gradient Descent: An Incremental Version of Delta-Bar-Delta

Presents a new algorithm, Incremental Delta-Bar-Delta (IDBD), for learning appropriate biases based on previous learning experience, along with a novel interpretation of IDBD as an incremental form of hold-one-out cross-validation.
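The bias learned by IDBD is a per-weight step size, adapted online from the correlation between the current gradient and a trace of recent weight changes. The following is a minimal NumPy sketch of the standard IDBD update rules for linear prediction; the function name and parameter defaults are illustrative assumptions.

```python
import numpy as np

def idbd(X, y, theta=0.01, beta0=np.log(0.05)):
    """Incremental Delta-Bar-Delta for linear prediction.
    Each weight w_i carries a log step size beta_i that is itself adapted
    by a meta step size theta, using a trace h_i of recent weight changes."""
    n, d = X.shape
    w = np.zeros(d)            # prediction weights
    beta = np.full(d, beta0)   # per-weight log step sizes
    h = np.zeros(d)            # trace of recent weight changes
    for x, target in zip(X, y):
        delta = target - w @ x           # prediction error
        beta += theta * delta * x * h    # meta-update of the log step sizes
        alpha = np.exp(beta)             # per-weight step sizes
        w += alpha * delta * x           # LMS-style weight update
        # decay the trace and mix in the latest weight change
        h = h * np.clip(1.0 - alpha * x * x, 0.0, None) + alpha * delta * x
    return w, np.exp(beta)
```

Because the step sizes live in log space, they stay positive and can range over orders of magnitude, which is what lets IDBD assign large learning rates to relevant inputs and small ones to irrelevant or noisy inputs.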