• Corpus ID: 771841

Adapting Bias by Gradient Descent: An Incremental Version of Delta-Bar-Delta

  • Richard S. Sutton
  • Published in AAAI Conference on Artificial Intelligence, 12 July 1992
  • Computer Science
Appropriate bias is widely viewed as the key to efficient learning and generalization. I present a new algorithm, the Incremental Delta-Bar-Delta (IDBD) algorithm, for the learning of appropriate biases based on previous learning experience. The IDBD algorithm is developed for the case of a simple, linear learning system--the LMS or delta rule with a separate learning-rate parameter for each input. The IDBD algorithm adjusts the learning-rate parameters, which are an important form of bias for… 
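
The truncated abstract above does not state the update rules, so the following is only a minimal sketch, assuming the per-input update equations commonly given for IDBD: a log learning rate `beta[i]` per input, a memory trace `h[i]`, and a meta-step-size `theta`. The function names, the two-input toy task, and all parameter values are hypothetical choices made for illustration and are not taken from the paper.

```python
import math
import random


def idbd_update(w, beta, h, x, target, theta=0.01):
    """Apply one IDBD step to a linear unit, updating w, beta and h in place.

    w     -- weights
    beta  -- per-input log learning rates (alpha_i = exp(beta_i))
    h     -- per-input traces of recent weight changes
    theta -- meta-step-size (assumed value, not from the paper)
    Returns the prediction error for this example.
    """
    y = sum(wi * xi for wi, xi in zip(w, x))
    delta = target - y
    for i, xi in enumerate(x):
        # Meta-level gradient step on the log learning rate.
        beta[i] += theta * delta * xi * h[i]
        alpha = math.exp(beta[i])
        # Ordinary LMS (delta rule) step with the per-input learning rate.
        w[i] += alpha * delta * xi
        # Decay the trace (clipped at zero) and add the latest weight change.
        h[i] = h[i] * max(0.0, 1.0 - alpha * xi * xi) + alpha * delta * xi
    return delta


def demo(steps=2000, seed=0):
    """Toy task (hypothetical): the target depends on input 0 only."""
    rng = random.Random(seed)
    n = 2
    w = [0.0] * n
    beta = [math.log(0.05)] * n
    h = [0.0] * n
    for _ in range(steps):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        idbd_update(w, beta, h, x, target=2.0 * x[0])
    alpha = [math.exp(b) for b in beta]
    return w, alpha


if __name__ == "__main__":
    weights, rates = demo()
    print("weights:", weights)
    print("learning rates:", rates)
```

In this toy setting the learning rate of the relevant input should end up larger than that of the irrelevant one, which is the sense in which the learning-rate parameters act as a learned bias.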

Evidence that Incremental Delta-Bar-Delta Is an Attribute-Efficient Linear Learner

This paper presents data arguing that the Incremental Delta-Bar-Delta (IDBD) second-order gradient-descent algorithm is attribute-efficient, performs similarly to Winnow on tasks with many irrelevant attributes, and does better than Winnow on a task where Winnow does poorly.

Learning Feature Relevance Through Step Size Adaptation in Temporal-Difference Learning

This paper examines an instance of meta-learning in which feature relevance is learned by adapting the step-size parameters of stochastic gradient descent, and extends IDBD to temporal-difference learning, a form of learning that is effective in sequential, non-i.i.d. problems.

An Online Backpropagation Algorithm with Validation Error-Based Adaptive Learning Rate

The proposed algorithm is a two-phase heuristic method that converges rapidly and outperforms standard Backpropagation in terms of generalization when the size of the training set is reduced.

Tuning-free step-size adaptation

This paper introduces a series of modifications and normalizations to the IDBD method that together eliminate the need to tune the meta-step-size parameter to the particular problem, and shows that the resulting overall algorithm, called Autostep, performs as well as or better than the existing step-size adaptation methods on a number of idealized and robot prediction problems and does not require any tuning of its meta-step-size parameter.

Sparse Incremental Delta-Bar-Delta for System Identification

Simulations demonstrate that the proposed sparse IDBD algorithm is superior to the competing algorithms in sparse system identification, and can speed up convergence if the system of interest is indeed sparse.

Adaptation to Best Fit Learning Rate in Batch Gradient Descent

A method for adapting the learning rate is presented, along with a solution to the problems of slow convergence and explosion of the algorithm.

An Actor-critic Algorithm for Learning Rate Learning

This work proposes an algorithm to automatically learn learning rates using actor-critic methods from reinforcement learning, which leads to good convergence of SGD and can prevent overfitting to a certain extent, resulting in better performance than human-designed competitors.

Online Local Gain Adaptation for Multi-Layer Perceptrons

The resulting ELK1 (extended, linearized K1) algorithm is only slightly more expensive computationally than alternative proposals, does not require an arbitrary smoothing parameter, and clearly outperforms these alternatives as well as stochastic gradient descent with momentum.

Vector Step-size Adaptation for Continual, Online Prediction

An instance of AdaGain that combines meta-descent with RMSProp is introduced; it is particularly robust across several prediction problems and is competitive with the state-of-the-art method on a large-scale time-series prediction problem on real data from a mobile robot.

TIDBD: Adapting Temporal-difference Step-sizes Through Stochastic Meta-descent

TIDBD is able to find appropriate step-sizes in both stationary and non-stationary prediction tasks, outperforming ordinary TD methods and TD methods with scalar step-size adaptation; it can differentiate between features that are relevant and irrelevant for a given task, performing representation learning; and on a real-world robot prediction task it outperforms ordinary TD methods and TD methods augmented with AlphaBound and RMSprop.



Increased rates of convergence through learning rate adaptation

Layered Concept-Learning and Dynamically Variable Bias Management

A model of concept formation is presented that views learning as a simultaneous optimization problem at three different levels, with dynamically chosen biases guiding the search for satisfactory hypotheses.

Experimental Analysis of the Real-time Recurrent Learning Algorithm

A series of simulation experiments are used to investigate the power and properties of the real-time recurrent learning algorithm, a gradient-following learning algorithm for completely recurrent networks running in continually sampled time.

Goal Seeking Components for Adaptive Intelligence: An Initial Assessment.

It is shown that components designed with attention to the temporal aspects of reinforcement learning can acquire knowledge about feedback pathways in which they are embedded and can use this knowledge to seek their preferred inputs, thus combining pattern recognition, search, and control functions.

Practical Characteristics of Neural Network and Conventional Pattern Classifiers on Artificial and Speech Problems

The results suggest that classifier selection should often depend more heavily on practical considerations concerning memory and computation resources, and restrictions on training and classification times than on error rate.

Acceleration Techniques for the Backpropagation Algorithm

Backpropagation converges slowly, even for medium-sized network problems, because of the usually large dimension of the weight space and the particular shape of the error surface at each iteration point.

ALCOVE: an exemplar-based connectionist model of category learning.

ALCOVE selectively attends to relevant stimulus dimensions, can account for a form of base-rate neglect, does not suffer catastrophic forgetting, and can exhibit 3-stage learning of high-frequency exceptions to rules, whereas such effects are not easily accounted for by models using other combinations of representation and learning method.

Concept acquisition through representational adjustment

This thesis promotes the hypothesis that the necessary abstractions can be learned and presents a model that relies on a weighted, symbolic description of concepts that should scale up to larger tasks than those studied and have a number of potential applications.

Accelerated Stochastic Approximation

Convergence with probability 1 is proved for the multidimensional analog of the Kesten accelerated stochastic approximation algorithm.

Adaptive Signal Processing

  • S. Alexander
  • Computer Science
    Texts and Monographs in Computer Science
  • 1986