# On Weight-Noise-Injection Training

```bibtex
@inproceedings{Ho2008OnWT,
  title     = {On Weight-Noise-Injection Training},
  author    = {Kevin I.-J. Ho and Andrew Chi-Sing Leung and John Sum},
  booktitle = {ICONIP},
  year      = {2008}
}
```

While injecting weight noise during training has been proposed for more than a decade as a way to improve the convergence, generalization, and fault tolerance of a neural network, little theoretical work has addressed its convergence proof or the objective function it minimizes. By applying the Gladyshev Theorem, it is shown that injecting weight noise while training an RBF network converges almost surely. Moreover, the corresponding objective function is essentially the mean…
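The scheme the paper analyzes can be sketched as online gradient descent in which fresh additive noise is injected into the weights at every step: the error gradient is evaluated at the noisy weights, but the noise-free weights are the ones updated. A minimal sketch for an RBF network with fixed Gaussian centres and trainable output weights; all hyperparameter values here are illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression data
X = np.linspace(-3, 3, 200)[:, None]
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)

# RBF network: fixed Gaussian centres, trainable output weights w
centres = np.linspace(-3, 3, 10)[:, None]
width = 0.5

def phi(X):
    # Design matrix of Gaussian basis responses, shape (n_samples, n_centres)
    d2 = (X - centres.T) ** 2
    return np.exp(-d2 / (2 * width ** 2))

w = np.zeros(len(centres))
lr, sigma = 0.05, 0.01  # learning rate and weight-noise std (assumed values)

for step in range(2000):
    i = rng.integers(len(X))                            # online (per-sample) update
    w_noisy = w + sigma * rng.standard_normal(w.shape)  # inject additive weight noise
    h = phi(X[i:i + 1])[0]
    err = h @ w_noisy - y[i]    # prediction error at the NOISY weights
    w -= lr * err * h           # but the noise-free weights are updated

mse = np.mean((phi(X) @ w - y) ** 2)
```

The almost-sure convergence result says that, under suitable conditions on the learning rate and noise, this stochastic recursion settles despite the injected perturbations.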

## 23 Citations

SNIWD: Simultaneous Weight Noise Injection with Weight Decay for MLP Training

- Computer Science · ICONIP
- 2009

Simulation results show that SNIWD improves convergence, enforces small magnitudes on the network parameters (input weights, input biases, and output weights), and gives the network fault-tolerance ability similar to that of the pure noise-injection approach.

Convergence analysis of on-line weight noise injection training algorithms for MLP networks

- 2010

Injecting weight noise during training has been proposed for almost two decades as a simple technique to improve fault tolerance and generalization of a multilayer perceptron (MLP). However, little…

Note on Weight Noise Injection During Training a MLP

- 2009

Although many analytical works have investigated the change in prediction error of a trained NN when its weights are injected with noise, few of them have truly investigated the…

Convergence Analysis of Multiplicative Weight Noise Injection During Training

- Mathematics · 2010 International Conference on Technologies and Applications of Artificial Intelligence
- 2010

Injecting weight noise during training has been proposed for almost two decades as a simple technique to improve fault tolerance and generalization of a multilayer perceptron (MLP). However, little…

Objective Functions of Online Weight Noise Injection Training Algorithms for MLPs

- Computer Science, Medicine · IEEE Transactions on Neural Networks
- 2011

This work shows that the objective function of the weight noise injection algorithm is different from the prediction error of a faulty MLP whose weights are affected by multiplicative weight noise.

Convergence Analyses on On-Line Weight Noise Injection-Based Training Algorithms for MLPs

- Mathematics, Computer Science · IEEE Transactions on Neural Networks and Learning Systems
- 2012

This paper studies the convergence of two weight-noise-injection-based training algorithms, multiplicative weight noise injection with weight decay and additive weight noise injection with weight decay, applied to multilayer perceptrons with either linear or sigmoid output nodes.
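The two update rules studied there differ only in how the noise enters the weights before the gradient is evaluated; both subtract a weight-decay term computed from the noise-free weights. A hypothetical sketch of one online step (the function name and all hyperparameter values are my own, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(1)

def noisy_step(w, grad_at, lr=0.01, lam=1e-3, sigma=0.05, multiplicative=False):
    """One online update with weight noise injection plus weight decay.

    grad_at(w_noisy) must return the loss gradient evaluated at the noisy
    weights. Hyperparameter values are illustrative assumptions.
    """
    b = sigma * rng.standard_normal(w.shape)
    # Additive noise perturbs weights by b; multiplicative noise scales them by (1 + b).
    w_noisy = w * (1.0 + b) if multiplicative else w + b
    # Gradient at the noisy weights, weight decay on the noise-free weights.
    return w - lr * (grad_at(w_noisy) + lam * w)
```

For example, on the quadratic loss 0.5‖w‖², whose gradient is w itself, repeated application of either variant drives the weights toward zero in expectation.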

On Node-Fault-Injection Training of an RBF Network

- Computer Science · ICONIP
- 2008

Two different node-fault-injection-based online learning algorithms, (1) injecting multinode fault during training and (2) weight decay combined with injecting multinode fault, are studied, and their almost sure convergence is proved.

Empirical studies on weight noise injection based online learning algorithms

- 2010

While weight noise injection during training has been adopted to attain fault-tolerant neural networks (NNs), theoretical and empirical studies on the online algorithms developed based on these…

Convergence and objective functions of noise-injected multilayer perceptrons with hidden multipliers

- Computer Science · Neurocomputing
- 2021

A noise-injected training scheme is proposed that takes both multiplicative noise and additive noise into consideration; applications to several UCI datasets demonstrate that the proposed algorithms have efficient pruning ability and superior generalization ability.

Weight Noise Injection-Based MLPs With Group Lasso Penalty: Asymptotic Convergence and Application to Node Pruning

- Medicine, Computer Science · IEEE Transactions on Cybernetics
- 2019

A group lasso penalty term is used as a regularizer, where a group is defined as the set of weights connecting a node to the nodes in the preceding layer; this enables pruning of redundant hidden nodes.

## References

Showing 1–10 of 38 references

On Objective Function, Regularizer, and Prediction Error of a Learning Algorithm for Dealing With Multiplicative Weight Noise

- Mathematics, Computer Science · IEEE Transactions on Neural Networks
- 2009

The study shows that under some mild conditions the derived regularizer is essentially the same as a weight decay regularizer, which explains why applying weight decay can also improve the fault-tolerant ability of a radial basis function (RBF) network with multiplicative weight noise.

On Node-Fault-Injection Training of an RBF Network

- Computer Science · ICONIP
- 2008

Two different node-fault-injection-based online learning algorithms, (1) injecting multinode fault during training and (2) weight decay combined with injecting multinode fault, are studied, and their almost sure convergence is proved.

The Effects of Adding Noise During Backpropagation Training on a Generalization Performance

- Computer Science, Mathematics · Neural Computation
- 1996

It is shown that input noise and weight noise encourage the neural-network output to be a smooth function of the input or its weights, respectively; in the weak-noise limit, noise added to the output of the network only changes the objective function by a constant, so it cannot improve generalization.

An analysis of noise in recurrent neural networks: convergence and generalization

- Mathematics, Computer Science · IEEE Trans. Neural Networks
- 1996

Theoretical results show that applying a controlled amount of noise during training may improve convergence and generalization performance, and it is predicted that best overall performance can be achieved by injecting additive noise at each time step.

A Learning Algorithm for Fault Tolerant Feedforward Neural Networks

- Computer Science
- 1996

A new learning algorithm is proposed to enhance the fault-tolerance ability of feedforward neural networks by focusing on the links (weights) that may cause output errors when they suffer open faults.

Obtaining Fault Tolerant Multilayer Perceptrons Using an Explicit Regularization

- Mathematics, Computer Science · Neural Processing Letters
- 2004

The algorithm presented explicitly adds a new term to the backpropagation learning rule, related to the mean square error degradation in the presence of weight deviations, in order to minimize this degradation.

A Fault-Tolerant Regularizer for RBF Networks

- Computer Science, Medicine · IEEE Transactions on Neural Networks
- 2008

Compared with some conventional approaches, including weight-decay-based regularizers, this approach has better fault-tolerant ability; the empirical study shows that it can also improve the generalization ability of a fault-free RBF network.

Training with Noise is Equivalent to Tikhonov Regularization

- Mathematics, Computer Science · Neural Computation
- 1995

This paper shows that for the purposes of network training, the regularization term can be reduced to a positive semi-definite form that involves only first derivatives of the network mapping.
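Bishop's result can be stated concretely: for zero-mean input noise of small variance $\sigma^2$, minimizing the expected squared error is, to first order, equivalent to minimizing the noise-free error plus a Tikhonov-type penalty on first derivatives of the network mapping (the notation below is mine, not the paper's):

$$
\tilde{E} \;\approx\; E + \sigma^2\,\Omega,
\qquad
\Omega = \frac{1}{2}\sum_{n}\left\|\frac{\partial y}{\partial \mathbf{x}}\right\|^2_{\mathbf{x}=\mathbf{x}_n},
$$

where the sum runs over the training inputs $\mathbf{x}_n$. This positive semi-definite first-derivative form is what makes noise training tractable as an explicit regularizer.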

Modifying training algorithms for improved fault tolerance

- Computer Science · Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN'94)
- 1994

Three approaches to improving the fault tolerance of neural networks are presented, each of which obtains better robustness than plain backpropagation training and compares favorably with other approaches.

A Quantitative Study of Fault Tolerance, Noise Immunity, and Generalization Ability of MLPs

- Mathematics, Medicine · Neural Computation
- 2000

The measurements introduced here are explicitly related to the mean squared error degradation in the presence of perturbations; they thus constitute a selection criterion between different weight configurations and allow us to predict the degradation in learning performance of an MLP when its weights or inputs deviate from their nominal values.