Learning State Space Trajectories in Recurrent Neural Networks
@article{Pearlmutter1989LearningSS,
  title   = {Learning State Space Trajectories in Recurrent Neural Networks},
  author  = {Barak A. Pearlmutter},
  journal = {Neural Computation},
  year    = {1989},
  volume  = {1},
  pages   = {263--269}
}
Many neural network learning procedures compute gradients of the errors on the output layer of units after they have settled to their final values. We describe a procedure for finding $\partial E/\partial w_{ij}$, where $E$ is an error functional of the temporal trajectory of the states of a continuous recurrent network and $w_{ij}$ are the weights of that network. Computing these quantities allows one to perform gradient descent in the weights to minimize $E$. Simulations in which networks are taught to move through limit…
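The heart of the procedure is differentiating a trajectory error with respect to every recurrent weight. As a concrete illustration (a minimal sketch, not Pearlmutter's exact continuous-time adjoint equations), the code below computes $\partial E/\partial w_{ij}$ for an Euler-discretized continuous recurrent network by backpropagation through time and validates one component against a finite difference; the network size, tanh nonlinearity, and target trajectory are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, dt = 4, 50, 0.1                  # units, time steps, Euler step (assumed)
W = 0.5 * rng.standard_normal((N, N))  # recurrent weights w_ij
I = 0.1 * rng.standard_normal(N)       # constant external input
d = np.sin(np.linspace(0, 2 * np.pi, T)[:, None] + np.arange(N))  # target d_i(t)

def trajectory_error_and_grad(W):
    """E = sum_t dt * 0.5 * ||y_t - d_t||^2 and dE/dW via backprop through time."""
    y = np.zeros((T + 1, N))           # y[0] is the initial state
    a = np.zeros((T, N))               # pre-activations a_t = W y_t
    for t in range(T):                 # forward Euler: dy/dt = -y + tanh(Wy) + I
        a[t] = W @ y[t]
        y[t + 1] = y[t] + dt * (-y[t] + np.tanh(a[t]) + I)
    E = 0.5 * dt * np.sum((y[1:] - d) ** 2)
    g = np.zeros(N)                    # g accumulates dE/dy_{t+1} backward in time
    dW = np.zeros_like(W)
    for t in reversed(range(T)):
        g = g + dt * (y[t + 1] - d[t])            # local error term at step t+1
        s = (1.0 - np.tanh(a[t]) ** 2) * (dt * g) # back through the nonlinearity
        dW += np.outer(s, y[t])                   # dE/dW contribution of this step
        g = (1.0 - dt) * g + W.T @ s              # push error back to y_t
    return E, dW

E, dW = trajectory_error_and_grad(W)
# Finite-difference check of one weight, to validate the analytic gradient.
eps, (i, j) = 1e-6, (1, 2)
Wp = W.copy(); Wp[i, j] += eps
Ep, _ = trajectory_error_and_grad(Wp)
print("analytic:", dW[i, j], " numeric:", (Ep - E) / eps)
```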
745 Citations
Generating network trajectories using gradient descent in state space
- Computer Science, 1998 IEEE International Joint Conference on Neural Networks Proceedings. IEEE World Congress on Computational Intelligence (Cat. No.98CH36227)
- 1998
A simple, local learning algorithm is introduced that gradually minimizes an error function over the neural states of a general network, based on linearizing the neurodynamics, which are interpreted as constraints on the different network variables.
A simplex optimization approach for recurrent neural network training and for learning time-dependent trajectory patterns
- Computer Science, IJCNN'99. International Joint Conference on Neural Networks. Proceedings (Cat. No.99CH36339)
- 1999
This work describes a learning procedure that requires no gradient evaluations, and hence offers significant implementation advantages, exploiting the inherent properties of nonlinear simplex optimization to realize them.
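As a rough illustration of the gradient-free idea this summary describes (the paper's actual architecture and task are not reproduced here), the sketch below fits a tiny discrete-time recurrent network to a target trajectory by nonlinear simplex (Nelder-Mead) search over the flattened weight vector, using scipy.optimize.minimize.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
N, T = 3, 30
target = np.sin(np.linspace(0, 2 * np.pi, T))      # desired output of unit 0

def trajectory_error(w_flat):
    W = w_flat.reshape(N, N)
    y = np.zeros(N)
    err = 0.0
    for t in range(T):
        # Brief input kick at t=0 so the dynamics leave the zero state.
        y = np.tanh(W @ y + (1.0 if t == 0 else 0.0))
        err += (y[0] - target[t]) ** 2
    return err

res = minimize(trajectory_error, rng.standard_normal(N * N),
               method="Nelder-Mead", options={"maxiter": 5000, "xatol": 1e-6})
print("final trajectory error:", res.fun)
```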
Initial state training procedure improves dynamic recurrent networks with time-dependent weights
- Computer Science, IEEE Trans. Neural Networks
- 2001
The problem of learning multiple continuous trajectories by means of recurrent neural networks with (in general) time-varying weights is addressed. The learning process is transformed into an optimal…
Recurrent neural networks for temporal learning of time series
- Computer Science, IEEE International Conference on Neural Networks
- 1993
The learning and performance behaviors of recurrent 3-layer perceptrons for time-dependent input and output data are studied and the Ring Array Processor is used to cope with the increased learning time.
A Learning Algorithm for Continually Running Fully Recurrent Neural Networks
- Computer Science, Neural Computation
- 1989
The exact form of a gradient-following learning algorithm for completely recurrent networks running in continually sampled time is derived and used as the basis for practical algorithms for temporal…
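The algorithm summarized here, real-time recurrent learning (RTRL), carries sensitivities $p^k_{ij} = \partial y_k/\partial w_{ij}$ forward in time so a gradient is available at every step without unrolling. A minimal discrete-time sketch, in which the echo task, sizes, and tanh nonlinearity are assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, lr = 3, 2000, 0.02
W = 0.3 * rng.standard_normal((N, N))
u = np.sin(0.1 * np.arange(T))          # input signal fed to unit 1
target = np.roll(u, 1)                  # unit 0 should echo u, delayed one step

y = np.zeros(N)
P = np.zeros((N, N, N))                 # P[k, i, j] = dy_k / dW_ij
for t in range(T):
    a = W @ y
    a[1] += u[t]                        # external drive enters unit 1
    y_new = np.tanh(a)
    fprime = 1.0 - y_new ** 2
    # RTRL recursion: P'[k,i,j] = f'(a_k) * (delta_ki y_j + sum_l W_kl P[l,i,j])
    P_new = np.einsum("kl,lij->kij", W, P)
    P_new[np.arange(N), np.arange(N), :] += y   # the delta_ki * y_j term
    P_new *= fprime[:, None, None]
    e = y_new[0] - target[t]            # instantaneous output error on unit 0
    W -= lr * e * P_new[0]              # online gradient step, no unrolling
    y, P = y_new, P_new
print("final squared error:", e ** 2)
```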
Bifurcations of Recurrent Neural Networks in Gradient Descent Learning
- Computer Science
- 1993
Some of the factors underlying successful training of recurrent networks are investigated, such as choice of initial connections, choice of input patterns, teacher forcing, and truncated learning equations.
Recurrent Backpropagation and the Dynamical Approach to Adaptive Neural Computation
- Computer Science, Neural Computation
- 1989
It is now possible to efficiently compute the error gradients for networks that have temporal dynamics, which opens applications to a host of problems in systems identification and control.
Learning scheme for recurrent neural network by genetic algorithm
- Computer Science, Proceedings of 1993 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '93)
- 1993
A new learning scheme for recurrent neural networks that uses a genetic algorithm to determine the interconnection weights is presented, and the GA approach is compared with backpropagation through time.
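A hedged sketch of that genetic-algorithm alternative: evolve a population of weight matrices against a trajectory error instead of following its gradient. Population size, mutation scale, and the fitness task below are assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T, pop, gens = 3, 30, 40, 200
target = np.sin(np.linspace(0, 2 * np.pi, T))

def error(W):
    y = np.zeros(N)
    e = 0.0
    for t in range(T):
        y = np.tanh(W @ y + (1.0 if t == 0 else 0.0))  # input kick at t=0
        e += (y[0] - target[t]) ** 2
    return e

population = [0.5 * rng.standard_normal((N, N)) for _ in range(pop)]
for g in range(gens):
    scores = np.array([error(W) for W in population])
    order = np.argsort(scores)                      # lower error = fitter
    parents = [population[i] for i in order[: pop // 2]]
    children = []
    for _ in range(pop - len(parents)):
        a, b = rng.choice(len(parents), 2, replace=False)
        mask = rng.random((N, N)) < 0.5             # uniform crossover
        child = np.where(mask, parents[a], parents[b])
        child = child + 0.05 * rng.standard_normal((N, N))  # Gaussian mutation
        children.append(child)
    population = parents + children
print("best trajectory error:", min(error(W) for W in population))
```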
Learning temporal patterns in recurrent neural networks
- Computer Science, 1990 IEEE International Conference on Systems, Man, and Cybernetics Conference Proceedings
- 1990
General learning algorithms for recurrent neural networks that can be used for both discrete-time and continuous-time models are described. They are based on the notion of the derivatives of mappings…
Training trajectories by continuous recurrent multilayer networks
- Computer Science, IEEE Trans. Neural Networks
- 2002
A training algorithm based upon a variational formulation of Pontryagin's maximum principle is proposed for continuous recurrent neural networks whose feedforward parts are multilayer perceptrons that approximate a general nonlinear dynamic system with arbitrary accuracy.
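For orientation, the variational structure on which such a Pontryagin-style training algorithm rests can be sketched as follows; the symbols ($y$ state, $p$ costate, $w$ weights, $\ell$ trajectory cost) are generic, not the paper's notation:

```latex
% Trajectory cost minimized subject to the network dynamics as a constraint.
\begin{aligned}
  &\text{minimize } E = \int_{t_0}^{t_1} \ell\bigl(y(t), d(t)\bigr)\,dt
   \quad \text{subject to} \quad \dot{y} = f(y, w),\\
  &H(y, p, w) = \ell(y, d) + p^{\top} f(y, w), \qquad
   \dot{p} = -\frac{\partial H}{\partial y}, \quad p(t_1) = 0,\\
  &\frac{\partial E}{\partial w}
   = \int_{t_0}^{t_1} \frac{\partial H}{\partial w}\,dt
   = \int_{t_0}^{t_1} p^{\top} \frac{\partial f}{\partial w}\,dt .
\end{aligned}
```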
References
Showing 1-10 of 27 references
A Learning Algorithm for Continually Running Fully Recurrent Neural Networks
- Computer Science, Neural Computation
- 1989
The exact form of a gradient-following learning algorithm for completely recurrent networks running in continually sampled time is derived and used as the basis for practical algorithms for temporal…
Generalization of back-propagation to recurrent neural networks
- Computer Science, Physical Review Letters
- 1987
An adaptive neural network with asymmetric connections is introduced that bears a resemblance to the master/slave network of Lapedes and Farber but it is architecturally simpler.
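A hedged sketch of fixed-point recurrent backpropagation in the spirit of this generalization: relax the network to a fixed point, relax a linear adjoint network to its own fixed point, then read the weight gradient off the two settled states. Sizes, the target, and relaxation counts are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 5
W = 0.3 * rng.standard_normal((N, N))   # asymmetric recurrent weights
I = rng.standard_normal(N)              # constant input
d = np.tanh(rng.standard_normal(N))     # target fixed-point state

for step in range(500):
    y = np.zeros(N)
    for _ in range(200):                # relax to a fixed point y = tanh(Wy + I)
        y = np.tanh(W @ y + I)
    Dp = 1.0 - y ** 2                   # sigma'(a) evaluated at the fixed point
    e = y - d
    z = np.zeros(N)
    for _ in range(200):                # adjoint relaxation z = D (W^T z + e)
        z = Dp * (W.T @ z + e)
    W -= 0.1 * np.outer(z, y)           # dE/dW_ij = z_i * y_j at the fixed points
print("fixed-point error:", 0.5 * np.sum((y - d) ** 2))
```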
Learning state space trajectories in recurrent neural networks: a preliminary report
- Computer Science
- 1988
A procedure for learning state space trajectories in recurrent neural networks by minimizing an error functional of the trajectory is described.
Generalization of backpropagation with application to a recurrent gas market model
- Mathematics, Neural Networks
- 1988
Analysis of Recurrent Backpropagation
- Computer Science
- 1988
This paper attempts a systematic analysis of the recurrent backpropagation (RBP) algorithm, introducing a number of new results, and shows that introducing a nonlocal search technique such as simulated annealing has a dramatic effect on a network's ability to learn patterns.
Phoneme recognition using time-delay neural networks
- Computer Science, IEEE Trans. Acoust. Speech Signal Process.
- 1989
The authors present a time-delay neural network (TDNN) approach to phoneme recognition which is characterized by two important properties: (1) using a three-layer arrangement of simple computing…
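The time-delay idea can be sketched compactly: each hidden unit sees a short sliding window of past input frames, giving shift-invariant temporal feature detectors. Layer sizes and the random input below are assumptions, not the paper's phoneme task.

```python
import numpy as np

rng = np.random.default_rng(6)
frames, feat, delay, hidden = 40, 16, 3, 8
x = rng.standard_normal((frames, feat))          # input feature frames
W1 = 0.1 * rng.standard_normal((hidden, delay * feat))

# Slide a window of `delay` consecutive frames over the input sequence,
# applying the same weights at every position (shift invariance).
h = np.array([np.tanh(W1 @ x[t:t + delay].ravel())
              for t in range(frames - delay + 1)])
print("hidden activation sequence shape:", h.shape)  # (frames-delay+1, hidden)
```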
A Steepest-Ascent Method for Solving Optimum Programming Problems
- Business
- 1962
By repeating this process in small steps, a control variable program that minimizes one quantity and yields specified values of other terminal quantities can be approached as closely as desired.
Optimization by Simulated Annealing
- Physics, Science
- 1983
A detailed analogy with annealing in solids provides a framework for optimization of the properties of very large and complex systems.
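A minimal sketch of the simulated-annealing idea the citation summarizes: accept worse parameter moves with a temperature-dependent probability so the search can escape local minima. The one-dimensional toy objective stands in for a network error surface and is an assumption.

```python
import numpy as np

rng = np.random.default_rng(5)

def objective(w):                       # a bumpy 1-D stand-in for network error
    return w ** 2 + 2.0 * np.sin(5.0 * w)

w, temp = 3.0, 2.0
best = w
for step in range(2000):
    cand = w + 0.3 * rng.standard_normal()          # random perturbation
    delta = objective(cand) - objective(w)
    if delta < 0 or rng.random() < np.exp(-delta / temp):  # Metropolis rule
        w = cand
    if objective(w) < objective(best):
        best = w
    temp *= 0.998                       # geometric cooling schedule
print("best w:", best, "objective:", objective(best))
```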