Stochastic Natural Gradient Descent by estimation of empirical covariances


Stochastic relaxation aims at finding the minimum of a fitness function by identifying a sequence of distributions, within a given statistical model, that minimizes the expected value of the fitness function. Different algorithms fit this framework, and they differ in the policy they use to choose the next distribution in the model. In this paper we present two algorithms in the stochastic relaxation framework for the optimization of real-valued functions defined over binary variables: Stochastic Gradient Descent (SGD) and Stochastic Natural Gradient Descent (SNGD). Like Estimation of Distribution Algorithms (EDAs), these algorithms sample from a stochastic model, but they replace the estimation of the model from the population with a direct update of the model parameters through stochastic gradient descent. Both SGD and SNGD use statistical models in the exponential family; they differ in that SNGD uses the natural gradient, first proposed by Amari [1] in the context of Information Geometry. Due to the properties of the exponential family, both the gradient and the natural gradient can be evaluated in terms of covariances between the fitness function and the sufficient statistics of the exponential family. As computing the exact gradient is infeasible, we approximate it by evaluating empirical covariances. We test the performance of our algorithms on different standard benchmarks, and we compare the results with other well-known meta-heuristics in the framework of EDAs.
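The covariance-based estimates described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and makes several assumptions that are not in the abstract: the model is the independent (product of Bernoulli) exponential family with sufficient statistics T_i(x) = x_i, the Fisher information matrix is estimated as the empirical covariance of the sufficient statistics, and a small `ridge` term is added so the linear system stays solvable when a sampled variable is constant. The function names, sample size, and step size are hypothetical.

```python
import numpy as np


def sample(theta, n_samples, rng):
    """Draw binary vectors from p(x; theta) proportional to exp(sum_i theta_i * x_i)."""
    p = 1.0 / (1.0 + np.exp(-theta))  # marginal P(x_i = 1)
    return (rng.random((n_samples, theta.size)) < p).astype(float)


def empirical_gradients(x, f_values, ridge=1e-3):
    """Estimate the gradient and natural gradient of E[f] from a sample.

    x        : (n_samples, n_vars) array of sufficient statistics T(x) = x
    f_values : (n_samples,) array of fitness values
    ridge    : assumed regularizer added to the covariance matrix
    """
    n = x.shape[0]
    t_centered = x - x.mean(axis=0)
    f_centered = f_values - f_values.mean()
    cov_f_t = t_centered.T @ f_centered / n   # empirical Cov(f, T_i): the gradient
    cov_t_t = t_centered.T @ t_centered / n   # empirical Cov(T_i, T_j): Fisher matrix
    regularized = cov_t_t + ridge * np.eye(x.shape[1])
    natural = np.linalg.solve(regularized, cov_f_t)
    return cov_f_t, natural


def sngd_minimize(fitness, n_vars, n_samples=200, step=0.5, n_iters=50, seed=0):
    """Update theta along the negative estimated natural gradient."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(n_vars)  # start from the uniform distribution
    for _ in range(n_iters):
        x = sample(theta, n_samples, rng)
        f_values = np.array([fitness(row) for row in x])
        _, natural = empirical_gradients(x, f_values)
        theta -= step * natural  # descent step, since the fitness is minimized
    return theta
```

Replacing `natural` with the first return value of `empirical_gradients` in the update gives the plain stochastic gradient variant under the same assumptions. For example, `sngd_minimize(lambda x: x.sum(), n_vars=5)` drives the parameters toward distributions concentrated on the all-zeros string.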

DOI: 10.1109/CEC.2011.5949720

Cite this paper

@inproceedings{Malag2011StochasticNG,
  title={Stochastic Natural Gradient Descent by estimation of empirical covariances},
  author={Luigi Malag{\`o} and Matteo Matteucci and Giovanni Pistone},
  booktitle={IEEE Congress on Evolutionary Computation},
  year={2011}
}