Generalized Optimization: A First Step Towards Category Theoretic Learning Theory

@inproceedings{Shiebler2021GeneralizedOA,
  title={Generalized Optimization: A First Step Towards Category Theoretic Learning Theory},
  author={Dan Shiebler},
  booktitle={ICO},
  year={2021}
}
  • Dan Shiebler
  • Published in ICO 20 September 2021
  • Computer Science, Mathematics
The Cartesian reverse derivative is a categorical generalization of reverse-mode automatic differentiation. We use this operator to generalize several optimization algorithms, including a straightforward generalization of gradient descent and a novel generalization of Newton's method. We then explore which properties of these algorithms are preserved in this generalized setting. First, we show that the transformation invariances of these algorithms are preserved: while generalized Newton's… 
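For concreteness, the two classical updates being generalized read, in the smooth Euclidean case, x ← x − η∇l(x) for gradient descent and x ← x − (∇²l(x))⁻¹∇l(x) for Newton's method. The numpy sketch below is illustrative only; the quadratic objective, step size, and function names are chosen here and do not come from the paper.

import numpy as np

def gradient_descent_step(grad, x, lr=0.1):
    # One gradient descent step: x <- x - lr * grad(x)
    return x - lr * grad(x)

def newton_step(grad, hess, x):
    # One Newton step: x <- x - H(x)^{-1} grad(x)
    return x - np.linalg.solve(hess(x), grad(x))

# Toy quadratic l(x) = x^T Q x / 2 - b^T x, so Newton reaches the minimizer in one step.
Q = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
grad = lambda x: Q @ x - b
hess = lambda x: Q

x0 = np.zeros(2)
x_gd = gradient_descent_step(grad, x0)   # a small step toward the minimum
x_nt = newton_step(grad, hess, x0)       # equals np.linalg.solve(Q, b)

In the smooth case the Newton step is invariant under invertible linear reparameterisations x ↦ Ax, which is the kind of transformation invariance the abstract refers to.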

References

Adaptive Subgradient Methods for Online Learning and Stochastic Optimization

This work describes and analyzes an apparatus for adaptively modifying the proximal function, which significantly simplifies setting a learning rate and results in regret guarantees provably as good as those of the best proximal function that could have been chosen in hindsight.
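To recall what adaptively scaling the step amounts to in the diagonal case, here is a minimal AdaGrad-style update in numpy; the function name, defaults, and example values are illustrative.

import numpy as np

def adagrad_step(x, g, accum, lr=0.1, eps=1e-8):
    # Accumulate squared gradients and shrink each coordinate's step size accordingly.
    accum = accum + g ** 2
    x = x - lr * g / (np.sqrt(accum) + eps)
    return x, accum

# The accumulator starts at zero and is threaded through the online updates.
x, accum = np.zeros(3), np.zeros(3)
g = np.array([0.5, -1.0, 0.0])    # a (sub)gradient observed at x
x, accum = adagrad_step(x, g, accum)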

Reverse derivative categories

A direct axiomatization of a category with a reverse derivative operation is given, in a style similar to that of Cartesian differential categories for a forward derivative, and the induced linear maps are shown to form an additively enriched category with dagger biproducts.
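The reverse chain rule such an axiomatization captures, R[g ∘ f](a, c) = R[f](a, R[g](f(a), c)), can be transcribed directly. The sketch below is illustrative (the RMap class and its method names are not from the paper); in the smooth case R[f](x, v) is the Jacobian-transpose product J_f(x)ᵀ v.

import numpy as np

class RMap:
    # A map packaged with its reverse derivative R[f] : A x B -> A.
    def __init__(self, fwd, rev):
        self.fwd = fwd          # f : A -> B
        self.rev = rev          # R[f] : A x B -> A

    def then(self, g):
        # Composition g . f with the reverse chain rule:
        # R[g . f](a, c) = R[f](a, R[g](f(a), c))
        return RMap(
            lambda a: g.fwd(self.fwd(a)),
            lambda a, c: self.rev(a, g.rev(self.fwd(a), c)),
        )

# Smooth example: f(x) = W x followed by g(y) = ||y||^2 / 2.
W = np.array([[1.0, 2.0], [0.0, 1.0]])
f = RMap(lambda x: W @ x, lambda x, v: W.T @ v)
g = RMap(lambda y: 0.5 * np.sum(y ** 2), lambda y, s: s * y)
h = f.then(g)
grad = h.rev(np.array([1.0, -1.0]), 1.0)   # equals W.T @ (W @ [1, -1])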

Backprop as Functor: A compositional perspective on supervised learning

A key contribution is the notion of a request function, which provides a structural perspective on backpropagation, gives a broad generalisation of neural networks, and links them with structures from bidirectional programming and open games.
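A rough Python transcription of that structural perspective (the field and function names below are mine, not the paper's): a learner carries implement, update, and request maps, and sequential composition threads the downstream learner's request back as the upstream learner's training target, which is where backpropagation appears.

from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Learner:
    params: Any
    implement: Callable   # (params, a) -> b
    update: Callable      # (params, a, b_target) -> new params
    request: Callable     # (params, a, b_target) -> a_target for the upstream learner

def compose(l1: Learner, l2: Learner) -> Learner:
    # Sequential composition: run l1 then l2; l2's request supplies l1's target.
    def implement(params, a):
        p1, p2 = params
        return l2.implement(p2, l1.implement(p1, a))
    def update(params, a, b):
        p1, p2 = params
        mid = l1.implement(p1, a)
        return (l1.update(p1, a, l2.request(p2, mid, b)),
                l2.update(p2, mid, b))
    def request(params, a, b):
        p1, p2 = params
        mid = l1.implement(p1, a)
        return l1.request(p1, a, l2.request(p2, mid, b))
    return Learner((l1.params, l2.params), implement, update, request)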

The simple essence of automatic differentiation

A simple, generalized AD algorithm is calculated from a simple, natural specification; the result is inherently parallel-friendly, correct by construction, and usable directly from an existing programming language with no need for new data types or a new programming style.
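One way to picture that style is to represent a differentiable map as a function returning both its value and its Jacobian, with composition given by the chain rule. The numpy sketch below is a caricature for illustration, not the paper's categorical construction (which works with abstract linear maps rather than explicit matrices).

import numpy as np

def d_compose(Dg, Df):
    # Compose derivative-carrying functions D f : a -> (f(a), J_f(a)) by the chain rule.
    def Dh(a):
        b, Jf = Df(a)
        c, Jg = Dg(b)
        return c, Jg @ Jf
    return Dh

W = np.array([[1.0, 2.0], [3.0, 4.0]])
Df = lambda x: (W @ x, W)                          # linear map, constant Jacobian
Dg = lambda y: (np.sin(y), np.diag(np.cos(y)))     # elementwise sine
Dh = d_compose(Dg, Df)
value, jacobian = Dh(np.array([0.1, 0.2]))         # jacobian == np.diag(np.cos(W @ x)) @ W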

Convex Optimization

A comprehensive introduction to the subject of convex optimization shows in detail how such problems can be solved numerically with great efficiency.

Deep Learning

Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.

Disintegration and Bayesian inversion via string diagrams

The existence of disintegration and Bayesian inversion is discussed for discrete probability, and also for measure-theoretic probability – via standard Borel spaces and via likelihoods.
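In the discrete case, Bayesian inversion is Bayes' rule applied channel-wise: a prior p(x) and a channel p(y|x) determine an inverted channel p(x|y) = p(y|x) p(x) / p(y) wherever p(y) > 0. The sketch below (distributions as dictionaries, with illustrative names and numbers) is one minimal way to compute it.

def bayesian_inversion(prior, channel):
    # prior: {x: p(x)}, channel: {x: {y: p(y|x)}} -> inverted: {y: {x: p(x|y)}}
    ys = {y for cond in channel.values() for y in cond}
    marginal = {y: sum(prior[x] * channel[x].get(y, 0.0) for x in prior) for y in ys}
    return {
        y: {x: prior[x] * channel[x].get(y, 0.0) / marginal[y] for x in prior}
        for y in ys if marginal[y] > 0
    }

# Toy example: a noisy test for a rare condition (numbers are made up).
prior = {"sick": 0.01, "healthy": 0.99}
channel = {"sick": {"pos": 0.9, "neg": 0.1}, "healthy": {"pos": 0.05, "neg": 0.95}}
posterior = bayesian_inversion(prior, channel)   # posterior["pos"]["sick"] ≈ 0.154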
