Learning Functors using Gradient Descent

Bruno Gavranovic
Neural networks are a general framework for differentiable optimization which includes many other machine learning approaches as special cases. In this paper we build a category-theoretic formalism around a neural network system called CycleGAN. CycleGAN is a general approach to unpaired image-to-image translation that has been receiving attention in recent years. Inspired by categorical database systems, we show that CycleGAN is a "schema", i.e. a specific category presented by generators…

A Probabilistic Generative Model of Free Categories
This work shows how acyclic directed wiring diagrams can model specifications for morphisms, which the model can then use to generate morphisms. The free category prior achieves competitive reconstruction performance on the Omniglot dataset.
Reverse Derivative Ascent: A Categorical Approach to Learning Boolean Circuits
This work introduces Reverse Derivative Ascent, a categorical analogue of gradient-based methods for machine learning that learns the parameters of boolean circuits directly, in contrast to existing binarised neural network approaches.
Category Theory in Machine Learning
This work aims to document the motivations, goals and common themes across applications of category theory in machine learning, touching on gradient-based learning, probability, and equivariant learning.


Backprop as Functor: A compositional perspective on supervised learning
A key contribution is the notion of request function, which provides a structural perspective on backpropagation, giving a broad generalisation of neural networks and linking it with structures from bidirectional programming and open games.
Generative Adversarial Nets
We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a
Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
The architecture introduced in this paper learns a mapping function G : X → Y using an adversarial loss such that G(X) cannot be distinguished from Y, where X and Y are images belonging to two
Adam: A Method for Stochastic Optimization
This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
Learning to learn by gradient descent by gradient descent
This paper shows how the design of an optimization algorithm can be cast as a learning problem, allowing the algorithm to learn to exploit structure in the problems of interest in an automatic way.
Decoupled Neural Interfaces using Synthetic Gradients
It is demonstrated that in addition to predicting gradients, the same framework can be used to predict inputs, resulting in models which are decoupled in both the forward and backwards pass -- amounting to independent networks which co-learn such that they can be composed into a single functioning corporation.
Augmented CycleGAN: Learning Many-to-Many Mappings from Unpaired Data
This work proposes a new model, called Augmented CycleGAN, which learns many-to-many mappings between domains, and examines it qualitatively and quantitatively on several image datasets.
Functorial data migration
Database queries and constraints via lifting problems
David I. Spivak. Mathematical Structures in Computer Science, 2013.
This paper shows that certain queries and constraints correspond to lifting problems, as found in modern approaches to algebraic topology, and explains how giving users access to certain parts of Qry(π), rather than direct access to π, improves the ability to manage the impact of schema evolution.
Improved Training of Wasserstein GANs
This work proposes an alternative to clipping weights: penalize the norm of gradient of the critic with respect to its input, which performs better than standard WGAN and enables stable training of a wide variety of GAN architectures with almost no hyperparameter tuning.
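
The penalty described in the last summary can be written out as a worked formula. The block below is a sketch of the critic loss as it is usually stated for this method, not text taken from this page: the symbols D (critic), P_r and P_g (real and generated data distributions), the interpolate x̂, and the weight λ are notation assumed here for illustration.

```latex
% Critic loss with gradient penalty (notation assumed, see lead-in).
% \hat{x} is sampled on straight lines between real and generated samples.
\begin{aligned}
\hat{x} &= \epsilon\, x + (1 - \epsilon)\, \tilde{x},
  \qquad x \sim P_r,\quad \tilde{x} \sim P_g,\quad \epsilon \sim U[0, 1], \\
L &= \mathbb{E}_{\tilde{x} \sim P_g}\big[ D(\tilde{x}) \big]
   - \mathbb{E}_{x \sim P_r}\big[ D(x) \big]
   + \lambda\, \mathbb{E}_{\hat{x}}\Big[ \big( \lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1 \big)^2 \Big]
\end{aligned}
```

The first two terms are the usual Wasserstein critic objective. The third term replaces weight clipping by pushing the norm of the critic's input gradient toward 1 at the interpolated points, with λ controlling how strongly the constraint is enforced.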