Corpus ID: 8993325

Theano: A Python framework for fast computation of mathematical expressions

@article{AlRfou2016TheanoAP,
  title={Theano: A Python framework for fast computation of mathematical expressions},
  author={Rami Al-Rfou and Guillaume Alain and Amjad Almahairi and Christof Angerm{\"u}ller and Dzmitry Bahdanau and Nicolas Ballas and Fr{\'e}d{\'e}ric Bastien and Justin Bayer and Anatoly Belikov and Alexander Belopolsky and Yoshua Bengio and Arnaud Bergeron and James Bergstra and Valentin Bisson and Josh Bleecher Snyder and Nicolas Bouchard and Nicolas Boulanger-Lewandowski and Xavier Bouthillier and Alexandre de Br{\'e}bisson and Olivier Breuleux and Pierre Luc Carrier and Kyunghyun Cho and Jan Chorowski and Paul Francis Christiano and Tim Cooijmans and Marc-Alexandre C{\^o}t{\'e} and Myriam C{\^o}t{\'e} and Aaron C. Courville and Yann Dauphin and Olivier Delalleau and Julien Demouth and Guillaume Desjardins and Sander Dieleman and Laurent Dinh and M{\'e}lanie Ducoffe and Vincent Dumoulin and Samira Ebrahimi Kahou and D. Erhan and Ziye Fan and Orhan Firat and Mathieu Germain and Xavier Glorot and Ian J. Goodfellow and M. Graham and Çaglar G{\"u}lçehre and Philippe Hamel and Iban Harlouchet and Jean-Philippe Heng and Bal{\'a}zs Hidasi and Sina Honari and Arjun Jain and S{\'e}bastien Jean and Kai Jia and Mikhail Korobov and Vivek Kulkarni and Alex Lamb and Pascal Lamblin and Eric Larsen and C{\'e}sar Laurent and Sea Sun Lee and Simon Lefrançois and Simon Lemieux and Nicholas L{\'e}onard and Zhouhan Lin and Jesse A. Livezey and Cory Lorenz and Jeremiah Lowin and Qianli Ma and Pierre-Antoine Manzagol and Olivier Mastropietro and Robert T. McGibbon and Roland Memisevic and Bart van Merrienboer and Vincent Michalski and Mehdi Mirza and Alberto Orlandi and Christopher Joseph Pal and Razvan Pascanu and Mohammad Pezeshki and Colin Raffel and Daniel Renshaw and Matthew Rocklin and Adriana Romero and Markus Roth and Peter Sadowski and John Salvatier and François Savard and Jan Schl{\"u}ter and John Schulman and Gabriel Schwartz and Iulian Serban and Dmitriy Serdyuk and Samira Shabanian and {\'E}tienne Simon and Sigurd Spieckermann and S. Ramana Subramanyam and Jakub Sygnowski and J{\'e}r{\'e}mie Tanguay and Gijs van Tulder and Joseph P. Turian and Sebastian Urban and Pascal Vincent and Francesco Visin and Harm de Vries and David Warde-Farley and Dustin J. Webb and Matthew Willson and Kelvin Xu and Lijun Xue and Li Yao and Saizheng Zhang and Ying Zhang},
  journal={ArXiv},
  year={2016},
  volume={abs/1605.02688}
}
Theano is a Python library that allows users to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers, especially in the machine learning community, and has shown steady performance improvements. Theano has been under active, continuous development since 2008; multiple frameworks have been built on top of it, and it has been used to produce many state-of-the-art machine learning models.
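As a concrete illustration of the workflow the abstract describes (define a symbolic expression, let Theano optimize the graph, then compile and evaluate it), here is a minimal usage sketch; the expression itself is an arbitrary assumed example:

```python
# Minimal Theano workflow: declare symbolic variables, build an expression
# graph, compile it (triggering graph optimization), then evaluate on data.
import numpy as np
import theano
import theano.tensor as T

x = T.dmatrix('x')                 # symbolic double-precision matrix
y = T.dmatrix('y')
z = T.dot(x, y) + T.exp(x)         # symbolic expression graph

f = theano.function([x, y], z)     # compiled, graph-optimized callable
print(f(np.eye(2), np.ones((2, 2))))
```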

Citations

DIANA Fellowship Proposal
TLDR
Matthew Feickert will investigate the pros and cons of implementing the statistical models used in particle physics with a different computational graph framework, and determine how those frameworks would scale in terms of data and model parallelism.
fastai: A Layered API for Deep Learning
TLDR
The authors used this library to create a complete deep learning course, which they were able to write more quickly than with previous approaches, and the resulting code was clearer.
The 800 Pound Python in the Machine Learning Room
TLDR
Demonstrates that these shortcomings can be overcome by a relatively simple source-to-source transformation that extends operator overloading techniques to language built-ins, including control-flow operators and function definitions.
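To make the shortcoming concrete, a small sketch (assumed, not from the paper) of why plain operator overloading cannot stage Python's built-in control flow, which is what motivates a source-to-source transformation:

```python
# Operator overloading intercepts arithmetic on a symbolic value, but
# Python's built-in 'if' forces an immediate True/False decision via
# __bool__, so the branch cannot be recorded in a staged graph.
class Sym:
    def __init__(self, expr):
        self.expr = expr
    def __add__(self, other):
        return Sym(f"({self.expr} + {other.expr})")   # '+' is captured
    def __gt__(self, other):
        return Sym(f"({self.expr} > {other.expr})")   # comparison is captured
    def __bool__(self):
        raise TypeError("control flow on symbolic values is not staged")

x, y = Sym("x"), Sym("y")
print((x + y).expr)        # (x + y) -- recorded symbolically
try:
    if x > y:              # built-in 'if' escapes overloading
        pass
except TypeError as e:
    print(e)
```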
A Simple and Efficient Tensor Calculus for Machine Learning
TLDR
This paper shows that Ricci notation is not necessary for an efficient tensor calculus and develops an equally efficient method for the simpler Einstein notation, which in turn enables further improvements that lead to even better efficiency.
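For readers unfamiliar with Einstein notation, a brief NumPy illustration (an assumed pedagogical example, not code from the paper): repeated indices are summed, so tensor contractions become single einsum calls:

```python
import numpy as np

A = np.arange(6.0).reshape(2, 3)
B = np.arange(12.0).reshape(3, 4)
C = np.einsum('ij,jk->ik', A, B)        # repeated j is summed: same as A @ B
assert np.allclose(C, A @ B)

x = np.ones((5, 3))                     # batch of vectors
M = np.eye(3)
q = np.einsum('bi,ij,bj->b', x, M, x)   # per-example bilinear form x^T M x
assert np.allclose(q, (x @ M * x).sum(axis=1))
```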
TensorFlow: A system for large-scale machine learning
TLDR
The TensorFlow dataflow model is described, and the compelling performance that TensorFlow achieves for several real-world applications is demonstrated.
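A hedged sketch of the dataflow idea in current TensorFlow (the paper predates the 2.x API, so this is a modern equivalent rather than the paper's own interface): tf.function traces Python code into a graph that the runtime can optimize and place across devices:

```python
import tensorflow as tf

@tf.function            # traces the Python function into a dataflow graph
def f(x, y):
    return tf.matmul(x, y) + tf.exp(x)   # nodes in the traced graph

x = tf.eye(2)
y = tf.ones((2, 2))
print(f(x, y))          # first call traces and compiles, then executes
```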
Benchmarking State-of-the-Art Deep Learning Software Tools
TLDR
This paper presents an attempt to benchmark several state-of-the-art GPU-accelerated deep learning software tools, including Caffe, CNTK, TensorFlow, and Torch, and focuses on evaluating the running time performance of these tools with three popular types of neural networks on two representative CPU platforms and three representative GPU platforms.
Accelerating Deep Learning Frameworks with Micro-Batches
cuDNN is a low-level library that provides GPU kernels frequently used in deep learning. In particular, cuDNN implements several equivalent convolution algorithms, whose performance and memory footprint vary with the layer dimensions.
The Differentiable Curry
TLDR
The challenge is to produce statically typed, compile-time, reverse-mode AD, a scenario exemplified by Swift AD; it has to be tackled head-on to avoid additional complications in a compiler, such as extra inlining, loop unrolling, or early defunctionalization, and to allow for separate compilation.
Fast geometric learning with symbolic matrices
TLDR
This paper presents an extension for standard machine learning frameworks that provides comprehensive support for this abstraction on CPUs and GPUs, and performs an extensive evaluation on a broad class of problems: Gaussian modelling, K-nearest-neighbors search, geometric deep learning, non-Euclidean embeddings, and optimal transport theory.
PyTorch: An Imperative Style, High-Performance Deep Learning Library
TLDR
This paper details the principles that drove the implementation of PyTorch and how they are reflected in its architecture, and explains how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance.
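A minimal sketch of the imperative style the paper describes: operations execute eagerly while autograd records a tape for the backward pass (the toy computation is assumed):

```python
import torch

x = torch.randn(3, requires_grad=True)
y = (x ** 2).sum()        # runs immediately; no separate graph-compile step
y.backward()              # reverse-mode AD over the recorded tape
assert torch.allclose(x.grad, 2 * x)
```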

References

Showing 1-10 of 42 references
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
TLDR
Describes the TensorFlow interface and an implementation of that interface built at Google, which has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields.
Theano: Deep Learning on GPUs with Python
TLDR
This paper presents Theano, a framework in the Python programming language for defining, optimizing and evaluating expressions involving high-level operations on tensors, and adds automatic symbolic differentiation, GPU support, and faster expression evaluation.
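The automatic symbolic differentiation mentioned here is exposed through T.grad; a minimal sketch (the function being differentiated is an arbitrary assumed example):

```python
import theano
import theano.tensor as T

x = T.dscalar('x')
y = x ** 3
dy = T.grad(y, x)                 # builds the derivative graph: 3*x**2
f = theano.function([x], dy)      # compilation optimizes the graph
assert abs(f(2.0) - 12.0) < 1e-9
```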
cuDNN: Efficient Primitives for Deep Learning
TLDR
Presents a library similar in intent to BLAS, with optimized routines for deep learning workloads; it currently contains routines for GPUs but, like BLAS, could be implemented for other platforms.
MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems
TLDR
The API design and the system implementation of MXNet are described, and it is explained how embedding of both symbolic expression and tensor operation is handled in a unified fashion.
Blocks and Fuel: Frameworks for deep learning
TLDR
This work introduces two Python frameworks to train neural networks on large datasets, Blocks and Fuel; Fuel provides a standard format for machine learning datasets.
Caffe: Convolutional Architecture for Fast Feature Embedding
TLDR
Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.
Training Deep Nets with Sublinear Memory Cost
TLDR
This work designs an algorithm that costs O(√n) memory to train an n-layer network, with only the computational cost of an extra forward pass per mini-batch, showing that it is possible to trade computation for memory and obtain a more memory-efficient training algorithm at a small extra computational cost.
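A NumPy sketch of the checkpointing idea (assumed toy code, not the paper's implementation): keep activations only every k layers and recompute each segment during the backward sweep, so activation memory drops from O(n) to roughly O(n/k + k), minimized near k ≈ √n:

```python
import numpy as np

def segment_forward(h, Ws):
    # Run a chain of tanh layers, keeping every intermediate activation.
    acts = [h]
    for W in Ws:
        acts.append(np.tanh(acts[-1] @ W))
    return acts

def segment_backward(acts, Ws, grad):
    # Standard backprop through one segment, given its stored activations.
    dWs = []
    for h_in, h_out, W in zip(acts[-2::-1], acts[:0:-1], Ws[::-1]):
        pre = grad * (1.0 - h_out ** 2)      # tanh'(z) = 1 - tanh(z)^2
        dWs.append(h_in.T @ pre)
        grad = pre @ W.T
    return grad, dWs[::-1]

def checkpointed_backward(x, Ws, k, grad_out):
    # Assumes len(Ws) is a multiple of k, for brevity.
    n = len(Ws)
    ckpts = [x]                              # store only every k-th activation
    for s in range(0, n, k):
        ckpts.append(segment_forward(ckpts[-1], Ws[s:s + k])[-1])
    grad, dWs = grad_out, [None] * n
    for s in range(n - k, -1, -k):           # segments, last to first
        acts = segment_forward(ckpts[s // k], Ws[s:s + k])   # recompute
        grad, dWs[s:s + k] = segment_backward(acts, Ws[s:s + k], grad)
    return grad, dWs

# Sanity check against plain (store-everything) backprop.
rng = np.random.default_rng(0)
Ws = [0.5 * rng.standard_normal((4, 4)) for _ in range(6)]
x = rng.standard_normal((2, 4))
g0 = np.ones((2, 4))
g_ref, dWs_ref = segment_backward(segment_forward(x, Ws), Ws, g0)
g_ck, dWs_ck = checkpointed_backward(x, Ws, 2, g0)
assert np.allclose(g_ref, g_ck)
assert all(np.allclose(a, b) for a, b in zip(dWs_ref, dWs_ck))
```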
Torch7: A Matlab-like Environment for Machine Learning
TLDR
Torch7 is a versatile numeric computing framework and machine learning library that extends Lua; it can easily be interfaced to third-party software thanks to Lua's light interface.
Fast Exact Multiplication by the Hessian
TLDR
This work derives a technique that directly calculates Hv, where v is an arbitrary vector, and shows that this technique can be used at the heart of many iterative techniques for computing various properties of H, obviating any need to calculate the full Hessian.
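One common way to realize this without forming H is to take the gradient of g·v, which equals Hv for a symmetric Hessian; note this differs from the paper's forward-over-reverse R-operator but computes the same quantity. A Theano sketch with an assumed toy cost:

```python
import numpy as np
import theano
import theano.tensor as T

x = T.dvector('x')
v = T.dvector('v')
cost = T.sum(x ** 2 * T.exp(x))     # an arbitrary scalar function of x

g = T.grad(cost, x)                 # gradient via reverse-mode AD
Hv = T.grad(T.sum(g * v), x)        # grad of g.v gives H v exactly

f_Hv = theano.function([x, v], Hv)
print(f_Hv(np.array([1.0, 2.0]), np.array([0.0, 1.0])))
```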
Deep learning with Elastic Averaging SGD
TLDR
Experiments demonstrate that the new algorithm accelerates the training of deep architectures compared to DOWNPOUR and other common baseline approaches, and is furthermore very communication-efficient.
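A synchronous toy version of the elastic averaging update (the paper's asynchronous scheduling and exact hyperparameters are omitted; all constants here are assumed):

```python
import numpy as np

def easgd_step(workers, center, grads, eta=0.1, rho=0.5):
    # Each worker takes a gradient step and is pulled toward the center
    # variable by an elastic force; the center drifts toward the workers.
    alpha = eta * rho
    new_workers = [x - eta * g - alpha * (x - center)
                   for x, g in zip(workers, grads)]
    new_center = center + alpha * sum(x - center for x in workers)
    return new_workers, new_center

# Toy usage: all workers minimize f(x) = 0.5 * ||x - 1||^2.
workers = [np.zeros(3) for _ in range(4)]
center = np.zeros(3)
for _ in range(200):
    grads = [x - 1.0 for x in workers]
    workers, center = easgd_step(workers, center, grads)
print(center)      # approaches [1, 1, 1]
```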