
- Atilim Gunes Baydin, Barak A. Pearlmutter, Alexey Radul
- ArXiv
- 2015

Derivatives, mostly in the form of gradients and Hessians, are ubiquitous in machine learning. Automatic differentiation (AD) is a technique for calculating derivatives of numeric functions expressed as computer programs efficiently and accurately, used in fields such as computational fluid dynamics, nuclear engineering, and atmospheric sciences. Despite…
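The forward mode of the AD technique this abstract describes can be sketched with dual numbers, where each value carries its derivative alongside it. The following is a minimal illustrative Python implementation, not code from the paper; the `Dual` class and the function `f` are this sketch's own names:

```python
import math

class Dual:
    """Dual number val + dot*eps (eps**2 == 0); `dot` carries the derivative."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, other):
        return Dual(self.val + other.val, self.dot + other.dot)

    def __mul__(self, other):
        # Product rule, applied mechanically at the elementary-operator level.
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)

def sin(x):
    # Chain rule for an elementary function: d/dx sin(x) = cos(x).
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

def f(x):
    return x * x + sin(x)

# Seeding dot = 1.0 propagates the exact derivative through every operation:
# d.val == f(1.5) and d.dot == f'(1.5) = 2*1.5 + cos(1.5), accurate to
# machine precision, with no finite-difference truncation error.
d = f(Dual(1.5, 1.0))
```

Each arithmetic operation does a constant amount of extra work on the `dot` component, which is the "small constant factor of overhead" AD is known for.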

- Atilim Gunes Baydin, Barak A. Pearlmutter
- ArXiv
- 2014

Automatic differentiation—the mechanical transformation of numeric computer programs to calculate derivatives efficiently and accurately—dates to the origin of the computer age. Reverse mode automatic differentiation both antedates and generalizes the method of backwards propagation of errors used in machine learning. Despite this, practitioners in a…
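Reverse-mode AD, the generalization of backpropagation that this abstract refers to, can be sketched with a tiny computation graph of scalar operations and a single reverse sweep. This is an illustrative sketch, not the paper's code; `Var` and `backward` are names invented here:

```python
class Var:
    """Scalar graph node; `parents` holds (input, local_derivative) pairs."""
    def __init__(self, val, parents=()):
        self.val, self.parents, self.grad = val, list(parents), 0.0

    def __add__(self, other):
        return Var(self.val + other.val, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return Var(self.val * other.val, [(self, other.val), (other, self.val)])

def backward(out):
    """Propagate adjoints from the output back to every input (chain rule)."""
    order, seen = [], set()
    def topo(node):                      # visit inputs before outputs
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                topo(parent)
            order.append(node)
    topo(out)
    out.grad = 1.0
    for node in reversed(order):         # reverse sweep over the tape
        for parent, local in node.parents:
            parent.grad += node.grad * local

x, y = Var(2.0), Var(3.0)
z = x * y + x          # z = x*y + x
backward(z)            # x.grad == y + 1 == 4.0, y.grad == x == 2.0
```

One reverse sweep yields the gradient with respect to all inputs at once, which is why reverse mode (and hence backpropagation) is the natural choice when a scalar loss depends on many parameters.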

- Tuan Anh Le, Atilim Gunes Baydin, Frank Wood
- AISTATS
- 2017

We introduce a method for using deep neural networks to amortize the cost of inference in models from the family induced by universal probabilistic programming languages, establishing a framework that combines the strengths of probabilistic programming and deep learning methods. We call what we do “compilation of inference” because our method transforms a…
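As a loose illustration of the amortization idea only (not the paper's neural architecture or probabilistic-programming machinery), one can fit a cheap regressor on samples drawn from a generative model, so that inference at test time becomes a single forward evaluation. Here the "network" is a one-parameter linear map fit by least squares, and every name is this sketch's own:

```python
import random

random.seed(0)

# Generative model: mu ~ N(0, 1), then x ~ N(mu, 1).
# The exact posterior mean of mu given x is x / 2, so the fitted
# weight w below should approach 0.5.
pairs = []
for _ in range(20000):
    mu = random.gauss(0.0, 1.0)
    x = random.gauss(mu, 1.0)
    pairs.append((mu, x))

# "Compile" inference: least-squares fit of mu ≈ w * x on simulated data.
w = sum(mu * x for mu, x in pairs) / sum(x * x for _, x in pairs)

def amortized_posterior_mean(x):
    # At test time, inference is one cheap evaluation instead of a
    # per-observation optimization or sampling run.
    return w * x
```

The upfront training cost is paid once on simulated data; afterwards every new observation is handled by the same fitted map, which is the sense in which inference is "amortized".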

In this paper we introduce DiffSharp, an automatic differentiation (AD) library designed with machine learning in mind. AD is a family of techniques that evaluate derivatives at machine precision with only a small constant factor of overhead, by systematically applying the chain rule of calculus at the elementary operator level. DiffSharp aims to make an…

We introduce a general method for improving the convergence rate of gradient-based optimizers that is easy to implement and works well in practice. We analyze the effectiveness of the method by applying it to stochastic gradient descent, stochastic gradient descent with Nesterov momentum, and Adam, showing that it improves upon these commonly used…
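The method in this abstract adapts the learning rate online using the dot product of consecutive gradients (a "hypergradient"): when successive gradients align, the step size grows; when they oppose, it shrinks. A minimal sketch on a one-dimensional quadratic, with illustrative constants chosen here rather than taken from the paper:

```python
def grad(theta):
    # Gradient of the toy objective f(theta) = (theta - 3)^2.
    return 2.0 * (theta - 3.0)

theta, alpha, beta = 0.0, 0.01, 0.001   # beta: step size for alpha itself
prev_g = 0.0
for _ in range(200):
    g = grad(theta)
    # Hypergradient update: grow alpha while consecutive gradients agree,
    # shrink it when they disagree (a sign of overshooting).
    alpha += beta * g * prev_g
    theta -= alpha * g
    prev_g = g

# theta converges to the minimizer 3.0 while alpha adapts upward from its
# deliberately too-small initial value.
```

The appeal is that the base optimizer is unchanged; the same one-line adaptation of `alpha` can wrap plain SGD, SGD with momentum, or Adam.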

