Corpus ID: 56219261

On Symplectic Optimization

Michael Betancourt, Michael I. Jordan, Ashia C. Wilson. "On Symplectic Optimization." arXiv: Computation.
Accelerated gradient methods have had significant impact in machine learning -- in particular the theoretical side of machine learning -- due to their ability to achieve oracle lower bounds. But their heuristic construction has hindered their full integration into the practical machine-learning algorithmic toolbox, and has limited their scope. In this paper we build on recent work which casts acceleration as a phenomenon best explained in continuous time, and we augment that picture by… 
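The continuous-time picture the abstract refers to is usually discretized with a structure-preserving (symplectic) integrator rather than a naive Euler scheme. As a minimal sketch, assuming a damped Hamiltonian system for the illustrative objective f(x) = x²/2 with placeholder step and friction constants (none of these choices come from the paper), a conformal "leapfrog with exact damping" update looks like:

```python
import math

def grad_f(x):
    # gradient of the illustrative objective f(x) = 0.5 * x**2
    return x

def conformal_leapfrog(x0, p0, step=0.1, friction=1.0, n_steps=500):
    """Splitting scheme: damp the momentum exactly, then apply a
    symplectic-Euler update of (x, p) for H(x, p) = p**2 / 2 + f(x)."""
    x, p = x0, p0
    damp = math.exp(-friction * step)  # exact flow of the friction term
    for _ in range(n_steps):
        p = damp * p - step * grad_f(x)  # damped momentum kick
        x = x + step * p                 # position drift
    return x, p

x_final, p_final = conformal_leapfrog(x0=5.0, p0=0.0)
# both coordinates spiral in toward the minimizer at the origin
```

The point of the splitting is that the conservative part of the update stays symplectic, so discretization error does not corrupt the convergence rate; this is the property the dissipative-integration papers listed below make precise.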

Figures from this paper

Optimization algorithms inspired by the geometry of dissipative systems
Dynamical systems defined through contact geometry are introduced that are not only naturally suited to the optimization goal but also subsume all previous methods based on geometric dynamical systems, showing that optimization algorithms achieving oracle lower bounds on convergence rates can be obtained.
On dissipative symplectic integration with applications to gradient-based optimization
A generalization of symplectic integrators to non-conservative and in particular dissipative Hamiltonian systems is able to preserve rates of convergence up to a controlled error, enabling the derivation of ‘rate-matching’ algorithms without the need for a discrete convergence analysis.
Variational Symplectic Accelerated Optimization on Lie Groups
A Lie group variational discretization based on an extended path space formulation of the Bregman Lagrangian on Lie groups is developed, and its computational properties are analyzed with two examples in attitude determination and vision-based localization.
A Discrete Variational Derivation of Accelerated Methods in Optimization
This paper introduces variational integrators which allow for the derivation of two families of optimization methods, in one-to-one correspondence, that generalize Polyak's heavy ball and the well-known Nesterov accelerated gradient method, mimicking the behavior of the latter, which reduces the oscillations of typical momentum methods.
Conformal symplectic and relativistic optimization
This work proposes a new algorithm based on a dissipative relativistic system that normalizes the momentum and may result in more stable/faster optimization, and generalizes both Nesterov and heavy ball.
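The normalized-momentum idea can be sketched in a few lines; the quadratic objective, step size, and "speed of light" constant below are illustrative assumptions, not the exact algorithm from the paper:

```python
import math

def relativistic_step(x, p, grad, step=0.1, friction=1.0, c=1.0):
    """One damped step in which the relativistic kinetic term
    sqrt(p**2 + c**2) caps the effective velocity at c."""
    p = math.exp(-friction * step) * p - step * grad(x)
    x = x + step * p / math.sqrt(p * p + c * c)  # normalized momentum
    return x, p

# minimize f(x) = 0.5 * x**2; a huge gradient cannot blow up the velocity,
# which is the claimed source of extra stability
x, p = 1.0, 0.0
for _ in range(1000):
    x, p = relativistic_step(x, p, grad=lambda z: z)
```

In the non-relativistic limit (momenta small relative to c) the update reduces to the usual damped momentum step, which is how the method generalizes heavy ball.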
Continuous Time Analysis of Momentum Methods
This work focuses on understanding the role of momentum in the training of neural networks, concentrating on the common situation in which the momentum contribution is fixed at each step of the algorithm, and proves three continuous-time approximations of the discrete algorithms.
Optimization on manifolds: A symplectic approach
There has been great interest in using tools from dynamical systems and numerical analysis of differential equations to understand and construct new optimization methods.
Practical Perspectives on Symplectic Accelerated Optimization
This paper investigates how momentum restarting schemes ameliorate computational efficiency and robustness by reducing the undesirable effect of oscillations, and ease the tuning process by making time-adaptivity superfluous.
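A gradient-based restart of this kind can be sketched in a few lines (in the spirit of O'Donoghue and Candès; the test function, step size, and iteration count below are illustrative choices):

```python
def nesterov_restart(grad, x0, step, n_steps):
    """Nesterov iteration whose momentum schedule restarts whenever the
    momentum direction opposes the last gradient step."""
    x_prev = x = x0
    k = 1
    for _ in range(n_steps):
        beta = (k - 1) / (k + 2)
        y = x + beta * (x - x_prev)
        g = grad(y)
        x_prev, x = x, y - step * g
        if g * (x - x_prev) > 0:  # momentum is fighting the descent direction
            k = 1                 # reset the schedule, damping the oscillation
        else:
            k += 1
    return x

x_star = nesterov_restart(lambda z: z, x0=5.0, step=0.1, n_steps=500)
# converges to the minimizer of f(x) = 0.5 * x**2 at 0
```

Restarting replaces hand-tuned friction schedules with an adaptive test, which is the sense in which it can make time-adaptivity superfluous.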
The Role of Memory in Stochastic Optimization
This work derives a general continuous-time model that can incorporate arbitrary types of memory, for both deterministic and stochastic settings, and provides convergence guarantees for this SDE for weakly-quasi-convex and quadratically growing functions.


A variational perspective on accelerated methods in optimization
A variational, continuous-time framework for understanding accelerated methods is proposed and a systematic methodology for converting accelerated higher-order methods from continuous time to discrete time is provided, which illuminates a class of dynamics that may be useful for designing better algorithms for optimization.
Accelerated Mirror Descent in Continuous and Discrete Time
It is shown that a large family of first-order accelerated methods can be obtained as a discretization of the ODE, and these methods converge at a O(1/k2) rate.
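The O(1/k²) rate is easy to check numerically against the classical bound f(x_k) − f* ≤ 2L‖x₀ − x*‖²/(k+1)²; the ill-conditioned quadratic below is an illustrative test problem, not one taken from the paper:

```python
def nesterov(grad, x0, step, n_steps):
    """Standard Nesterov accelerated gradient with momentum (k-1)/(k+2)."""
    x_prev, x = list(x0), list(x0)
    for k in range(1, n_steps + 1):
        beta = (k - 1) / (k + 2)
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        g = grad(y)
        x_prev, x = x, [yi - step * gi for yi, gi in zip(y, g)]
    return x

# f(x) = 0.5 * (100 * x1**2 + x2**2) is convex and L-smooth with L = 100
L = 100.0
f = lambda v: 0.5 * (100.0 * v[0] ** 2 + v[1] ** 2)
grad = lambda v: [100.0 * v[0], v[1]]

k = 100
xk = nesterov(grad, [1.0, 1.0], step=1.0 / L, n_steps=k)
bound = 2.0 * L * 2.0 / (k + 1) ** 2  # 2 L ||x0 - x*||^2 / (k+1)^2, f* = 0
# f(xk) lands well under the theoretical bound
```

The same iteration run with beta = 0 is plain gradient descent, whose O(1/k) rate makes the gap to the accelerated method visible even on this small example.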
Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent
To the best of our knowledge, this is the first Hessian-free algorithm to find a second-order stationary point faster than GD, and also the first single-loop algorithm with a faster rate than GD even in the setting of finding a first-order stationary point.
On the Nonlinear Stability of Symplectic Integrators
The modified Hamiltonian is used to study the nonlinear stability of symplectic integrators, especially for nonlinear oscillators. We give conditions under which an initial condition on a compact
The Fundamental Incompatibility of Scalable Hamiltonian Monte Carlo and Naive Data Subsampling
It is demonstrated how data subsampling fundamentally compromises the scalability of Hamiltonian Monte Carlo.
Introduction to Smooth Manifolds
Preface; 1. Smooth Manifolds; 2. Smooth Maps; 3. Tangent Vectors; 4. Submersions, Immersions, and Embeddings; 5. Submanifolds; 6. Sard's Theorem; 7. Lie Groups; 8. Vector Fields; 9. Integral Curves
Classical Dynamics: A Contemporary Approach
1. Fundamentals of mechanics; 2. Lagrangian formulation of mechanics; 3. Topics in Lagrangian dynamics; 4. Scattering and linear oscillations; 5. Hamiltonian formulation of mechanics; 6. Topics in
method: theory and insights. Journal of Machine Learning Research, 2016.