# On Symplectic Optimization

@article{Betancourt2018OnSO, title={On Symplectic Optimization}, author={Michael Betancourt and Michael I. Jordan and Ashia C. Wilson}, journal={arXiv: Computation}, year={2018} }

Accelerated gradient methods have had significant impact in machine learning -- in particular the theoretical side of machine learning -- due to their ability to achieve oracle lower bounds. But their heuristic construction has hindered their full integration into the practical machine-learning algorithmic toolbox, and has limited their scope. In this paper we build on recent work which casts acceleration as a phenomenon best explained in continuous time, and we augment that picture by…
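To make the continuous-time picture concrete, the following is a minimal, hypothetical sketch (not the paper's algorithm) of a conformal-symplectic leapfrog step for the damped Hamiltonian dynamics dx/dt = p, dp/dt = -∇f(x) - γp, applied to the simple test objective f(x) = ½‖x‖²; the step size `h`, friction `gamma`, and test function are illustrative assumptions.

```python
import numpy as np

def grad_f(x):
    # Gradient of the illustrative objective f(x) = 0.5 * ||x||^2.
    return x

def conformal_leapfrog(x, p, h=0.1, gamma=1.0, steps=200):
    """One possible splitting: exact momentum contraction (dissipation)
    wrapped around a standard symplectic leapfrog (kick-drift-kick)."""
    for _ in range(steps):
        p = np.exp(-gamma * h / 2) * p   # contract momentum (dissipation)
        p = p - (h / 2) * grad_f(x)      # half kick
        x = x + h * p                    # drift
        p = p - (h / 2) * grad_f(x)      # half kick
        p = np.exp(-gamma * h / 2) * p   # contract momentum again
    return x, p

x, p = conformal_leapfrog(np.array([3.0, -2.0]), np.zeros(2))
print(np.linalg.norm(x))  # iterate converges toward the minimizer at the origin
```

The splitting treats the conservative part with an ordinary leapfrog step and applies the dissipation exactly, which is the basic structure-preserving idea behind this line of work.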

## 72 Citations

### Optimization algorithms inspired by the geometry of dissipative systems

- Computer Science
- 2019

Dynamical systems defined through a contact geometry are introduced which are not only naturally suited to the optimization goal but also subsume all previous methods based on geometric dynamical systems, which shows that optimization algorithms that achieve oracle lower bounds on convergence rates can be obtained.

### On dissipative symplectic integration with applications to gradient-based optimization

- Computer Science, Mathematics
- 2020

A generalization of symplectic integrators to non-conservative and in particular dissipative Hamiltonian systems is able to preserve rates of convergence up to a controlled error, enabling the derivation of ‘rate-matching’ algorithms without the need for a discrete convergence analysis.

### Variational Symplectic Accelerated Optimization on Lie Groups

- Computer Science
- 2021 60th IEEE Conference on Decision and Control (CDC)
- 2021

A Lie group variational discretization based on an extended path space formulation of the Bregman Lagrangian on Lie groups is developed, and its computational properties are analyzed with two examples in attitude determination and vision-based localization.

### Conformal symplectic and relativistic optimization

- Physics, Computer Science
- NeurIPS
- 2020

This work proposes a new algorithm based on a dissipative relativistic system that normalizes the momentum and may result in more stable/faster optimization, and generalizes both Nesterov and heavy ball.

### Continuous Time Analysis of Momentum Methods

- Computer Science
- J. Mach. Learn. Res.
- 2021

This work focuses on understanding the role of momentum in the training of neural networks, concentrating on the common situation in which the momentum contribution is fixed at each step of the algorithm, and proves three continuous-time approximations of the discrete algorithms.

### DYNAMICAL, SYMPLECTIC AND STOCHASTIC PERSPECTIVES ON GRADIENT-BASED OPTIMIZATION

- Computer Science
- Proceedings of the International Congress of Mathematicians (ICM 2018)
- 2019

This work goes beyond classical gradient flow to focus on second-order dynamics, aiming to show the relevance of such dynamics to optimization algorithms that not only converge, but converge quickly.

### Optimization on manifolds: A symplectic approach

- Mathematics
- 2021

There has been great interest in using tools from dynamical systems and numerical analysis of differential equations to understand and construct new optimization methods. In particular, recently a…

### Practical Perspectives on Symplectic Accelerated Optimization

- Computer Science
- 2022

This paper investigates how momentum restarting schemes improve computational efficiency and robustness by reducing the undesirable effect of oscillations, and ease the tuning process by making time-adaptivity superfluous.

### The Role of Memory in Stochastic Optimization

- Computer Science
- UAI
- 2019

This work derives a general continuous-time model that can incorporate arbitrary types of memory, for both deterministic and stochastic settings, and provides convergence guarantees for this SDE for weakly-quasi-convex and quadratically growing functions.

## References

Showing 1-10 of 13 references

### A variational perspective on accelerated methods in optimization

- Computer Science
- Proceedings of the National Academy of Sciences
- 2016

A variational, continuous-time framework for understanding accelerated methods is proposed and a systematic methodology for converting accelerated higher-order methods from continuous time to discrete time is provided, which illuminates a class of dynamics that may be useful for designing better algorithms for optimization.
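For context, the central object of that framework is the Bregman Lagrangian, which (as given in the cited PNAS paper, with $D_h$ the Bregman divergence of a distance-generating function $h$ and $\alpha_t, \beta_t, \gamma_t$ smooth scaling functions) takes the form:

```latex
\mathcal{L}(X, V, t) = e^{\alpha_t + \gamma_t}\left( D_h\!\left(X + e^{-\alpha_t} V,\; X\right) - e^{\beta_t} f(X) \right)
```

Different choices of the scaling functions recover different accelerated dynamics in continuous time.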

### Geometric Numerical Integration: Structure Preserving Algorithms for Ordinary Differential Equations

- Physics
- 2004

### Accelerated Mirror Descent in Continuous and Discrete Time

- Computer Science
- NIPS
- 2015

It is shown that a large family of first-order accelerated methods can be obtained as a discretization of the ODE, and these methods converge at a O(1/k2) rate.
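One familiar member of this family of discretizations is Nesterov's accelerated gradient method. The sketch below is an illustrative implementation on an assumed quadratic test objective (the step size and iteration count are arbitrary choices, not taken from the paper):

```python
import numpy as np

def f(x):
    # Illustrative objective f(x) = 0.5 * ||x||^2, minimized at the origin.
    return 0.5 * float(x @ x)

def grad_f(x):
    return x

def nesterov(x0, step=0.1, iters=100):
    """Nesterov's accelerated gradient with the standard (k-1)/(k+2) weights,
    which attains the O(1/k^2) rate on smooth convex objectives."""
    x, y = x0.copy(), x0.copy()
    for k in range(1, iters + 1):
        x_next = y - step * grad_f(y)                   # gradient step at look-ahead point
        y = x_next + (k - 1) / (k + 2) * (x_next - x)   # momentum extrapolation
        x = x_next
    return x

x = nesterov(np.array([5.0, -3.0]))
print(f(x))  # objective value after 100 iterations, close to the minimum 0
```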

### On the Nonlinear Stability of Symplectic Integrators

- Mathematics, Physics
- 2004

The modified Hamiltonian is used to study the nonlinear stability of symplectic integrators, especially for nonlinear oscillators. We give conditions under which an initial condition on a compact…

### A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights

- Computer Science
- J. Mach. Learn. Res.
- 2016

A second-order ordinary differential equation is derived, which is the limit of Nesterov's accelerated gradient method, and it is shown that the continuous-time ODE allows for a better understanding of Nesterov's scheme.
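The limiting ODE derived in that paper is:

```latex
\ddot{X}(t) + \frac{3}{t}\,\dot{X}(t) + \nabla f\big(X(t)\big) = 0
```

The vanishing damping coefficient $3/t$ is the continuous-time counterpart of Nesterov's momentum weights.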

### The Fundamental Incompatibility of Scalable Hamiltonian Monte Carlo and Naive Data Subsampling

- Physics
- ICML
- 2015

It is demonstrated how data subsampling fundamentally compromises the scalability of Hamiltonian Monte Carlo.

### Simulating Hamiltonian dynamics

- Physics
- Math. Comput.
- 2006

### Introduction to Smooth Manifolds

- Mathematics
- 2002

Preface.- 1 Smooth Manifolds.- 2 Smooth Maps.- 3 Tangent Vectors.- 4 Submersions, Immersions, and Embeddings.- 5 Submanifolds.- 6 Sard's Theorem.- 7 Lie Groups.- 8 Vector Fields.- 9 Integral Curves…

### Classical Dynamics: A Contemporary Approach

- Physics
- 1998

1. Fundamentals of mechanics 2. Lagrangian formulation of mechanics 3. Topics in Lagrangian dynamics 4. Scattering and linear oscillations 5. Hamiltonian formulation of mechanics 6. Topics in…

### Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent

- Computer Science
- COLT
- 2018

To the best of our knowledge, this is the first Hessian-free algorithm to find a second-order stationary point faster than GD, and also the first single-loop algorithm with a faster rate than GD even in the setting of finding a first-order stationary point.