Reconciling modern machine-learning practice and the classical bias–variance trade-off
- Mikhail Belkin, Daniel J. Hsu, Siyuan Ma, Soumik Mandal
- Computer Science · Proceedings of the National Academy of Sciences
- 24 July 2019
This work shows how classical theory and modern practice can be reconciled within a single unified performance curve, proposes a mechanism underlying its emergence, and provides evidence for the existence and ubiquity of double descent across a wide spectrum of models and datasets.
The Power of Interpolation: Understanding the Effectiveness of SGD in Modern Over-parametrized Learning
The key observation is that most modern learning architectures are over-parametrized and are trained to interpolate the data by driving the empirical loss close to zero; it remains unclear why these interpolating solutions perform well on test data.
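The interpolation regime described above can be illustrated with a minimal sketch (all parameters here are illustrative, not from the paper): in an over-parametrized linear model, single-sample SGD with a constant step size drives the empirical loss essentially to zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# Over-parametrized linear regression: more parameters (d) than samples (n),
# so an exact interpolating solution exists almost surely.
n, d = 30, 100
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

w = np.zeros(d)
lr = 1.0 / np.max(np.sum(X**2, axis=1))  # constant step, stable for single-sample SGD
for _ in range(20000):
    i = rng.integers(n)                  # pick one training example at random
    residual = X[i] @ w - y[i]
    w -= lr * residual * X[i]            # single-sample gradient step

train_mse = np.mean((X @ w - y) ** 2)    # empirical loss driven to ~0
```

With d > n the empirical loss can be made arbitrarily small even on random labels; the question the paper raises is why such interpolating solutions nonetheless generalize.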
To understand deep learning we need to understand kernel learning
It is argued that progress on understanding deep learning will be difficult until the more tractable "shallow" kernel methods are better understood, and that new theoretical ideas are needed to understand the properties of classical kernel methods.
Reconciling modern machine learning and the bias-variance trade-off
A new "double descent" risk curve is exhibited that extends the traditional U-shaped bias-variance curve beyond the point of interpolation, showing that the risk of suitably chosen interpolating predictors can in fact decrease as model complexity grows, often below the risk achieved by non-interpolating models.
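The shape of this curve can be reproduced in a toy setting. The sketch below (parameters and the random-feature setup are illustrative, not taken from the paper) fits minimum-norm least squares on random ReLU features and compares the test risk at the interpolation threshold (features = samples), where the risk spikes, with the heavily over-parametrized regime, where it descends again.

```python
import numpy as np

def avg_test_risk(n_features, n_train=20, n_test=200, dim=5,
                  noise=0.3, n_seeds=20):
    """Average test MSE of the minimum-norm least-squares fit on
    random ReLU features, over several random draws."""
    risks = []
    for seed in range(n_seeds):
        rng = np.random.default_rng(seed)
        w_true = rng.standard_normal(dim)
        X_tr = rng.standard_normal((n_train, dim))
        X_te = rng.standard_normal((n_test, dim))
        y_tr = X_tr @ w_true + noise * rng.standard_normal(n_train)
        y_te = X_te @ w_true
        W = rng.standard_normal((dim, n_features))  # random feature map
        F_tr = np.maximum(X_tr @ W, 0.0)            # ReLU features
        F_te = np.maximum(X_te @ W, 0.0)
        beta = np.linalg.pinv(F_tr) @ y_tr          # minimum-norm interpolant
        risks.append(np.mean((F_te @ beta - y_te) ** 2))
    return float(np.mean(risks))

risk_at_threshold = avg_test_risk(n_features=20)   # features == samples
risk_overparam = avg_test_risk(n_features=400)     # far past interpolation
```

Near the threshold the fit is forced through noisy labels with a nearly singular feature matrix, so the risk spikes; with many more features the minimum-norm solution is far smoother and the risk descends again.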
On exponential convergence of SGD in non-convex over-parametrized learning
It is argued that the Polyak-Lojasiewicz (PL) condition provides a relevant and attractive setting for many machine learning problems, particularly in the over-parametrized regime.
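A concrete illustration (a standard example from the PL literature, not code from this paper): f(x) = x² + 3 sin²(x) is non-convex yet satisfies the PL inequality, so plain gradient descent still converges to the global minimum at an exponential (linear) rate.

```python
import numpy as np

# f is non-convex: f''(x) = 2 + 6*cos(2x) is negative near x = pi/2,
# yet f satisfies the Polyak-Lojasiewicz inequality, so every stationary
# point is a global minimum and gradient descent converges linearly.
f = lambda x: x**2 + 3.0 * np.sin(x) ** 2
grad = lambda x: 2.0 * x + 3.0 * np.sin(2.0 * x)

x = 3.0
lr = 0.1               # the gradient is 8-Lipschitz, so 0.1 < 1/8 is a safe step
losses = [f(x)]
for _ in range(200):
    x -= lr * grad(x)
    losses.append(f(x))
```

Despite the non-convex bumps along the way, the loss decreases geometrically to the global minimum f(0) = 0.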
Reconciling modern machine learning practice and the bias-variance trade-off
This paper reconciles the classical understanding and the modern practice within a unified performance curve that subsumes the textbook U-shaped bias-variance trade-off curve, showing that increasing model capacity beyond the point of interpolation results in improved performance.
S-CAVE: Effective SSD caching to improve virtual machine storage performance
- Tian Luo, Siyuan Ma, Rubao Lee, Xiaodong Zhang, Deng Liu, Li Zhou
- Computer Science · Proceedings of the 22nd International Conference…
- 7 October 2013
This paper presents the design and implementation of S-CAVE, a hypervisor-based SSD caching facility that effectively manages a storage cache in a multi-VM environment by collecting and exploiting runtime information from both VMs and storage devices.
Concurrent Analytical Query Processing with GPUs
Concurrent query execution is proposed as an effective solution for sharing GPUs among concurrent queries to achieve high throughput; it relies on GPU query scheduling and device-memory swapping policies to address this challenge.
Diving into the shallows: a computational perspective on large-scale shallow learning
EigenPro iteration is introduced, based on a preconditioning scheme that uses a small number of approximately computed eigenvectors; injecting this small (computationally inexpensive and SGD-compatible) amount of approximate second-order information leads to major improvements in convergence.
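The idea behind such eigenvector preconditioning can be sketched as follows (illustrative code, not the authors' EigenPro implementation): compute the top few eigenvectors of the problem's Hessian, damp those dominant directions, and take correspondingly larger gradient steps in all the others.

```python
import numpy as np

rng = np.random.default_rng(1)

# Ill-conditioned least-squares problem: a few dominant eigendirections
# force plain gradient descent to use a tiny step size.
n, d, k = 200, 50, 5
scales = np.concatenate([np.full(k, 10.0), np.full(d - k, 0.5)])
X = rng.standard_normal((n, d)) * scales
w_true = rng.standard_normal(d)
y = X @ w_true

H = X.T @ X / n                          # Hessian of the least-squares loss
eigvals, eigvecs = np.linalg.eigh(H)     # eigenvalues in ascending order
top_vals, top_vecs = eigvals[-k:], eigvecs[:, -k:]
tau = eigvals[-k - 1]                    # (k+1)-th largest eigenvalue
# Preconditioner P = I - sum_i (1 - tau/lambda_i) v_i v_i^T damps the top-k
# directions, lowering the effective largest eigenvalue from lambda_1 to tau.
P = np.eye(d) - (top_vecs * (1.0 - tau / top_vals)) @ top_vecs.T

def final_mse(precond, steps=200):
    """Training MSE after gradient descent, with or without preconditioning."""
    w = np.zeros(d)
    lr = 1.0 / (tau if precond else eigvals[-1])  # step ~ 1/(largest effective eigenvalue)
    for _ in range(steps):
        g = X.T @ (X @ w - y) / n
        w -= lr * (P @ g if precond else g)
    return float(np.mean((X @ w - y) ** 2))
```

Damping only k directions is cheap (one rank-k correction per step) yet allows a step size larger by a factor of roughly lambda_1/tau, which is the source of the convergence gains the paper reports.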
Kernel Machines That Adapt To Gpus For Effective Large Batch Training
This paper develops the first analytical framework that extends linear scaling to match the parallel computing capacity of a resource, designed for a class of classical kernel machines.