Low-Precision Arithmetic for Fast Gaussian Processes
@article{Maddox2022LowPrecisionAF,
  title   = {Low-Precision Arithmetic for Fast Gaussian Processes},
  author  = {Wesley J. Maddox and Andres Potapczynski and Andrew Gordon Wilson},
  journal = {ArXiv},
  year    = {2022},
  volume  = {abs/2207.06856}
}
Low-precision arithmetic has had a transformative effect on the training of neural networks, reducing computation, memory and energy requirements. However, despite its promise, low-precision arithmetic has received little attention for Gaussian process (GP) training, largely because GPs require sophisticated linear algebra routines that are unstable in low precision. We study the different failure modes that can occur when training GPs in half precision. To circumvent these failure modes, we…
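The instability the abstract alludes to can be illustrated with a toy example (not from the paper): a naive running sum in FP16 stalls once the gap between adjacent representable values around the accumulator exceeds the addend, which is one way inner products inside GP linear-algebra routines degrade in half precision.

```python
import numpy as np

# FP16 keeps only 10 significand bits, so a naive running sum stalls once
# the spacing between adjacent FP16 values around the accumulator exceeds
# the addend; past that point every addition rounds back to the accumulator.
addend = np.float16(0.001)

acc16 = np.float16(0.0)
for _ in range(10_000):
    acc16 = np.float16(acc16 + addend)  # every partial sum rounded to FP16

acc64 = 0.0
for _ in range(10_000):
    acc64 += float(addend)              # same addend, float64 accumulator

print(float(acc16))  # stalls around 4.0, far from the true total
print(acc64)         # ~10.004 (0.001 itself is not exact in FP16)
```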
References
Showing 1–10 of 53 references
Investigating half precision arithmetic to accelerate dense linear system solvers
- Computer Science, ScalA@SC
- 2017
This work shows for the first time how the use of FP16 arithmetic can significantly accelerate, and make more energy efficient, FP32- or FP64-precision Ax = b solvers.
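The refinement loop behind such solvers can be sketched in NumPy, with float32 standing in for the low precision (NumPy's dense solvers do not accept float16) and float64 as the working precision; the matrix and sizes below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
A = rng.standard_normal((n, n)) + n * np.eye(n)  # well-conditioned test matrix
b = rng.standard_normal(n)

# Low-precision solve: a real FP16 solver factorizes once in low precision
# and reuses the factors; for brevity this sketch just calls solve() again.
A32 = A.astype(np.float32)
x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)

for _ in range(5):
    r = b - A @ x                                   # residual in float64
    d = np.linalg.solve(A32, r.astype(np.float32))  # correction in float32
    x += d.astype(np.float64)

rel_res = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
print(rel_res)  # driven down to near float64 machine precision
```

A production version would factor `A32` once (e.g. with `scipy.linalg.lu_factor`) and reuse the factors inside the loop, so only the cheap residual is computed in high precision.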
Deep Learning with Limited Numerical Precision
- Computer Science, ICML
- 2015
The results show that deep networks can be trained using only 16-bit wide fixed-point number representation when using stochastic rounding, and incur little to no degradation in the classification accuracy.
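Stochastic rounding itself is simple to sketch; the fixed-point grid and helper below are illustrative, not the paper's implementation:

```python
import numpy as np

def stochastic_round(x, frac_bits, rng):
    """Round to a grid with `frac_bits` fractional bits, going up with
    probability equal to the leftover fraction (so E[result] == x)."""
    scaled = np.asarray(x, dtype=np.float64) * 2.0 ** frac_bits
    floor = np.floor(scaled)
    go_up = rng.random(scaled.shape) < (scaled - floor)
    return (floor + go_up) / 2.0 ** frac_bits

rng = np.random.default_rng(0)
x = np.full(100_000, 0.3)
r = stochastic_round(x, 2, rng)  # grid step 0.25: values become 0.25 or 0.5
print(r.mean())  # close to 0.3 on average; round-to-nearest would give 0.25
```

The unbiasedness is what lets gradient noise average out over many updates instead of accumulating as systematic rounding drift.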
Harnessing GPU Tensor Cores for Fast FP16 Arithmetic to Speed up Mixed-Precision Iterative Refinement Solvers
- Computer Science, SC18: International Conference for High Performance Computing, Networking, Storage and Analysis
- 2018
This work shows that other high-performance computing (HPC) applications can also harness the power of half-precision floating-point arithmetic, and that using half-precision Tensor Cores (FP16-TC) for the arithmetic can provide up to a 4× speedup.
Exact Gaussian Processes on a Million Data Points
- Computer Science, NeurIPS
- 2019
A scalable approach for exact GPs is developed that leverages multi-GPU parallelization and methods like linear conjugate gradients, accessing the kernel matrix only through matrix multiplication, and is generally applicable, without constraints to grid data or specific kernel classes.
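The "kernel matrix only through matrix multiplication" access pattern is conjugate gradients driven by a matvec callback; a minimal sketch with an assumed RBF kernel and jitter (the matrix is materialized here only for brevity — the point of the approach is that it never needs to be):

```python
import numpy as np

def cg(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b using only matrix-vector products with A."""
    x = np.zeros_like(b)
    r = b - matvec(x)
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3))
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * sq_dists) + 1e-2 * np.eye(500)  # RBF kernel + jitter
b = rng.standard_normal(500)

x = cg(lambda v: K @ v, b)  # CG never touches K itself, only K @ v
print(np.linalg.norm(K @ x - b))
```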
SWALP: Stochastic Weight Averaging in Low-Precision Training
- Computer Science, ICML
- 2019
It is shown that SWALP converges arbitrarily close to the optimal solution for quadratic objectives, and to a noise ball asymptotically smaller than that of low-precision SGD in strongly convex settings.
Low-Precision Random Fourier Features for Memory-Constrained Kernel Approximation
- Computer Science, AISTATS
- 2019
This work proposes using a low-precision quantization of random Fourier features (LP-RFFs) to build a high-rank approximation under a memory budget, and shows quantization has a negligible effect on generalization performance in important settings.
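The idea can be sketched for an RBF kernel with unit lengthscale; the uniform quantizer and bit width below are illustrative assumptions, not LP-RFFs' exact scheme:

```python
import numpy as np

rng = np.random.default_rng(0)
d, D = 5, 2000                        # input dim, number of random features
W = rng.standard_normal((d, D))       # spectral samples for exp(-||x-y||^2/2)
b = rng.uniform(0.0, 2 * np.pi, D)    # random phase offsets

def rff(X):
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

def quantize(Z, bits):
    # Uniform quantizer over the known feature range [-sqrt(2/D), sqrt(2/D)]
    hi = np.sqrt(2.0 / D)
    levels = 2 ** bits - 1
    q = np.round((Z + hi) / (2 * hi) * levels)
    return q / levels * (2 * hi) - hi

X = rng.standard_normal((50, d))
K_true = np.exp(-0.5 * ((X[:, None] - X[None, :]) ** 2).sum(-1))
Zq = quantize(rff(X), bits=4)         # features stored at 4 bits each
print(np.abs(Zq @ Zq.T - K_true).max())  # inner products still track the kernel
```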
Fast geometric learning with symbolic matrices
- Computer Science, NeurIPS
- 2020
This paper presents an extension for standard machine learning frameworks that provides comprehensive support for this abstraction on CPUs and GPUs, and performs an extensive evaluation on a broad class of problems: Gaussian modelling, K-nearest-neighbors search, geometric deep learning, non-Euclidean embeddings and optimal transport theory.
Dimension-Free Bounds for Low-Precision Training
- Computer Science, NeurIPS
- 2019
New bounds for low-precision training algorithms that do not contain the dimension $d$ are derived, which lets us better understand what affects the convergence of these algorithms as parameters scale.
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
- Computer Science, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
- 2018
A quantization scheme is proposed that allows inference to be carried out using integer-only arithmetic, which can be implemented more efficiently than floating-point inference on commonly available integer-only hardware.
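The affine scheme underlying such integer-only inference represents each real value as `real = scale * (q - zero_point)` with `q` an 8-bit integer; a minimal sketch (helper names are illustrative):

```python
import numpy as np

def affine_params(x, num_bits=8):
    """Pick scale/zero_point so the int grid covers [min, max] and hits 0 exactly."""
    qmax = 2 ** num_bits - 1
    lo, hi = min(float(x.min()), 0.0), max(float(x.max()), 0.0)
    scale = (hi - lo) / qmax
    zero_point = int(round(-lo / scale))
    return scale, zero_point

def quantize(x, scale, zero_point, num_bits=8):
    q = np.round(x / scale) + zero_point
    return np.clip(q, 0, 2 ** num_bits - 1).astype(np.uint8)

def dequantize(q, scale, zero_point):
    return scale * (q.astype(np.float64) - zero_point)

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 2.0, 1000)
scale, zp = affine_params(x)
x_hat = dequantize(quantize(x, scale, zp), scale, zp)
print(np.abs(x - x_hat).max())  # error bounded by half a quantization step
```

Requiring the grid to represent 0.0 exactly (the `zero_point`) is what lets zero padding and ReLU stay exact in integer arithmetic.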
Revisiting BFloat16 Training
- Computer Science, ArXiv
- 2020
Two simple existing techniques, stochastic rounding and Kahan summation, are identified, and it is shown empirically that they can enable up to a 7% absolute validation-accuracy gain in pure 16-bit training.
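Kahan (compensated) summation fits in a few lines; the toy data below (one large value followed by many small ones) is chosen so that naive float32 summation visibly loses the small terms:

```python
import numpy as np

def kahan_sum(values):
    """Compensated summation in float32: `c` carries the rounding error
    lost by each addition so it is re-applied on the next step."""
    total = np.float32(0.0)
    c = np.float32(0.0)
    for v in values:
        y = np.float32(v) - c
        t = np.float32(total + y)
        c = np.float32(t - total) - y
        total = t
    return total

# Adding 1.0 to a float32 holding 1e8 rounds straight back to 1e8
# (the spacing between adjacent float32 values there is 8.0).
vals = [np.float32(1e8)] + [np.float32(1.0)] * 1000

naive = np.float32(0.0)
for v in vals:
    naive = np.float32(naive + v)

print(float(naive))            # 100000000.0 -- the thousand 1.0s vanished
print(float(kahan_sum(vals)))  # 100001000.0 -- compensation recovers them
```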