Corpus ID: 202540736

Accurate Computation of the Log-Sum-Exp and Softmax Functions

@article{Blanchard2019AccurateCO,
  title={Accurate Computation of the Log-Sum-Exp and Softmax Functions},
  author={Pierre Blanchard and Desmond J. Higham and Nicholas John Higham},
  journal={ArXiv},
  year={2019},
  volume={abs/1909.03469}
}
Evaluating the log-sum-exp function or the softmax function is a key step in many modern data science algorithms, notably in inference and classification. Because of the exponentials that these functions contain, the evaluation is prone to overflow and underflow, especially in low precision arithmetic. Software implementations commonly use alternative formulas that avoid overflow and reduce the chance of harmful underflow, employing a shift or another rewriting. Although mathematically…
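
The shift mentioned in the abstract is the standard rewriting: with m = max_i x_i, log-sum-exp is evaluated as m + log(sum_i exp(x_i - m)) and softmax as exp(x_i - m) / sum_j exp(x_j - m). Every exponential argument is then at most zero, so overflow cannot occur and underflowed terms enter only as negligible addends. Below is a minimal NumPy sketch of these shifted formulas; it illustrates the common rewriting the paper analyzes, not the paper's own algorithms or error bounds, and the function names and test values are ours.

    import numpy as np

    def logsumexp(x):
        # Shifted evaluation of log(sum_i exp(x_i)): subtracting the
        # maximum makes every exponential argument <= 0, so exp cannot
        # overflow; any underflowed terms are negligible addends.
        x = np.asarray(x, dtype=float)
        m = np.max(x)
        return m + np.log(np.sum(np.exp(x - m)))

    def softmax(x):
        # Shifted evaluation of exp(x_i) / sum_j exp(x_j): the shift
        # cancels mathematically but keeps the intermediate
        # exponentials within range.
        x = np.asarray(x, dtype=float)
        e = np.exp(x - np.max(x))
        return e / np.sum(e)

    # Naive evaluation overflows in double precision once any x_i
    # exceeds about 709; the shifted formulas return correct values.
    x = np.array([1000.0, 1000.0, -1000.0])
    print(logsumexp(x))  # 1000.6931... (= 1000 + log 2)
    print(softmax(x))    # [0.5, 0.5, 0.0]
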
Quantum State Tomography as a Bilevel Problem, Utilizing I-Q Plane Data
We formulate quantum state tomography as a bilevel optimization problem so as to utilize the actual measurements, such as the so-called I-Q plane data obtained in the dispersive readout of transmon…
Finding the Optimal Network Depth in Classification Tasks
A fast end-to-end method for training lightweight neural networks with multiple classifier heads: the model is allowed to determine the importance of each head and is rewarded for choosing a single shallow classifier, which reduces the number of parameters and accelerates inference across different hardware processing units.
On Missing Mass Variance
This work determines the maximal variance of the missing mass for any sample and alphabet sizes; the result helps in understanding the concentration properties of the missing mass.
Regularity and stability of feedback relaxed controls
First-order monotone convergence of the value functions for relaxed control problems with vanishing exploration parameters is proved, which in turn enables a pure exploitation strategy for the original control problem based on the feedback relaxed controls.

References

Showing 1-10 of 25 references
Fast and correctly rounded logarithms in double-precision
A case study in the implementation of a portable, proven, and efficient correctly rounded elementary function in double precision, using the crlibm library to obtain performance equivalent to the best current mathematical libraries.
Accuracy and Stability of Numerical Algorithms
This book gives a thorough, up-to-date treatment of the behavior of numerical algorithms in finite precision arithmetic by combining algorithmic derivations, perturbation theory, and rounding error analysis.
Handbook of Floating-Point Arithmetic
The Handbook of Floating-Point Arithmetic is designed for programmers of numerical applications, compiler designers, programmers of floating-point algorithms, designers of arithmetic operators, and, more generally, students and researchers in numerical analysis who wish to better understand a tool used in their daily work and research.
The Accuracy of Floating Point Summation
  • N. Higham
  • Mathematics, Computer Science
  • SIAM J. Sci. Comput.
  • 1993
Five summation methods and their variations are analyzed; no one method is uniformly more accurate than the others, but some guidelines are given on the choice of method in particular cases.
Log-Sum-Exp Neural Networks and Posynomial Models for Convex and Log-Log-Convex Data
In this paper, we show that a one-layer feedforward neural network with exponential activation functions in the inner layer and logarithmic activation in the output neuron is a universal approximator…
Functions of Matrices: Theory and Computation
A thorough and elegant treatment of the theory of matrix functions and numerical methods for computing them, including an overview of applications, new and unpublished research results, and improved…
The Matrix Unwinding Function, with an Application to Computing the Matrix Exponential
A new matrix function corresponding to the scalar unwinding number of Corless, Hare, and Jeffrey is introduced, and it is shown that matrix argument reduction using the function $\mathcal{U}(A)$, which produces a matrix whose eigenvalues have imaginary parts in the interval $(-\pi,\pi]$, can give significant computational savings in the evaluation of the exponential by scaling and squaring algorithms.
Simulating Low Precision Floating-Point Arithmetic
The half-precision (fp16) floating-point format, defined in the 2008 revision of the IEEE standard for floating-point arithmetic, and a more recently proposed half-precision format bfloat16, are in...
Machine Learning: A Probabilistic Perspective
  • K. Murphy
  • Computer Science
  • Adaptive computation and machine learning series
  • 2012
This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach, and is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.
Deep Learning: An Introduction for Applied Mathematicians
This article provides a very brief introduction to the basic ideas that underlie deep learning from an applied mathematics perspective and illustrates them with a short MATLAB code that sets up and trains a network.