# Accurate Computation of the Log-Sum-Exp and Softmax Functions

@article{Blanchard2019AccurateCO, title={Accurate Computation of the Log-Sum-Exp and Softmax Functions}, author={Pierre Blanchard and Desmond J. Higham and Nicholas John Higham}, journal={ArXiv}, year={2019}, volume={abs/1909.03469} }

Evaluating the log-sum-exp function or the softmax function is a key step in many modern data science algorithms, notably in inference and classification. Because of the exponentials that these functions contain, the evaluation is prone to overflow and underflow, especially in low precision arithmetic. Software implementations commonly use alternative formulas that avoid overflow and reduce the chance of harmful underflow, employing a shift or another rewriting. Although mathematically…
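The shift-based rewriting the abstract refers to can be sketched in a few lines of NumPy. This is a minimal illustration of the standard max-shift trick, not the refined algorithms analyzed in the paper:

```python
import numpy as np

def logsumexp(x):
    """Shifted log-sum-exp: with m = max(x), compute
    m + log(sum(exp(x - m))), so no exponential can overflow and the
    largest term in the sum is exactly exp(0) = 1."""
    x = np.asarray(x, dtype=np.float64)
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))

def softmax(x):
    """Shifted softmax: exp(x - m) / sum(exp(x - m)) with m = max(x);
    mathematically identical to the unshifted formula."""
    x = np.asarray(x, dtype=np.float64)
    e = np.exp(x - np.max(x))
    return e / np.sum(e)
```

For example, `logsumexp([1000.0, 1000.0])` returns `1000 + log 2`, whereas the naive formula `log(sum(exp(x)))` overflows to `inf` in double precision.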


#### 4 Citations

Quantum State Tomography as a Bilevel Problem, Utilizing I-Q Plane Data

- Physics, Mathematics
- 2021

We formulate quantum state tomography as a bilevel optimization problem so as to utilize the actual measurements, such as the so-called I-Q plane data obtained in the dispersive readout of transmon…

Finding the Optimal Network Depth in Classification Tasks

- Computer Science, Mathematics
- ECML/PKDD
- 2020

A fast end-to-end method for training lightweight neural networks with multiple classifier heads: the model determines the importance of each head and is rewarded for choosing a single shallow classifier, which reduces the number of parameters and accelerates inference across different hardware processing units.

On Missing Mass Variance

- Computer Science, Mathematics
- ArXiv
- 2021

This work determines the maximal variance of the missing mass for any sample and alphabet sizes; the result helps in understanding the concentration properties of the missing mass.

Regularity and stability of feedback relaxed controls

- Mathematics, Computer Science
- SIAM Journal on Control and Optimization
- 2021

First-order monotone convergence of the value functions for relaxed control problems with vanishing exploration parameters is proved, which subsequently enables a pure exploitation strategy for the original control problem based on the feedback relaxed controls.

#### References

Showing 1–10 of 25 references.

Fast and correctly rounded logarithms in double-precision

- Computer Science, Mathematics
- RAIRO Theor. Informatics Appl.
- 2007

This article is a case study in the implementation of a portable, proven, and efficient correctly rounded elementary function in double precision, using the crlibm library to achieve performance equivalent to the best current mathematical libraries.

Accuracy and stability of numerical algorithms

- Computer Science, Mathematics
- 1991

This book gives a thorough, up-to-date treatment of the behavior of numerical algorithms in finite precision arithmetic by combining algorithmic derivations, perturbation theory, and rounding error analysis.

Handbook of Floating-Point Arithmetic

- Computer Science
- 2009

The Handbook of Floating-Point Arithmetic is designed for programmers of numerical applications, compiler designers, programmers of floating-point algorithms, designers of arithmetic operators, and more generally, students and researchers in numerical analysis who wish to better understand a tool used in their daily work and research.

The Accuracy of Floating Point Summation

- Mathematics, Computer Science
- SIAM J. Sci. Comput.
- 1993

Five summation methods and their variations are analyzed; no one method is uniformly more accurate than the others, but guidelines are given on the choice of method in particular cases.
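Compensated (Kahan) summation is a classic method in this family; a minimal sketch:

```python
def kahan_sum(xs):
    """Compensated (Kahan) summation: a running correction term c
    captures the low-order bits lost in each floating-point addition."""
    s = 0.0  # running sum
    c = 0.0  # running compensation for lost low-order bits
    for x in xs:
        y = x - c        # apply the previous step's correction
        t = s + y        # low-order bits of y may be lost here...
        c = (t - s) - y  # ...this recovers the lost part (algebraically zero)
        s = t
    return s
```

Summing `[1.0] + [1e-16] * 10` naively returns `1.0`, because each tiny addend is rounded away; the compensated sum retains their accumulated contribution.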

Log-Sum-Exp Neural Networks and Posynomial Models for Convex and Log-Log-Convex Data

- Computer Science, Mathematics
- IEEE Transactions on Neural Networks and Learning Systems
- 2020

In this paper, we show that a one-layer feedforward neural network with exponential activation functions in the inner layer and logarithmic activation in the output neuron is a universal approximator…
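Such a network computes a log-sum-exp of affine functions. A toy forward pass might look as follows, where the parameter names `A`, `b`, `w` are illustrative, not taken from the paper:

```python
import numpy as np

def lse_network(x, A, b, w):
    """One hidden layer with exponential activations, log output:
    f(x) = log( sum_k w[k] * exp(A[k] @ x + b[k]) ).
    With nonnegative weights w this is a smooth convex function of x."""
    z = A @ x + b                  # hidden pre-activations, shape (k,)
    m = np.max(z)                  # shift to keep exp from overflowing
    return m + np.log(np.sum(w * np.exp(z - m)))
```

With `A = I`, `b = 0`, `w = 1`, this reduces to the plain log-sum-exp of the inputs.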

Functions of matrices - theory and computation

- Computer Science, Mathematics
- 2008

A thorough and elegant treatment of the theory of matrix functions and numerical methods for computing them, including an overview of applications, new and unpublished research results, and improved…

The Matrix Unwinding Function, with an Application to Computing the Matrix Exponential

- Computer Science, Mathematics
- SIAM J. Matrix Anal. Appl.
- 2014

A new matrix function corresponding to the scalar unwinding number of Corless, Hare, and Jeffrey is introduced, and it is shown that matrix argument reduction using the function $\mathcal{U}(A)$, which has eigenvalues with imaginary parts in the interval $(-\pi,\pi)$, can give significant computational savings in the evaluation of the exponential by scaling and squaring algorithms.
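In the scalar case, the unwinding number of Corless, Hare, and Jeffrey has a one-line formula; a sketch:

```python
import math

def unwinding(z):
    """Scalar unwinding number u(z) = ceil((Im(z) - pi) / (2*pi)),
    chosen so that log(exp(z)) = z - 2*pi*1j*u(z)."""
    return math.ceil((z.imag - math.pi) / (2.0 * math.pi))
```

For instance, `unwinding(1 + 4j)` is 1, and `z - 2j * math.pi * unwinding(z)` always has imaginary part in `(-pi, pi]`.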

Simulating Low Precision Floating-Point Arithmetic

- Computer Science, Mathematics
- SIAM J. Sci. Comput.
- 2019

The half-precision (fp16) floating-point format, defined in the 2008 revision of the IEEE standard for floating-point arithmetic, and a more recently proposed half-precision format bfloat16, are in...
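Low-precision behavior can be simulated from double precision by rounding values through the target format. The sketch below uses NumPy's native `float16` and a bit-level approximation of bfloat16 (round-half-up rather than round-to-nearest-even, for brevity); it illustrates the general idea only, not the simulation strategy of the paper:

```python
import numpy as np

def to_fp16(x):
    """Round a double to the nearest fp16 value, then back to double,
    simulating storage in IEEE half precision."""
    return float(np.float64(np.float16(x)))

def to_bfloat16(x):
    """Simulate bfloat16 by keeping the top 16 bits of the float32
    bit pattern; adding 0x8000 first gives round-half-up rounding."""
    u = np.asarray(x, dtype=np.float32).view(np.uint32)
    u = (u + np.uint32(0x8000)) & np.uint32(0xFFFF0000)
    return float(u.view(np.float32))
```

At 1.0, the unit roundoff gap is `2**-10` for fp16 (11-bit significand) and `2**-7` for bfloat16 (8-bit significand), so perturbations below half that gap round away.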

Machine learning - a probabilistic perspective

- Computer Science
- Adaptive computation and machine learning series
- 2012

This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach, and is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.

Deep Learning: An Introduction for Applied Mathematicians

- Mathematics, Computer Science
- SIAM Rev.
- 2019

This article provides a very brief introduction to the basic ideas that underlie deep learning from an applied mathematics perspective and illustrates the ideas with a short MATLAB code that sets up and trains a network.