Abstract

Evaluating the log-sum-exp function or the softmax function is a key step in many modern data science algorithms, notably in inference and classification. Because of the exponentials that these functions contain, the evaluation is prone to overflow and underflow, especially in low-precision arithmetic. Software implementations commonly use alternative formulas that avoid overflow and reduce the chance of harmful underflow, employing a shift or another rewriting. Although mathematically equivalent, these variants behave differently in floating-point arithmetic, and shifting can introduce subtractive cancellation. We give rounding error analyses of different evaluation algorithms and interpret the error bounds using condition numbers for the functions. We conclude, based on the analysis and numerical experiments, that the shifted formulas are of similar accuracy to the unshifted ones, and so can safely be used, but that a division-free variant of softmax can suffer from loss of accuracy.

Highlights

  • In many applications, especially in a wide range of machine learning classifiers such as multinomial logistic regression and naive Bayes classifiers (Calafiore et al., 2019; Murphy, 2012; Williams & Barber, 1998), one needs to compute an expression of the form

        $y = f(x) = \log \sum_{i=1}^{n} e^{x_i}, \qquad (1.1)$

    where $x = [x_1, x_2, \dots, x_n]^T \in \mathbb{R}^n$ and $\log$ is the natural logarithm (a sketch of the naive evaluation of (1.1) follows this list)

  • We used this data in our implementation of the softmax and log-sum-exp algorithms that we studied in the previous sections

  • The log-sum-exp and softmax functions both feature in many computational pipelines, so it is important to compute them accurately and to avoid generating infs or NaNs because of overflow or underflow
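
The highlights above reference the direct evaluation of (1.1). As a minimal sketch (in Python/NumPy; this is not code from the paper, and the function name logsumexp_naive is ours), the following shows how the naive formula overflows in IEEE double precision even when f(x) itself is representable:

    import numpy as np

    def logsumexp_naive(x):
        # Direct evaluation of (1.1): log(sum(exp(x_i))).
        # In IEEE double precision, exp(x_i) overflows to inf once
        # x_i exceeds about 709, so the result is inf even though
        # f(x) itself is well within range.
        return np.log(np.sum(np.exp(x)))

    x = np.array([1000.0, 1000.0])
    print(logsumexp_naive(x))    # inf (overflow in exp)
    print(1000.0 + np.log(2.0))  # true value: 1000.6931...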


Summary

Introduction

A way to avoid overflow, and to attempt to avoid harmful underflow and subnormal numbers, in evaluating log-sum-exp is to rewrite

    $\log \sum_{i=1}^{n} e^{x_i} = a + \log \sum_{i=1}^{n} e^{x_i - a}, \qquad a = \max_i x_i.$

Softmax can in turn be expressed through log-sum-exp, since $g_j = e^{x_j} / \sum_{i=1}^{n} e^{x_i} = \exp(x_j - f(x))$. The conciseness of this division-free formula makes it attractive for implementing softmax when a log-sum-exp function is available. Because of the importance of the log-sum-exp and softmax functions, great efforts are made to optimize their implementations in software (Czaja et al., 2019) and hardware (Wang et al., 2018). We begin by investigating the conditioning of the log-sum-exp and softmax functions.
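
As a minimal sketch of these two rewritings (Python/NumPy again; the function names logsumexp_shifted and softmax_via_lse are ours, not the paper's), the shifted log-sum-exp avoids overflow by construction, and softmax then follows division-free:

    import numpy as np

    def logsumexp_shifted(x):
        # Shifted formula: f(x) = a + log(sum(exp(x_i - a))), a = max_i x_i.
        # Every argument of exp is <= 0, so exp cannot overflow, and the
        # term with x_i = a contributes exactly 1, so the sum cannot
        # underflow to zero.
        a = np.max(x)
        return a + np.log(np.sum(np.exp(x - a)))

    def softmax_via_lse(x):
        # Division-free softmax: g_j = exp(x_j - f(x)).
        return np.exp(x - logsumexp_shifted(x))

    x = np.array([1000.0, 1000.0, -1000.0])
    print(logsumexp_shifted(x))  # ~1000.6931, no overflow
    print(softmax_via_lse(x))    # ~[0.5, 0.5, 0.0]; entries sum to 1

Note that, per the paper's conclusions, this division-free variant of softmax can lose accuracy compared with dividing each exp(x_j - a) by the shifted sum, so the sketch illustrates the formula rather than recommending it.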

Condition numbers and forward stability
Algorithms with shifting
Findings
Conclusions