Abstract

Although double-precision floating-point arithmetic currently dominates high-performance computing, there is increasing interest in smaller and simpler arithmetic types. The main reasons are potential improvements in energy efficiency and in memory footprint and bandwidth. However, simply switching to lower-precision types typically results in increased numerical errors. We investigate approaches to improving the accuracy of reduced-precision fixed-point arithmetic types, using examples in an important domain for numerical computation in neuroscience: the solution of ordinary differential equations (ODEs). The Izhikevich neuron model is used to demonstrate that rounding has an important role in producing accurate spike timings from explicit ODE solution algorithms. In particular, fixed-point arithmetic with stochastic rounding consistently results in smaller errors compared to single-precision floating-point and fixed-point arithmetic with round-to-nearest across a range of neuron behaviours and ODE solvers. A computationally much cheaper alternative is also investigated, inspired by the concept of dither, a widely understood mechanism for providing resolution below the least significant bit in digital signal processing. These results will have implications for the solution of ODEs in other subject areas, and should also be directly relevant to the huge range of practical problems that are represented by partial differential equations. This article is part of a discussion meeting issue ‘Numerical algorithms for high-performance computational science’.
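
For reference, the Izhikevich neuron model mentioned above takes the standard two-variable form dv/dt = 0.04v² + 5v + 140 − u + I and du/dt = a(bv − u), with the after-spike reset v ← c, u ← u + d whenever v reaches the 30 mV threshold. The sketch below shows one explicit Euler step of this model in plain double-precision C; the parameter values, step size and input current are illustrative regular-spiking defaults, not the particular solver configurations evaluated in the article.

```c
#include <stdio.h>

/* State of the Izhikevich model: membrane potential v (mV) and
 * recovery variable u. */
typedef struct { double v, u; } izh_state;

/* One explicit Euler step of
 *   dv/dt = 0.04 v^2 + 5 v + 140 - u + I
 *   du/dt = a (b v - u)
 * with step size h (ms). Threshold detection and reset are handled
 * by the caller. */
static void euler_step(izh_state *s, double I, double h, double a, double b)
{
    double dv = 0.04 * s->v * s->v + 5.0 * s->v + 140.0 - s->u + I;
    double du = a * (b * s->v - s->u);
    s->v += h * dv;
    s->u += h * du;
}

int main(void)
{
    /* 'Regular spiking' parameter set; h, I and the run length are
     * illustrative choices only. */
    const double a = 0.02, b = 0.2, c = -65.0, d = 8.0;
    const double h = 1.0, I = 10.0;
    izh_state s = { -65.0, b * (-65.0) };

    for (int t = 0; t < 1000; t++) {
        euler_step(&s, I, h, a, b);
        if (s.v >= 30.0) {                 /* spike: reset v and bump u */
            printf("spike at t = %d ms\n", t);
            s.v = c;
            s.u += d;
        }
    }
    return 0;
}
```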

Highlights

  • Sixty-four-bit double-precision floating-point numbers are the accepted standard for numerical computing because they largely shield the user from concerns over numerical range, accuracy and precision.

  • We addressed the numerical accuracy of ODE solvers by solving a well-known neuron model in fixed- and floating-point arithmetics.

  • We identified that the constants in the Izhikevich neuron model should be specified explicitly as the nearest representable numbers, because the GCC fixed-point implementation rounds down by default when converting decimal constants to fixed-point (a worked example follows this list).
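
As a concrete illustration of the last point (a minimal sketch, assuming the s16.15 signed fixed-point layout, and not relying on any particular compiler's internals), the following compares round-down with round-to-nearest when the Izhikevich constant 0.04 is converted to fixed point; specifying the nearest representable value explicitly avoids the extra conversion error.

```c
#include <math.h>
#include <stdint.h>
#include <stdio.h>

#define FRAC_BITS 15   /* s16.15: 1 sign bit, 16 integer bits, 15 fractional bits */

int main(void)
{
    const double c = 0.04;                          /* Izhikevich constant */
    double scaled = c * (double)(1 << FRAC_BITS);   /* 0.04 * 32768 = 1310.72 */

    int32_t down    = (int32_t)floor(scaled);       /* round-down: 1310 */
    int32_t nearest = (int32_t)lround(scaled);      /* round-to-nearest: 1311 */

    printf("round-down      : %d -> %.12f\n", down,
           (double)down / (double)(1 << FRAC_BITS));
    printf("round-to-nearest: %d -> %.12f\n", nearest,
           (double)nearest / (double)(1 << FRAC_BITS));

    /* Writing the constant explicitly as 1311/32768 = 0.040008544921875
     * gives the nearest representable value, rather than relying on the
     * compiler's default decimal-to-fixed-point conversion. */
    return 0;
}
```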


Summary

Introduction and motivation

Sixty-four-bit double-precision floating-point numbers are the accepted standard for numerical computing because they largely shield the user from concerns over numerical range, accuracy and precision. Where high throughput of arithmetic operations is required, accuracy is sacrificed by reducing the working numerical type from floating- to fixed-point and the word length from 64- or 32-bit to 16- or even 8-bit precision [1]. Mixed-precision arithmetic maintains intermediate results in formats different from the input and output data, whereas stochastic arithmetic works by using probabilistic rounding to balance out the errors in conversions from longer to shorter numerical types.
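
To make the last point concrete, the following is a minimal sketch of stochastic rounding when a double-precision value is converted to an s16.15 fixed-point word; the format, the function names and the use of rand() as the randomness source are illustrative assumptions rather than the implementation discussed later in the article.

```c
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* s16.15 fixed-point: 1 sign bit, 16 integer bits, 15 fractional bits
 * (an illustrative layout; range and overflow handling are omitted). */
#define FRAC_BITS 15

/* Round-to-nearest conversion, shown for comparison. */
static int32_t to_fix_rn(double x)
{
    return (int32_t)lround(x * (double)(1 << FRAC_BITS));
}

/* Stochastic rounding: the scaled value is rounded up with probability
 * equal to the fractional residue that would otherwise be discarded,
 * so the rounding error is zero on average.  rand() stands in for
 * whatever pseudo-random source a real implementation would use. */
static int32_t to_fix_sr(double x)
{
    double scaled  = x * (double)(1 << FRAC_BITS);
    double lower   = floor(scaled);
    double residue = scaled - lower;                          /* in [0, 1) */
    double u = (double)rand() / ((double)RAND_MAX + 1.0);     /* uniform [0, 1) */
    return (int32_t)lower + (u < residue ? 1 : 0);
}

int main(void)
{
    const double x = 0.123456;   /* arbitrary test value */
    double sum = 0.0;
    for (int i = 0; i < 100000; i++)
        sum += (double)to_fix_sr(x) / (double)(1 << FRAC_BITS);

    printf("round-to-nearest: %.9f\n",
           (double)to_fix_rn(x) / (double)(1 << FRAC_BITS));
    printf("stochastic mean : %.9f (exact %.9f)\n", sum / 100000.0, x);
    return 0;
}
```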

The remaining sections of the article cover the following topics:

  • Background
  • Related work
  • SR implementation and testing with atomic multiplies
  • Ordinary differential equation solvers
  • Ideas related to dither and how these may be applicable
  • Findings
  • Discussion, further work and conclusion