Low-level Bit Research Articles

One can simulate low-precision floating-point arithmetic via software by executing each arithmetic operation in hardware and then rounding the result to the desired number of significant bits. For IEEE-compliant formats, rounding requires only standard mathematical library functions, but handling subnormals, underflow, and overflow demands special attention, and numerical errors can cause mathematically correct formulae to behave incorrectly in finite arithmetic. Moreover, the ensuing implementations are not necessarily efficient, as the library functions these techniques build upon are typically designed to handle a broad range of cases and may not be optimized for the specific needs of rounding algorithms. CPFloat is a C library for simulating low-precision arithmetics. It offers efficient routines for rounding, performing mathematical computations, and querying properties of the simulated low-precision format. The software exploits the bit-level floating-point representation of the format in which the numbers are stored and replaces costly library calls with low-level bit manipulations and integer arithmetic. In numerical experiments, the new techniques bring a considerable speedup (typically one order of magnitude or more) over existing alternatives in C, C++, and MATLAB. To our knowledge, CPFloat is currently the most efficient and complete library for experimenting with custom low-precision floating-point arithmetic.
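The bit-level idea described in this abstract can be illustrated with a minimal sketch. It is not CPFloat's API or implementation: the function name `round_to_precision` is hypothetical, and the code assumes the input is a finite, normal binary64 value whose rounded result stays in the normal range of the target format. It rounds to `t` significant bits with round-to-nearest, ties-to-even, using only integer operations on the stored bit pattern, and it deliberately leaves out the subnormal, underflow, and overflow handling that the abstract says needs special attention.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of CPFloat): round a binary64 value to
 * t significant bits, counting the implicit leading bit, using
 * round-to-nearest with ties to even.
 * Assumptions: x is finite and normal, 2 <= t <= 53, and the exponent
 * range of the target format is not enforced. */
double round_to_precision(double x, int t)
{
    assert(t >= 2 && t <= 53);
    if (t == 53)
        return x;                         /* already binary64 precision */

    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);       /* well-defined type punning */

    int drop = 53 - t;                    /* trailing bits to discard */
    uint64_t half = (uint64_t)1 << (drop - 1);
    uint64_t mask = ((uint64_t)1 << drop) - 1;
    uint64_t lsb  = (bits >> drop) & 1;   /* lowest bit that is kept */

    /* Adding (half - 1 + lsb) rounds ties to even; a carry out of the
     * fraction field increments the exponent, which is the correct
     * result when the significand rounds up to the next power of two. */
    bits += half - 1 + lsb;
    bits &= ~mask;                        /* clear the discarded bits */

    memcpy(&x, &bits, sizeof x);
    return x;
}
```

For example, under these assumptions `round_to_precision(0.1, 11)` yields the value 0.1 would have with the 11 significant bits of binary16, without calling any mathematical library function.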

As real-time applications place increasingly tight constraints on Deep Neural Networks (DNNs), the way information is represented needs to be reviewed. A challenging approach is to employ an encoding that allows fast processing and a hardware-friendly representation of information. Among the proposed alternatives to the IEEE 754 standard for floating-point representation of real numbers, the recently introduced Posit format has been theoretically shown to be very promising in satisfying these requirements. However, in the absence of proper hardware support for this novel type, the evaluation can be conducted only through software emulation. While waiting for the widespread availability of Posit Processing Units (PPUs, the equivalent of the Floating Point Unit (FPU)), we can already exploit the Posit representation and the currently available Arithmetic-Logic Unit (ALU) to speed up DNNs by manipulating the low-level bit string representations of Posits. As a first step, in this paper, we present new arithmetic properties of the Posit number system, with a focus on the configuration with 0 exponent bits. In particular, we propose a new class of Posit operators called L1 operators, which consists of fast, approximate versions of existing arithmetic operations or functions (e.g., hyperbolic tangent (TANH) and exponential linear unit (ELU)) that use only integer arithmetic. These operators have several useful properties: (i) faster evaluation than their exact counterparts with negligible accuracy degradation; (ii) efficient ALU emulation of a number of Posit operations; and (iii) the possibility of vectorizing Posit operations using existing ALU vectorized operations (such as the Scalable Vector Extension of ARM CPUs or the Advanced Vector Extensions of Intel CPUs).
As a second step, we test the proposed activation function on Posit-based DNNs, showing that 16-bit down to 10-bit Posits are an exact replacement for 32-bit floats, while 8-bit Posits could be an interesting alternative to 32-bit floats: their accuracy is slightly lower, but their high speed and low storage requirements are very appealing (leading to lower bandwidth demand and more cache-friendly code). Finally, we point out that small Posits (i.e., up to 14 bits long) are very interesting until PPUs become widespread, since Posit operations can be tabulated very efficiently (see details in the text).
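To give a feel for integer-only operators on Posits with 0 exponent bits, the sketch below uses the well-known fast sigmoid approximation from the original Posit literature: flip the sign bit of the bit pattern and shift it right by two places. It is shown as an illustration of the general idea and is not claimed to be one of the L1 operators proposed in the paper. The function names are hypothetical, the format is assumed to be an 8-bit Posit with 0 exponent bits, and the decoder is included only so the results can be inspected as ordinary doubles.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Fast approximate sigmoid on an 8-bit Posit with 0 exponent bits:
 * flip the sign bit, then shift right by two. Integer operations only.
 * Note: the NaR pattern 0x80 is mapped to 0 by this shortcut. */
uint8_t posit8_fast_sigmoid(uint8_t p)
{
    return (uint8_t)((p ^ 0x80u) >> 2);
}

/* Illustrative decoder (assumed format: 8 bits, es = 0). */
double posit8_to_double(uint8_t p)
{
    if (p == 0x00) return 0.0;
    if (p == 0x80) return NAN;                 /* NaR, not a real */

    int neg = (p & 0x80) != 0;
    uint8_t m = neg ? (uint8_t)(-p) : p;       /* two's complement magnitude */

    /* Regime: run of identical bits starting at bit 6. */
    int first = (m >> 6) & 1;
    int bit = 6, run = 0;
    while (bit >= 0 && ((m >> bit) & 1) == first) { run++; bit--; }
    int k = first ? run - 1 : -run;            /* power-of-two scale */
    assert(k >= -6 && k <= 6);

    bit--;                                     /* skip the terminating bit */
    int nfrac = bit + 1;                       /* remaining fraction bits */
    double sig = 1.0;
    if (nfrac > 0) {
        unsigned f = m & ((1u << nfrac) - 1u);
        sig += (double)f / (double)(1u << nfrac);
    }

    double scale = k >= 0 ? (double)(1u << k) : 1.0 / (double)(1u << -k);
    return (neg ? -1.0 : 1.0) * sig * scale;
}
```

With this encoding, the pattern `0x40` represents 1.0, and `posit8_to_double(posit8_fast_sigmoid(0x40))` gives 0.75, compared with an exact sigmoid value of about 0.731. The same shift-and-flip operation applies unchanged to every lane of an integer vector register, which is the kind of vectorization the abstract refers to.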

Articles published on Low-level Bit

CPFloat: A C Library for Simulating Low-precision Arithmetic

Ultrasonic wireless communication using capacitive micromachined ultrasonic transducers in liquid with OOK digital modulation

Fast Approximations of Activation Functions in Deep Neural Networks when using Posit Arithmetic

TextMap: A general purpose visualization system

High capacity secret image sharing using multilayer image steganography with primary cover predictive error

An efficient bit plane X-OR algorithm for irreversible image steganography

Standardized active measurements on a tier 1 IP backbone

An experimental mixed purpose network

A high speed software implementation of the Data Encryption Standard
