Abstract

As Machine Learning applications increase the demand for optimised implementations on both embedded and high-end processing platforms, the industry and research community have been responding with different approaches to implement these solutions. This work presents approximations to arithmetic operations and mathematical functions that, combined with a customised adaptive artificial neural network training method based on RMSProp, provide reliable and efficient implementations of classifiers. The proposed solution does not rely on the mixed higher-precision operations or complex rounding methods that are commonly applied. The intention of this work is not to find the optimal simplifications for specific deep learning problems but to present an optimised framework that can be used as reliably as one implemented with precise operations, standard training algorithms and the same network structures and hyper-parameters. By simplifying the ‘half-precision’ floating point format and approximating exponentiation and square root operations, the authors’ work drastically reduces the field programmable gate array implementation complexity (e.g. −43% and −57% in two of the component resources). The reciprocal square root approximation is so simple that it could be implemented with combinational logic alone. In a full software implementation for a mixed-precision platform, only two of the approximations compensate for the processing overhead of precision conversions.
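The abstract does not give the authors' exact approximation, but the kind of reciprocal square root that fits in pure combinational logic is typically the well-known exponent-manipulation ("magic constant") trick applied directly to the floating-point bit pattern. The C sketch below illustrates that idea on IEEE 754 binary16 ('half-precision') bit patterns; the constant 0x59BB, the simplified conversion helpers and the function names are assumptions made for illustration, not the method described in the paper.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <math.h>

/* Simplified float <-> binary16 conversions for positive, normal
 * values only (no rounding, no subnormal/inf/NaN handling). */
static uint16_t float_to_half_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    uint16_t exp  = (uint16_t)(((u >> 23) & 0xFFu) - 127 + 15);
    uint16_t mant = (uint16_t)((u >> 13) & 0x3FFu);
    return (uint16_t)((exp << 10) | mant);
}

static float half_bits_to_float(uint16_t h)
{
    uint32_t exp  = (uint32_t)(((h >> 10) & 0x1F) - 15 + 127);
    uint32_t mant = (uint32_t)(h & 0x3FFu) << 13;
    uint32_t u    = (exp << 23) | mant;
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Reciprocal square root approximation on the raw binary16 bits:
 * one shift and one subtraction, hence combinational-logic friendly.
 * 0x59BB is a generic magic constant for binary16 (illustrative,
 * not taken from the paper); no Newton refinement step is applied. */
static uint16_t half_rsqrt_approx(uint16_t x_bits)
{
    return (uint16_t)(0x59BBu - (x_bits >> 1));
}

int main(void)
{
    for (float x = 0.25f; x <= 16.0f; x *= 2.0f) {
        float approx = half_bits_to_float(
            half_rsqrt_approx(float_to_half_bits(x)));
        printf("x=%6.2f  approx=%.4f  exact=%.4f\n",
               x, approx, 1.0f / sqrtf(x));
    }
    return 0;
}
```

Run over a few powers of two, this kind of single-subtraction estimate lands within a few per cent of 1/sqrt(x), which is consistent with the claim that the operation reduces to simple combinational logic rather than an iterative datapath.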
