Abstract

Many applications, such as machine learning and sensor data analysis, are statistical in nature and can tolerate some inaccuracy in their computation. Approximate computing is a viable method to save energy and increase performance by controllably trading accuracy for energy savings. In this paper, we propose a tiered approximate floating point multiplier, called CFPU, which significantly reduces energy consumption and improves the performance of multiplication at a slight cost in accuracy. The floating point multiplication is approximated by replacing the costly mantissa multiplication step with lower-energy alternatives. Each operation is processed in one of three modes, depending on the accuracy requirements: a basic approximate mode, an intermediate approximate mode, or the exact hardware. We evaluate the efficiency of the proposed CFPU on a wide range of applications, including twelve general OpenCL benchmarks and three machine learning applications. Our results show that using the first CFPU approximation mode yields a $3.5\times$ energy-delay product (EDP) improvement over a GPU using traditional floating point units (FPUs), while keeping the average relative error below 10%. Adding the second mode further increases the EDP improvement to $4.1\times$ over an unmodified FPU, again for less than 10% error. In addition, the proposed CFPU achieves a $2.8\times$ EDP improvement for multiply operations compared to state-of-the-art approximate multipliers.
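
To make the mantissa-replacement idea concrete, the minimal C sketch below mimics the flavor of the basic approximate mode in software: the sign bits are XORed and the exponents are added exactly, while the mantissa multiplication is skipped entirely and one operand's mantissa is copied to the result. This is an illustrative assumption about one possible "lower-energy alternative", not the paper's hardware design; the function name approx_fmul and the fallback policy are ours, and the paper's per-operation tuning logic for choosing between modes is omitted.

#include <stdint.h>
#include <string.h>
#include <stdio.h>

/* Illustrative software sketch of a basic approximate float multiply:
 * the mantissa multiply is skipped, and one operand's mantissa is copied
 * to the result while signs and exponents are handled exactly. Names and
 * fallback policy are our assumptions, not the paper's hardware design. */
static float approx_fmul(float a, float b)
{
    uint32_t ua, ub;
    memcpy(&ua, &a, sizeof ua);   /* reinterpret IEEE 754 bit patterns */
    memcpy(&ub, &b, sizeof ub);

    uint32_t ea = (ua >> 23) & 0xFFu;
    uint32_t eb = (ub >> 23) & 0xFFu;
    if (ea == 0u || ea == 0xFFu || eb == 0u || eb == 0xFFu)
        return a * b;             /* zeros, denormals, inf, NaN: exact path */

    int32_t exp = (int32_t)ea + (int32_t)eb - 127;  /* add exponents, re-bias */
    if (exp <= 0 || exp >= 255)
        return a * b;             /* result outside normal range: exact path */

    uint32_t sign = (ua ^ ub) & 0x80000000u;  /* sign is computed exactly   */
    uint32_t man  = ub & 0x007FFFFFu;         /* copy b's mantissa: no multiply */

    uint32_t ur = sign | ((uint32_t)exp << 23) | man;
    float r;
    memcpy(&r, &ur, sizeof r);
    return r;
}

int main(void)
{
    printf("exact:  %f\n", 1.10f * 2.50f);             /* prints 2.750000 */
    printf("approx: %f\n", approx_fmul(1.10f, 2.50f)); /* prints 2.500000, ~9% error */
    return 0;
}

Because the copied mantissa simply drops the other operand's fractional contribution, the relative error of this sketch is bounded by $m_a/(1+m_a)$, where $m_a$ is the discarded fraction; a tiered design such as the one the abstract describes falls back to an intermediate mode or the exact hardware when the accuracy requirement would otherwise be violated.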
