Abstract

Nonlinear functions are widely used as activation functions in artificial neural networks and have a great impact on their fitting ability. Because these functions are complex, computing the activation function and its derivative consumes substantial computing resources and time during training. To improve the computational efficiency of activation-function derivatives in the back-propagation of artificial neural networks, this paper proposes a method based on piecewise linear approximation for computing the derivative of the activation function. The method is hardware-friendly and universal: it can efficiently compute various nonlinear activation functions in neural-network hardware accelerators. We use least squares to improve a piecewise linear approximation method that controls the absolute error while requiring fewer segments or yielding a smaller average error, which in turn means fewer hardware resources. We apply this method to obtain a piecewise linear approximation of either the activation function itself or its derivative. Both approximated forms are substituted into a multilayer perceptron for binary classification experiments to verify the effectiveness of the proposed method. Experimental results show that the same or even slightly higher classification accuracy can be achieved with this method, while the computation time of back-propagation is reduced by 4–6% compared with computing the derivative directly from the function expression using the operators encapsulated in PyTorch. This indicates that the proposed method provides an efficient way to handle nonlinear activation functions in hardware acceleration of neural networks.
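
To make the idea concrete, the sketch below is a minimal illustration of a least-squares piecewise linear approximation under an absolute error bound; it is not the paper's exact algorithm. The greedy segmentation strategy, the choice of the sigmoid derivative as the target function, the sampling grid, and the error bound of 1e-3 are all assumptions introduced here for illustration.

```python
# A minimal sketch of piecewise linear approximation with least-squares fits.
# Each segment is grown until its maximum absolute error would exceed a bound.
# The target function (sigmoid derivative), grid, and bound are illustrative.
import numpy as np

def sigmoid_derivative(x):
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)

def piecewise_linear_fit(f, lo, hi, max_abs_err=1e-3, n_samples=4096):
    """Greedily split [lo, hi] into segments, fitting each by least squares."""
    x = np.linspace(lo, hi, n_samples)
    y = f(x)
    segments = []  # list of (x_start, x_end, slope, intercept)
    start = 0
    while start < n_samples - 1:
        end = start + 1
        best = None
        while end < n_samples:
            xs, ys = x[start:end + 1], y[start:end + 1]
            slope, intercept = np.polyfit(xs, ys, 1)  # least-squares line
            err = np.max(np.abs(slope * xs + intercept - ys))
            # Always accept the first two-point fit so the loop makes progress.
            if err > max_abs_err and best is not None:
                break
            best = (x[start], x[end], slope, intercept)
            end += 1
        segments.append(best)
        start = end - 1  # next segment starts where this one ended
    return segments

def evaluate(segments, x):
    """Evaluate the piecewise linear approximation at a scalar x."""
    for x0, x1, a, b in segments:
        if x0 <= x <= x1:
            return a * x + b
    # Outside the fitted range: extend the nearest boundary segment.
    x0, x1, a, b = segments[0] if x < segments[0][0] else segments[-1]
    return a * x + b

if __name__ == "__main__":
    segs = piecewise_linear_fit(sigmoid_derivative, -8.0, 8.0, max_abs_err=1e-3)
    print(f"{len(segs)} segments, f'(1.0) ~ {evaluate(segs, 1.0):.4f} "
          f"(exact {sigmoid_derivative(1.0):.4f})")
```

In a hardware or back-propagation setting, the resulting `(slope, intercept)` pairs would be stored in a lookup table so that evaluating the derivative reduces to one comparison-based segment lookup plus a multiply-add, which is the source of the efficiency gain the abstract describes.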
