Abstract

Despite the powerful expressivity of neural networks with nonlinear activation functions, the underlying mechanism of deep neural networks remains unclear. However, it can be proved that ultra-wide neural networks are equivalent to Gaussian processes, connecting the analysis of neural networks with Bayesian statistics and kernel methods. Moreover, recent studies on infinitely wide neural networks extend this correspondence to a specific kernel, named the Neural Tangent Kernel (NTK), which governs the learning dynamics of the associated networks. Independently of the particular values of the weights and biases, the NTK recursively encodes the architectural information of the corresponding neural network, including the activation function at each hidden layer. Inspired by this close relationship between Gaussian processes and neural networks, we propose a heuristic search method for the activation functions of sufficiently wide neural networks in the NTK regime. To obtain an elegant, closed-form computation, activation functions are decomposed in the basis of Hermite polynomials, which converts the Gaussian-process kernels into power series. Experiments show that the obtained nonlinearities outperform other common activation functions. This work also reveals the potential of NTKs to guide neural network structure search in the future.
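As a sketch of the conversion alluded to above (the specific normalization is an assumption here, not stated in the abstract): if $h_n$ denotes the probabilists' Hermite polynomials normalized to be orthonormal under the standard Gaussian measure, and the preactivations fed to $\sigma$ are jointly standard Gaussian with correlation $\rho$, then the classical Mehler/Hermite identity turns the Gaussian-process kernel into a power series in $\rho$:

\[
\sigma(x) \;=\; \sum_{n=0}^{\infty} a_n\, h_n(x)
\quad\Longrightarrow\quad
\mathbb{E}\big[\sigma(X)\,\sigma(Y)\big] \;=\; \sum_{n=0}^{\infty} a_n^{2}\,\rho^{\,n},
\qquad \mathbb{E}[XY] = \rho,
\]

so the kernel at each layer, and hence the NTK built from it, is determined by the Hermite coefficients $(a_n)$ of the activation function.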
