We investigate how the activation function can be used to describe neural firing in an abstract way and, in turn, why it works well in artificial neural networks. We discuss how a spike in a biological neuron belongs to a particular universality class of phase transitions in statistical physics. We then show that the artificial neuron is, mathematically, a mean-field model of biological neural membrane dynamics, which arises from modeling spiking as a phase transition. This allows us to treat selective neural firing in an abstract way and to formalize the role of the activation function in perceptron learning. The resulting statistical-physical model recovers the expressions for some known activation functions as special cases. In addition to deriving this model and specifying the analogous neural case, we analyze the phase transition to understand the physics of neural network learning. Taken together, these results show that typical activation functions have not only a biological meaning but also a physical justification for their emergence and performance; implications for neural learning and inference are also discussed.