Abstract

Deep-learning models have become pervasive tools in science and engineering. However, their energy requirements now increasingly limit their scalability[1]. Deep-learning accelerators[2–9] aim to perform deep learning energy-efficiently, usually targeting the inference phase and often by exploiting physical substrates beyond conventional electronics. Approaches so far[10–22] have been unable to apply the backpropagation algorithm to train unconventional novel hardware in situ. The advantages of backpropagation have made it the de facto training method for large-scale neural networks, so this deficiency constitutes a major impediment. Here we introduce a hybrid in situ–in silico algorithm, called physics-aware training, that applies backpropagation to train controllable physical systems. Just as deep learning realizes computations with deep neural networks made from layers of mathematical functions, our approach allows us to train deep physical neural networks made from layers of controllable physical systems, even when the physical layers lack any mathematical isomorphism to conventional artificial neural network layers. To demonstrate the universality of our approach, we train diverse physical neural networks based on optics, mechanics and electronics to experimentally perform audio and image classification tasks. Physics-aware training combines the scalability of backpropagation with the automatic mitigation of imperfections and noise achievable with in situ algorithms. Physical neural networks have the potential to perform machine learning faster and more energy-efficiently than conventional electronic processors and, more broadly, can endow physical systems with automatically designed physical functionalities, for example, for robotics[23–26], materials[27–29] and smart sensors[30–32].

Highlights

  • The emerging DNN energy problem has inspired special-purpose hardware: DNN ‘accelerators’[2,3,4,5,6,7,8], most of which are based on a direct mathematical isomorphism between the hardware physics and the mathematical operations in DNNs (Fig. 1a, b).

  • Input data and parameters are encoded in the input pulses’ spectra, and outputs are obtained from the frequency-doubled pulses’ spectra. Like DNNs constructed from sequences of trainable nonlinear mathematical functions, we construct deep physical neural networks (PNNs) from sequences of trainable physical transformations (Fig. 1d); a toy sketch of one such physical layer follows this list.

  • PNNs are well motivated for DNN-like calculations, much more so than for digital logic or even other forms of analogue computation. As expected from DNNs’ robust processing of natural data, DNNs and physical processes share numerous structural similarities, such as hierarchy, approximate symmetries, noise, redundancy and nonlinearity[36]. Physical processes perform transformations that are effectively equivalent to approximations, variants and/or combinations of the mathematical operations commonly used in DNNs, such as convolutions, nonlinearities and matrix-vector multiplications.
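To make this concrete, here is a minimal, illustrative sketch of one such trainable physical transformation: a toy, differentiable stand-in for the broadband second-harmonic-generation (SHG) layer described above, in which data and parameters are encoded side by side in an input pulse spectrum and the output is read from the spectrum of the frequency-doubled pulse. The function `shg_layer`, the concatenation-based encoding and the simple quadratic time-domain nonlinearity are assumptions for illustration, not the authors' experimental apparatus or model.

```python
import torch

def shg_layer(data: torch.Tensor, params: torch.Tensor) -> torch.Tensor:
    """Toy stand-in for one optical SHG transformation (illustrative assumption)."""
    # Encode the data and the trainable parameters side by side in the pulse spectrum.
    spectrum_real = torch.cat([data, params], dim=-1)
    spectrum = torch.complex(spectrum_real, torch.zeros_like(spectrum_real))
    field_t = torch.fft.ifft(spectrum)    # synthesize the time-domain pulse
    shg_t = field_t ** 2                  # quadratic nonlinearity: frequency doubling
    out_spectrum = torch.fft.fft(shg_t)   # spectrum of the frequency-doubled pulse
    return out_spectrum.abs()             # measured output: spectral amplitudes

# Usage sketch: the toy layer is written with differentiable tensor operations,
# so it can sit inside a larger model and be composed into a sequence of layers.
x = torch.rand(32)                          # input features (e.g. downsampled audio)
theta = torch.rand(32, requires_grad=True)  # trainable spectral parameters
y = shg_layer(x, theta)                     # output vector of length 64
```

Stacking several such calls, each with its own parameters, mirrors how a deep PNN is composed from a sequence of trainable physical transformations.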


Summary

Physics-aware training

To train the PNNs’ parameters using backpropagation, we use PAT (Fig. 3). Automatic differentiation determines the gradient of a loss function with respect to the trainable parameters. The key component of PAT is the use of mismatched forward and backward passes in executing the backpropagation algorithm: the forward pass is performed by the physical system itself, while the backward pass uses a differentiable digital model of that system to estimate the gradients. Mismatched passes are well known in neuromorphic computing[48,49,50,51,52,53], appearing recently in direct feedback alignment[52] and quantization-aware training[48], which inspired PAT. The parameters are updated according to the inferred gradient, and this process is repeated, iterating over training examples, to reduce the error.
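The following is a minimal sketch of how PAT's mismatched passes can be expressed in code, assuming a PyTorch-style setup. The names `physical_system` (the hardware forward pass, emulated here by a noisy function) and `digital_model` (a differentiable digital twin) are hypothetical placeholders for illustration; this is not the authors' released implementation.

```python
import torch

class PhysicalLayer(torch.autograd.Function):
    """Forward pass on the physical system; backward pass through the digital model."""

    @staticmethod
    def forward(ctx, x, params, physical_system, digital_model):
        ctx.save_for_backward(x, params)
        ctx.digital_model = digital_model
        # Forward pass: the controllable physical system transforms the inputs.
        return physical_system(x, params)

    @staticmethod
    def backward(ctx, grad_output):
        x, params = ctx.saved_tensors
        # Backward pass: estimate gradients with the differentiable digital model.
        with torch.enable_grad():
            x_ = x.detach().requires_grad_()
            p_ = params.detach().requires_grad_()
            y_model = ctx.digital_model(x_, p_)
            grad_x, grad_p = torch.autograd.grad(y_model, (x_, p_), grad_outputs=grad_output)
        return grad_x, grad_p, None, None  # no gradients for the two callables


# Hypothetical stand-ins: a noisy, imperfect "physical" system and its digital twin.
def physical_system(x, p):
    return torch.tanh(x * p) + 0.01 * torch.randn_like(x)

def digital_model(x, p):
    return torch.tanh(x * p)

# Training loop sketch: update parameters from the inferred gradients,
# iterating over training examples to reduce the error.
params = torch.randn(64, requires_grad=True)
opt = torch.optim.SGD([params], lr=1e-2)
data = [(torch.randn(64), torch.randn(64)) for _ in range(8)]  # toy (input, target) pairs
for x, target in data:
    y = PhysicalLayer.apply(x, params, physical_system, digital_model)
    loss = torch.nn.functional.mse_loss(y, target)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because every parameter update is computed from the output of the real (here, simulated-noisy) forward pass, training automatically absorbs the system's imperfections and noise, which is the property the text attributes to in situ algorithms.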

(Figure: final test accuracy of the trained PNNs compared with the digital baseline.)
