Abstract

We present a new family of neural networks based on the Schrödinger equation (SE-NET). In this analogy, the trainable weights of the neural network correspond to the physical quantities of the Schrödinger equation. These physical quantities can be trained using the complex-valued adjoint method. Since the propagation of the SE-NET can be described by the evolution of a physical system, its outputs can be computed with a physical solver, and the trained network is transferable to actual optical systems. As a demonstration, we implemented the SE-NET with the Crank-Nicolson finite difference method on PyTorch. Our numerical simulations show that the performance of the SE-NET improves as the network becomes wider and deeper. However, training became unstable due to gradient explosion as the SE-NET grew deeper. We therefore introduce phase-only training, which updates only the phase of the potential field (refractive index) in the Schrödinger equation. This enables stable training even for deep SE-NET models because the unitarity of the system is preserved during training. In addition, the SE-NET enables joint optimization of physical structures and digital neural networks. As a demonstration, we numerically performed end-to-end machine learning (ML) with an optical frontend toward a compact spectrometer. Our results extend the application field of ML to hybrid physical-digital optimization.
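As a rough sketch of the phase-only training idea (the class name and split-step parameterization below are hypothetical illustrations, not the paper's implementation), a propagation step that applies exp(-iV·Δz) with a real-valued trainable potential V is unitary by construction, so the field norm cannot blow up regardless of depth:

```python
import torch
import torch.nn as nn

class PhaseOnlyStep(nn.Module):
    """Hypothetical split-step layer that trains only the real potential V.

    Because exp(-1j * V * dz) has unit modulus for real V, each step is
    unitary and the field norm |psi| is preserved, which is the property
    that stabilizes training of deep SE-NET models.
    """
    def __init__(self, width, dz=0.1):
        super().__init__()
        self.v = nn.Parameter(torch.zeros(width))  # real potential (trainable)
        self.dz = dz

    def forward(self, psi):  # psi: complex tensor, shape (..., width)
        return psi * torch.exp(-1j * self.v * self.dz)
```

Gradients with respect to the real parameter `v` flow through PyTorch's complex autograd, so such a layer can be dropped into an ordinary training loop.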

Highlights

  • Machine learning (ML) based on artificial deep neural networks (DNNs) has a remarkable ability to learn and generalize from data

  • We propose a new building block for neural networks based on the Schrödinger equation (SE-NET) as the first demonstration of a physical neural ordinary differential equation (ODE) [Fig. 1(a)]

  • To compute the z-axis evolution of the SE-NET numerically, we implemented a finite difference beam propagation method (FD-BPM) solver with the Crank–Nicolson method on the PyTorch framework; a sketch of one such update step follows this list
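As an illustration of such a solver step (the function name, signature, and dense linear solve below are our assumptions for brevity, not the paper's code), one Crank–Nicolson update of the paraxial Schrödinger-type equation i ∂ψ/∂z = -(1/2k₀) ∂²ψ/∂x² + V(x)ψ could be sketched in PyTorch as:

```python
import torch

def crank_nicolson_step(psi, potential, dz, dx, k0):
    """One Crank-Nicolson update of the paraxial equation
    i dpsi/dz = -(1/(2*k0)) d^2 psi/dx^2 + V(x) psi,
    i.e. solve (I + i*dz/2*H) psi_next = (I - i*dz/2*H) psi.
    A dense solve is used here for clarity; a real FD-BPM solver
    would exploit the tridiagonal structure of H.
    """
    n = psi.shape[-1]
    off = torch.ones(n - 1)
    # 3-point finite-difference Laplacian along x.
    lap = (torch.diag(off, -1) - 2.0 * torch.eye(n) + torch.diag(off, 1)) / dx**2
    H = (-lap / (2.0 * k0) + torch.diag(potential)).to(torch.complex128)
    eye = torch.eye(n, dtype=torch.complex128)
    A = eye + 0.5j * dz * H
    B = eye - 0.5j * dz * H
    return torch.linalg.solve(A, B @ psi.to(torch.complex128))
```

Because every operation is differentiable, PyTorch's autograd can backpropagate through repeated applications of this step, which is what allows the potential field to be trained.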


Summary

INTRODUCTION

Machine learning (ML) based on artificial deep neural networks (DNNs) has a remarkable ability to learn and generalize from data. Yildiz et al. [29] have shown that it is possible to model the residual layer of ResNet as a continuous ordinary differential equation (ODE). Such networks are called neural ODE networks (neural ODEs or ODE-Nets), and they offer improved memory efficiency and performance comparable to that of regular DNNs. Interestingly, ODE-Nets can be trained by the adjoint sensitivity method using a standard ODE solver, which is commonly used for the inverse design of physical systems [30], [31]. It has also been shown that high-order ODEs [29] or partial differential equations (PDEs) can be considered as models for neural networks [32].
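A minimal sketch of this idea (using the third-party torchdiffeq package; the two-layer right-hand side and the unit integration interval are our choices for illustration, not the paper's):

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint_adjoint as odeint  # pip install torchdiffeq

class ODEFunc(nn.Module):
    """Right-hand side f(t, y) of the ODE dy/dt = f(t, y)."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(),
                                 nn.Linear(dim, dim))

    def forward(self, t, y):
        return self.net(y)

# An ODE-Net "layer": integrate the state from t=0 to t=1.
# Gradients are computed by the adjoint sensitivity method, so memory
# cost does not grow with the number of solver steps.
func = ODEFunc(dim=16)
y0 = torch.randn(8, 16)          # batch of input states
t = torch.tensor([0.0, 1.0])
y1 = odeint(func, y0, t)[-1]     # output state at t=1
```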

Theory
Application to Optics
Image Classification
End-to-End Compressed Sensing Spectroscopy
DISCUSSION
Neural ODE
Complex-Valued Neural Network
Quantum ML
Dynamics-Inspired Optimization Solver
CONCLUSION