Photonic reinforcement learning based on optoelectronic reservoir computing

Kazutaka Kanno, Atsushi Uchida

doi:10.1038/s41598-022-07404-z

Open Access

https://doi.org/10.1038/s41598-022-07404-z

Journal: Scientific Reports	Publication Date: Mar 8, 2022
Citations: 10	License type: open-access

Affiliation: Saitama University

Abstract

Reinforcement learning has been intensively investigated and developed in artificial intelligence for applications in which training data are unavailable, such as autonomous driving vehicles, robot control, internet advertising, and elastic optical networks. However, the computational cost of reinforcement learning with deep neural networks is extremely high, and reducing the learning cost is a challenging issue. We propose a photonic on-line implementation of reinforcement learning using optoelectronic delay-based reservoir computing, and demonstrate it both experimentally and numerically. In the proposed scheme, reinforcement learning is accelerated to a rate of several megahertz because reservoir computing requires no learning process for the internal connection weights. We evaluate the proposed scheme on two benchmark tasks, CartPole-v0 and MountainCar-v0. Our results represent the first hardware implementation of reinforcement learning based on photonic reservoir computing and pave the way for fast and efficient reinforcement learning as a novel photonic accelerator.

Highlights

Reinforcement learning has been intensively investigated and developed in artificial intelligence for applications in which training data are unavailable, such as autonomous driving vehicles, robot control, internet advertising, and elastic optical networks
Reinforcement learning is a machine learning scheme that trains an action policy to maximize the total reward in a particular situation or environment[5]
Various applications have been studied for reinforcement learning, such as autonomous driving vehicles[6], robot control[7], communication security[8], and elastic optical networks[9]

Summary

Introduction

Reinforcement learning has been intensively investigated and developed in artificial intelligence for applications in which training data are unavailable, such as autonomous driving vehicles, robot control, internet advertising, and elastic optical networks. An algorithm based on a deep neural network (Agent57) was proposed in 2020[10]. This scheme achieved a score above the human baseline on all 57 Atari 2600 games. Learning the connection weights of deep neural networks using reinforcement learning entails high computation costs because the network weights are trained repeatedly on vast amounts of playing data[14,15]. This fact indicates that a large number of learning parameters is needed to improve the performance of deep neural networks, a property known as overparameterization[15–17]. The photonic implementation of reservoir computing, based on the idea of photonic accelerators[29], can realize fast information processing with low learning costs[30–35].
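
The low learning cost of reservoir computing comes from leaving the internal connection weights untrained and learning only an output layer. The following Python sketch illustrates that idea in software and is not the authors' optoelectronic implementation. It assumes a generic discrete-time reservoir with a tanh nonlinearity in place of the delay-based photonic system, and a Q-learning-style update for the readout, which this summary does not specify. The class name `ReservoirQ`, the layer sizes, the weight ranges, and the values of `lr` and `gamma` are all hypothetical choices made for illustration.

```python
import math
import random


class ReservoirQ:
    """Toy reservoir: fixed random internal weights, trainable linear readout."""

    def __init__(self, n_inputs, n_nodes, n_actions, seed=0):
        rng = random.Random(seed)
        # Input and internal weights are drawn once and never updated.
        self.w_in = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
                     for _ in range(n_nodes)]
        self.w_res = [[rng.uniform(-0.1, 0.1) for _ in range(n_nodes)]
                      for _ in range(n_nodes)]
        # Only the readout weights (one row per action) are learned.
        self.w_out = [[0.0] * n_nodes for _ in range(n_actions)]
        self.state = [0.0] * n_nodes

    def step(self, obs):
        """Advance the reservoir by one time step and return the node states."""
        new_state = []
        for i in range(len(self.state)):
            drive = sum(w * x for w, x in zip(self.w_in[i], obs))
            drive += sum(w * s for w, s in zip(self.w_res[i], self.state))
            new_state.append(math.tanh(drive))
        self.state = new_state
        return new_state

    def q_values(self, state):
        """Linear readout: one estimated action value per action."""
        return [sum(w * s for w, s in zip(row, state)) for row in self.w_out]

    def update(self, state, action, reward, next_state, done,
               lr=0.1, gamma=0.9):
        """One Q-learning step applied to the readout weights only."""
        if done:
            target = reward
        else:
            target = reward + gamma * max(self.q_values(next_state))
        td_error = target - self.q_values(state)[action]
        row = self.w_out[action]
        for i, s in enumerate(state):
            row[i] += lr * td_error * s
        return td_error


# Hypothetical usage with a 4-value observation and 2 possible actions.
agent = ReservoirQ(n_inputs=4, n_nodes=20, n_actions=2)
s0 = agent.step([0.1, 0.0, -0.05, 0.0])
s1 = agent.step([0.1, 0.1, -0.04, 0.0])
agent.update(s0, action=1, reward=1.0, next_state=s1, done=False)
```

Each update touches a single row of `w_out`, so the cost per learning step scales with the number of reservoir nodes rather than with the number of internal connections.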

Methods

Results

Conclusion