Abstract

Reinforcement learning involves decision making in dynamic and uncertain environments and constitutes an important element of artificial intelligence (AI). In this work, we experimentally demonstrate that the ultrafast chaotic oscillatory dynamics of lasers efficiently solve the multi-armed bandit (MAB) problem, which requires decision making concerning a class of difficult trade-offs called the exploration–exploitation dilemma. To solve the MAB problem, a certain degree of randomness is required for exploration purposes. However, pseudorandom numbers generated using conventional electronic circuitry encounter severe limitations in terms of their data rate and the quality of randomness due to their algorithmic foundations. We generate laser chaos signals using a semiconductor laser sampled at a maximum rate of 100 GSample/s, and combine them with a simple decision-making principle called tug of war with a variable threshold, to ensure ultrafast, adaptive, and accurate decision making at a maximum adaptation speed of 1 GHz. We found that decision-making performance was maximized with an optimal sampling interval, and we highlight the exact coincidence between the negative autocorrelation inherent in laser chaos and decision-making performance. This study paves the way for a new realm of ultrafast photonics in the age of AI, where the ultrahigh bandwidth of light waves can provide new value.

Highlights

  • Reinforcement learning involves decision making in dynamic and uncertain environments and constitutes an important element of artificial intelligence (AI)

  • This paper experimentally demonstrates the usefulness of ultrafast chaotic oscillatory dynamics in semiconductor lasers for reinforcement learning, which is among the most important elements in machine learning

  • We experimentally established that laser chaos provides ultrafast reinforcement learning and decision making


Summary

Introduction

Reinforcement learning involves decision making in dynamic and uncertain environments and constitutes an important element of artificial intelligence (AI). New photonic processing principles have recently emerged to solve complex time-series prediction problems[2,3,4], as well as problems in spatiotemporal dynamics[5] and combinatorial optimization[6], coinciding with the rapid shift to the age of AI. These novel approaches exploit the ultrahigh bandwidth of light waves and their enabling device technologies[2,3,6]. The intelligence of slime moulds or amoebae, single-cell natural organisms, has been used in solution searches, whereby complex intercellular spatiotemporal dynamics play a key role[21]. This has stimulated the subsequent discovery of a new decision-making principle called tug of war (TOW), invented by Kim et al.[22,23]. The name TOW is a metaphor for a nonlocal correlation that accommodates fluctuation, which enhances decision-making performance[23].
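The tug-of-war principle with a variable threshold can be illustrated for a two-armed bandit with a minimal sketch. This is an illustration under stated assumptions, not the experimental implementation: a Gaussian pseudorandom number stands in for the sampled laser chaos signal, and the update rule, the function name tow_two_armed_bandit, and the parameters alpha, delta, omega, k and n_levels are hypothetical choices made for this example. A signal sample is compared with a threshold; the comparison selects an arm, and the outcome shifts the threshold toward the arm that was rewarded.

```python
import random


def tow_two_armed_bandit(p0, p1, steps=2000, alpha=0.99, delta=1.0,
                         omega=1.0, k=0.5, n_levels=8, seed=0):
    """Illustrative tug-of-war decision maker with a variable threshold.

    All parameter names and default values are assumptions for this sketch.
    p0 and p1 are the reward probabilities of arm 0 and arm 1.
    """
    rng = random.Random(seed)
    adjuster = 0.0          # internal state that drives the threshold
    picks = [0, 0]          # number of times each arm was selected
    wins = 0                # total number of rewards collected

    for _ in range(steps):
        # Quantize and clamp the adjuster, then scale it into a threshold.
        level = max(-n_levels, min(n_levels, round(adjuster)))
        threshold = k * level

        # Stand-in for one sample of the chaotic laser intensity
        # (assumption: zero-mean, unit-variance Gaussian noise).
        signal = rng.gauss(0.0, 1.0)

        # Signal at or above the threshold selects arm 0, otherwise arm 1.
        arm = 0 if signal >= threshold else 1
        win = rng.random() < (p0, p1)[arm]

        # A reward on arm 0 lowers the threshold (arm 0 becomes more likely);
        # a loss raises it. The signs are mirrored for arm 1.
        step = -delta if win else omega
        if arm == 1:
            step = -step
        adjuster = alpha * adjuster + step   # alpha < 1 forgets old outcomes

        picks[arm] += 1
        wins += win

    return picks, wins


if __name__ == "__main__":
    picks, wins = tow_two_armed_bandit(0.8, 0.2)
    print("selections per arm:", picks, "total rewards:", wins)
```

In this sketch the forgetting factor alpha lets the threshold follow changes in the reward probabilities, and the penalty omega is held fixed for simplicity. A fixed penalty can favour the wrong arm when both reward probabilities are high, so in practice it would need to be chosen or estimated with care.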
