Abstract

Reinforcement learning involves decision-making in dynamic and uncertain environments and constitutes a crucial element of artificial intelligence. In our previous work, we experimentally demonstrated that the ultrafast chaotic oscillatory dynamics of lasers can be used to efficiently solve the two-armed bandit problem, which requires decision-making under a difficult trade-off known as the exploration–exploitation dilemma. However, only two selections were considered in that research; hence, the scalability of laser-chaos-based reinforcement learning remained to be clarified. In this study, we demonstrate a scalable, pipelined principle for resolving the multi-armed bandit problem by introducing time-division multiplexing of chaotically oscillating ultrafast time series. We present experimental demonstrations in which bandit problems with up to 64 arms were successfully solved, and in which laser chaos time series significantly outperform quasiperiodic signals, computer-generated pseudorandom numbers, and coloured noise. We also provide detailed analyses, including performance comparisons among laser chaos signals generated under different physical conditions; the observed differences coincide with the diffusivity inherent in the time series. This study paves the way for ultrafast reinforcement learning by taking advantage of the ultrahigh bandwidth of light waves and practical enabling technologies.

Highlights

  • The use of photonics for information processing and artificial intelligence has been intensively studied by exploiting the unique physical attributes of photons

  • We proposed a scalable principle of ultrafast reinforcement learning or decision-making using chaotic time series generated by a laser

  • We experimentally demonstrated that multi-armed bandit problems with N = 2^M arms can be successfully solved using M points of signal sampling from the laser chaos and comparison to multiple thresholds
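The mapping from M sampled values to one of 2^M arms can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes each sample yields one bit of the arm index, and that the threshold used for a given bit depends on the bits already decided. The function name `select_arm`, the dictionary layout of `thresholds`, and the use of uniform random numbers as a stand-in for laser chaos samples are all hypothetical choices made for this example.

```python
import random


def select_arm(samples, thresholds):
    """Map M samples to an arm index in [0, 2**M) via M threshold comparisons.

    Assumption (not from the source): thresholds are keyed by the bit prefix
    decided so far ("" for the first bit, "0" or "1" for the second, ...).
    A missing key falls back to a threshold of 0.0.
    """
    prefix = ""
    for sample in samples:
        threshold = thresholds.get(prefix, 0.0)
        # One sample decides one bit of the arm index.
        prefix += "1" if sample > threshold else "0"
    return int(prefix, 2)


M = 6  # 2**6 = 64 arms, the largest case mentioned in the abstract
rng = random.Random(0)
# Stand-in signal only: a real experiment would use sampled laser chaos values.
samples = [rng.uniform(-1.0, 1.0) for _ in range(M)]
arm = select_arm(samples, {})
print(f"selected arm {arm} of {2 ** M}")
```

With all thresholds at their default, the samples `[1.0, -1.0, 1.0]` produce the bits `101`, which is arm 5. Adapting the thresholds over time is what would bias the selection toward better arms; that learning step is not shown here.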



Introduction

The use of photonics for information processing and artificial intelligence has been intensively studied by exploiting the unique physical attributes of photons. Recent examples include a coherent Ising machine for combinatorial optimization, photonic reservoir computing for complex time-series prediction, and ultrafast random number generation using chaotic dynamics in lasers; in each of these, the ultrahigh bandwidth of light brings novel advantages. Ultrafast, adaptive, and accurate decision-making has also been demonstrated using a chaotic time series generated by a semiconductor laser with delayed feedback, sampled at a maximum rate of 100 GSample/s and digitized with a variable threshold. Detailed insights into the relation between the resulting decision-making ability and the properties of the chaotic signal trains should be pursued, both to achieve a deeper physical understanding and to optimize performance at the physical or photonic device level.
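To make the idea of digitizing a signal with a variable threshold concrete, the following sketch simulates a two-armed bandit. It is an illustration under stated assumptions rather than the published method: the threshold update rule, the step size `delta`, the reward probabilities `p`, and the function name `run_two_armed_bandit` are hypothetical, and uniform pseudorandom numbers stand in for the sampled laser chaos signal.

```python
import random


def run_two_armed_bandit(p=(0.3, 0.7), steps=2000, delta=0.05, seed=1):
    """Simulate threshold-based decisions on a two-armed bandit.

    Arm 1 is chosen when the signal exceeds the threshold, arm 0 otherwise.
    Hypothetical update rule: shift the threshold so that the arm suggested
    by the latest outcome becomes more likely to be chosen next time.
    """
    rng = random.Random(seed)
    threshold = 0.0
    picks = [0, 0]
    for _ in range(steps):
        signal = rng.uniform(-1.0, 1.0)  # stand-in for a laser chaos sample
        arm = 1 if signal > threshold else 0
        rewarded = rng.random() < p[arm]
        # Evidence favours arm 1 if arm 1 was rewarded or arm 0 was not.
        favour_arm1 = (arm == 1) == rewarded
        # Lowering the threshold makes arm 1 more likely, raising it less so.
        threshold += -delta if favour_arm1 else delta
        threshold = max(-1.0, min(1.0, threshold))  # keep within signal range
        picks[arm] += 1
    return picks, threshold


picks, threshold = run_two_armed_bandit()
print(f"picks per arm: {picks}, final threshold: {threshold:.2f}")
```

In this toy setting the threshold drifts toward the lower end of the signal range, so the arm with the higher reward probability is selected most of the time while the other arm is still tried occasionally.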

