A Reconfigurable Two‐WSe2‐Transistor Synaptic Cell for Reinforcement Learning

Yue Zhou, Zijian Tang, Yuhui He, Yang Chai, Yi Li, Jianmiao Guo, Jingli Wang, Yasai Wang, Sijie Ma, Xiangshui Miao, Fuwei Zhuge

doi:10.1002/adma.202107754

Abstract

Reward-modulated spike-timing-dependent plasticity (R-STDP) is a brain-inspired reinforcement learning (RL) rule, exhibiting potential for decision-making tasks and artificial general intelligence. However, the hardware implementation of the reward-modulation process in R-STDP usually requires complicated Si complementary metal-oxide-semiconductor (CMOS) circuit design that causes high power consumption and a large footprint. Here, a design with two synaptic transistors (2T) connected in a parallel structure is experimentally demonstrated. The 2T unit based on WSe2 ferroelectric transistors exhibits reconfigurable polarity behavior, where one channel can be tuned as n-type and the other as p-type due to nonvolatile ferroelectric polarization. In this way, opposite synaptic weight update behaviors with multilevel (>6 bit) conductance states, ultralow nonlinearity (0.56/-1.23), and a large Gmax/Gmin ratio of 30 are realized. By applying positive/negative reward to the (anti-)STDP component of the 2T cell, R-STDP learning rules are realized for training the spiking neural network and demonstrated to solve the classical cart-pole problem, exhibiting a way for realizing a low-power (32 pJ per forward process) and highly area-efficient (100 µm²) hardware chip for reinforcement learning.
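
The reward-modulated update described in the abstract can be illustrated with a minimal software sketch. It is not the authors' device model or training code: the exponential pair-based STDP window, the amplitudes `A_PLUS` and `A_MINUS`, the time constant `TAU_MS`, and the weight bounds are all assumed values chosen for illustration. The only idea taken from the abstract is that a positive reward applies an STDP-like change and a negative reward applies the opposite (anti-STDP) change.

```python
import math

# Assumed illustrative constants; none of these come from the paper.
A_PLUS = 0.010    # potentiation amplitude
A_MINUS = 0.012   # depression amplitude
TAU_MS = 20.0     # decay time constant of the STDP window, in ms


def stdp(dt_ms: float) -> float:
    """Pair-based exponential STDP window, with dt_ms = t_post - t_pre."""
    if dt_ms >= 0:
        # Pre-synaptic spike precedes post-synaptic spike: potentiation.
        return A_PLUS * math.exp(-dt_ms / TAU_MS)
    # Post-synaptic spike precedes pre-synaptic spike: depression.
    return -A_MINUS * math.exp(dt_ms / TAU_MS)


def r_stdp_update(weight: float, dt_ms: float, reward: float,
                  w_min: float = 0.0, w_max: float = 1.0) -> float:
    """Scale the STDP change by the reward, then clip to the weight range.

    reward > 0 keeps the STDP sign; reward < 0 flips it (anti-STDP).
    """
    dw = reward * stdp(dt_ms)
    return min(w_max, max(w_min, weight + dw))


if __name__ == "__main__":
    w0 = 0.5
    print(r_stdp_update(w0, dt_ms=5.0, reward=+1.0))  # weight increases
    print(r_stdp_update(w0, dt_ms=5.0, reward=-1.0))  # weight decreases
```

In this sketch the sign of the reward plays the role that selecting the STDP or anti-STDP component of the 2T cell plays in hardware; the clipping stands in for a bounded conductance range.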


Journal: Advanced Materials	Publication Date: Feb 25, 2022
Citations: 66

