Abstract

Lifelong learning at the edge requires on-the-fly learning from scarce data in one or few shots. Here, we present an array-level demonstration of few-shot learning using the first fabricated monolithic 3D ternary content-addressable memory (M3D-TCAM) built with back-end-of-line (BEOL) ferroelectric FETs (FeFETs). The fabricated two-tier structure consists of two 10×10 sub-arrays in each tier and allows massively parallel search with search vectors up to 20 bits long. We experimentally demonstrate: (a) a record-low write voltage of ±1.6 V with 20 ns write latency for BEOL FeFETs in M3D-TCAM arrays, (b) in situ computation of the Hamming distance (nearest-neighbor match) between a 20-bit search vector and ten different stored vectors, (c) disturb-free write operation, and (d) high write endurance exceeding 10^10 cycles. We also experimentally demonstrate 3-way 3-shot learning with 20-bit feature vectors on the Omniglot dataset and achieve an inference accuracy of 70%, comparable to the GPU accuracy of 72%. System-level benchmarking performed on a 64×512 M3D-TCAM with 8 vertically stacked sub-arrays exhibits 3.5×, 3.7×, 3.5×, and 12× improvements in area, search energy, write energy, and write latency, respectively, over a 2D TCAM.
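
The nearest-neighbor match in (b) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the measured hardware behavior: the `DONT_CARE` encoding, the function names, and the toy 6-bit vectors are all hypothetical, chosen to show how a ternary Hamming-distance search over stored support vectors yields a few-shot class label.

```python
from typing import List, Optional, Sequence, Tuple

# Assumption: the ternary "don't care" state is encoded as None.
DONT_CARE = None


def hamming_distance(stored: Sequence[Optional[int]], query: Sequence[int]) -> int:
    """Count mismatching bits, skipping cells stored as don't-care."""
    if len(stored) != len(query):
        raise ValueError("stored and query vectors must have equal length")
    return sum(
        1 for s, q in zip(stored, query) if s is not DONT_CARE and s != q
    )


def nearest_match(
    rows: Sequence[Sequence[Optional[int]]], query: Sequence[int]
) -> Tuple[int, List[int]]:
    """Return the index of the closest stored row and all row distances.

    A TCAM evaluates every row in parallel; this loop models that serially.
    Ties resolve to the lowest row index.
    """
    distances = [hamming_distance(row, query) for row in rows]
    best = min(range(len(distances)), key=distances.__getitem__)
    return best, distances


def classify(
    support_rows: Sequence[Sequence[Optional[int]]],
    support_labels: Sequence[str],
    query: Sequence[int],
) -> str:
    """Few-shot inference: label the query with its nearest support vector's class."""
    best, _ = nearest_match(support_rows, query)
    return support_labels[best]


# Toy example (6-bit vectors, one stored vector per class).
support_rows = [
    [0, 0, 0, 0, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, DONT_CARE, 1, 0],
]
support_labels = ["A", "B", "C"]
query = [1, 0, 1, 1, 1, 1]

best_row, all_distances = nearest_match(support_rows, query)
predicted = classify(support_rows, support_labels, query)
```

In this toy case the third row differs from the query in a single cared-about bit, so it is selected as the nearest neighbor. A 3-way 3-shot setting would store nine such rows (three per class) and use 20-bit feature vectors.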
