Ubiquitous vision sensors continuously collect data from the environment, imposing a significant processing burden. In particular, high-frame-rate convolution over large-scale pixel arrays remains a substantial challenge. This paper proposes a mixed-signal sensing-with-computing macro featuring analog compression and maximum parallelism for high-frame-rate object detection. A customized neural network (NN) architecture is designed specifically for the detection task. Key circuit modules, including signal readout, dark current offset, normalization, first-layer convolution, and pooling, are implemented in the analog domain, yielding high-quality information extraction and substantial data compression. The proposed processing flow achieves maximum parallelism in the first-layer convolution, enabling high-frame-rate convolution over large-scale pixel arrays. The subsequent NN layers are deployed efficiently on SRAM-based digital in-memory computing. The macro was designed in a 55-nm foundry process. Results show that it achieves 1000 fps at 360,000 pixels with a throughput of 63.8 TOPS under 4/8-bit hybrid precision, outperforming the state of the art in frame rate by 2–10 times.
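To put the headline figures in perspective, the following back-of-the-envelope sketch derives the pixel rate and the per-frame operation budget they imply. It assumes the quoted 63.8 TOPS is sustained at the quoted 1000 fps, which the abstract does not state, so the results are upper bounds rather than reported measurements. The function names are hypothetical and chosen only for illustration.

```python
# Rough arithmetic on the figures quoted in the abstract.
# Assumption (not stated in the source): the 63.8 TOPS throughput is
# sustained while running at 1000 fps on the full 360,000-pixel array.

def pixel_rate(pixels_per_frame: int, fps: float) -> float:
    """Pixels that must be processed per second."""
    return pixels_per_frame * fps


def ops_budget_per_frame(throughput_tops: float, fps: float) -> float:
    """Upper bound on operations available to each frame."""
    return throughput_tops * 1e12 / fps


PIXELS_PER_FRAME = 360_000   # from the abstract
FRAME_RATE_FPS = 1000        # from the abstract
THROUGHPUT_TOPS = 63.8       # from the abstract

rate = pixel_rate(PIXELS_PER_FRAME, FRAME_RATE_FPS)
per_frame = ops_budget_per_frame(THROUGHPUT_TOPS, FRAME_RATE_FPS)
per_pixel = per_frame / PIXELS_PER_FRAME

print(f"pixel rate:        {rate:.3g} pixels/s")
print(f"ops per frame:     {per_frame:.3g} (upper bound)")
print(f"ops per pixel:     {per_pixel:.3g} (upper bound)")
```

Under this assumption, the array delivers 3.6 × 10^8 pixels per second and each frame has a budget of at most about 6.4 × 10^10 operations, which illustrates why compressing the data in the analog domain before the digital layers matters at this frame rate.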