Abstract

A low-power online deep neural network (DNN) training processor is proposed for real-time object tracking in mobile devices. To support real-time tracking, a homogeneous core architecture is proposed that achieves 1.33× higher throughput than a previous DNN training processor. To reduce external memory access (EMA), a binary feedback alignment (BFA) algorithm and an integral run-length compression (iRLC) decoder are proposed. The BFA reduces EMA by 11.4% compared to conventional back-propagation, and the iRLC decoder achieves a further 29.7% EMA reduction without throughput degradation. Finally, a dropout controller achieves a 43.9% power reduction through clock gating. Implemented in 65 nm CMOS technology, the 4.4 mm² DNN training processor consumes 141.1 mW while performing real-time object tracking for mobile devices at 30.4 frames per second (fps).
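To illustrate the idea behind feedback alignment with binary feedback weights, the following is a minimal sketch and not the processor's actual BFA implementation, which this abstract does not specify. It assumes the generic feedback-alignment scheme, in which the backward pass propagates the output error through a fixed random matrix instead of the transposed forward weights, and it further assumes that "binary" means the matrix entries are restricted to {-1, +1}. The function names, matrix shapes, and seed are hypothetical.

```python
import random


def make_binary_feedback(n_hidden, n_out, seed=0):
    """Fixed random feedback matrix with entries in {-1, +1} (assumed form of B).

    Each entry needs only one bit of storage, unlike a full-precision weight.
    """
    rng = random.Random(seed)
    return [[1 if rng.random() < 0.5 else -1 for _ in range(n_out)]
            for _ in range(n_hidden)]


def backprop_error(weights, delta_out):
    """Conventional back-propagation: hidden error = W^T * delta_out.

    `weights` has shape (n_out, n_hidden), so the transposed weights must be
    read during the backward pass.
    """
    n_out, n_hidden = len(weights), len(weights[0])
    return [sum(weights[o][h] * delta_out[o] for o in range(n_out))
            for h in range(n_hidden)]


def bfa_error(feedback, delta_out):
    """Feedback-alignment variant: hidden error = B * delta_out.

    With entries of B in {-1, +1}, each product is just an add or a subtract,
    and the forward weights are not read at all in this step.
    """
    return [sum(b * d for b, d in zip(row, delta_out)) for row in feedback]


# Usage: propagate a 3-element output error to a 4-unit hidden layer.
B = make_binary_feedback(n_hidden=4, n_out=3)
hidden_error = bfa_error(B, [0.5, -1.0, 2.0])
```

In this sketch the backward pass never fetches the forward weight matrix, which is one plausible way such a scheme could lower external memory traffic. The reported 11.4% figure comes from the authors' design and is not reproduced by this example.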
