Abstract
A deep neural network (DNN) online learning processor with high throughput and low power consumption is proposed for real-time object tracking on mobile devices. Four key features enable low-power DNN online learning. First, the processor is designed with a unified core architecture and achieves $1.33\times$ higher throughput than the previous state-of-the-art DNN learning processor. Second, two new algorithms, binary feedback alignment (BFA) and dynamic fixed-point-based run-length compression (RLC), reduce power consumption by reducing external memory accesses (EMA): BFA reduces the EMA by 11.4% and dynamic fixed-point-based RLC by 32.5%. Third, new data feeding units, including an integral RLC (iRLC) decoder and a transpose RLC (tRLC) decoder, are co-designed with the proposed algorithms to maximize throughput. Finally, a dropout controller applies the proposed dynamic clock-gating scheme to eliminate redundant power consumption in the unified core and the data feeding architecture, which allows the processor to perform DNN online learning with 38.1% lower power consumption. Implemented in 65 nm CMOS technology, the 3.52 mm$^2$ DNN online learning processor consumes 126 mW and achieves a throughput of 30.4 frames per second in the object tracking application.
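To illustrate why run-length compression on fixed-point data can cut external memory traffic, the following is a minimal software sketch, not the paper's hardware scheme. It assumes a simple zero-run encoding in which each nonzero value is stored together with the number of zeros preceding it. The function names (`quantize`, `rlc_encode`, `rlc_decode`), the 8-bit word width, and the `(zero_run, value)` pair format are all hypothetical choices made for this example; the abstract does not specify the actual RLC format or how the dynamic fixed-point position is chosen.

```python
def quantize(xs, frac_bits, word_bits=8):
    """Quantize floats to signed fixed-point integers.

    frac_bits is the number of fractional bits; in a dynamic fixed-point
    scheme it would be chosen per layer or per tensor (assumption).
    """
    scale = 1 << frac_bits
    lo = -(1 << (word_bits - 1))
    hi = (1 << (word_bits - 1)) - 1
    # Scale, round to the nearest integer, and saturate to the word range.
    return [max(lo, min(hi, round(x * scale))) for x in xs]


def rlc_encode(values):
    """Encode integers as (zero_run, nonzero_value) pairs.

    Trailing zeros are stored as a final (zero_run, None) pair.
    """
    pairs = []
    run = 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    if run:
        pairs.append((run, None))
    return pairs


def rlc_decode(pairs):
    """Invert rlc_encode: expand each zero run, then append the value."""
    out = []
    for run, v in pairs:
        out.extend([0] * run)
        if v is not None:
            out.append(v)
    return out


if __name__ == "__main__":
    activations = [0.0, 0.0, 0.5, 0.0, -0.25, 0.0, 0.0]
    q = quantize(activations, frac_bits=4)
    encoded = rlc_encode(q)
    print(q)        # fixed-point integers
    print(encoded)  # (zero_run, value) pairs
    assert rlc_decode(encoded) == q
```

In this toy format, sparse data (for example, activations after ReLU or dropout) shrinks because each run of zeros collapses into a single count, so fewer words need to cross the external memory interface. The actual savings depend on the data sparsity and on the bit widths used for the run and value fields.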
More From: IEEE Transactions on Circuits and Systems I: Regular Papers