Abstract

To maximize the cost-effectiveness of neural network (NN) accelerators, architects are actively developing single-chip accelerators that can execute many NNs simultaneously. However, previous approaches exploit only spatial or temporal resource sharing (SS or TS) and therefore fall short of the full performance potential. They also neglect memory management, which can significantly affect performance. These limitations call for a new multi-NN accelerator that exploits both sharing opportunities with careful memory management. Designing such an ideal spatio-temporal sharing (STS) accelerator is extremely challenging, however, because it requires (1) an algorithm that determines the degree of SS/TS within a large exploration space, (2) a new STS-enabled accelerator architecture spanning diverse design points, and (3) carefully designed memory management that minimizes resource contention during the numerous data transfers triggered by reconfiguration. To this end, we propose STfusion, a fast and flexible multi-NN execution architecture. First, STfusion partitions an accelerator into multiple smaller TS-enabled accelerators. Second, STfusion dynamically fuses these small accelerators to adjust accelerator sizes. Third, STfusion manages the on-chip buffer at page granularity for stall-free data transfers. Lastly, STfusion provides an algorithm that determines the degree of SS/TS to achieve high throughput while satisfying QoS goals. Our evaluation shows that STfusion significantly outperforms state-of-the-art multi-NN accelerators.
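The abstract's core idea, partitioning an accelerator into small time-shared units and dynamically fusing them for spatial sharing, can be illustrated with a minimal sketch. This is a hypothetical model written for illustration only; the class and method names (`STSAccelerator`, `fuse`, `schedule`) are assumptions and do not come from the paper, and the paper's actual resource-allocation algorithm and page-granularity buffer management are not reproduced here.

```python
# Hypothetical sketch of spatio-temporal sharing (STS): the accelerator is
# partitioned into small sub-accelerators; spatial sharing (SS) fuses a
# group of them into one larger unit, while temporal sharing (TS)
# time-multiplexes several NNs on the same fused group.
from dataclasses import dataclass, field

@dataclass
class FusedGroup:
    sub_accels: list                                # fused sub-accelerator IDs (SS degree = len)
    queue: list = field(default_factory=list)       # NNs time-sharing this group (TS degree = len)

class STSAccelerator:
    def __init__(self, num_sub_accels):
        self.free = list(range(num_sub_accels))     # unallocated sub-accelerators
        self.groups = []

    def fuse(self, ss_degree):
        """Fuse `ss_degree` free sub-accelerators into one larger unit (SS)."""
        if ss_degree > len(self.free):
            raise ValueError("not enough free sub-accelerators")
        taken, self.free = self.free[:ss_degree], self.free[ss_degree:]
        group = FusedGroup(taken)
        self.groups.append(group)
        return group

    def schedule(self, group, nn_name):
        """Add an NN to a fused group's time-sharing queue (TS)."""
        group.queue.append(nn_name)
        return group

# Example: an 8-way accelerator runs one large NN on a 4-wide fused group
# while two smaller NNs time-share a 2-wide group.
accel = STSAccelerator(8)
big = accel.fuse(4)
accel.schedule(big, "nn_large")
small = accel.fuse(2)
accel.schedule(small, "nn_a")
accel.schedule(small, "nn_b")
```

In this toy model, choosing `ss_degree` per group and the queue lengths per group corresponds to the SS/TS degree decision the abstract says must be made algorithmically to balance throughput against QoS goals.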
