Abstract

Deep neural networks (DNNs) have attracted renewed interest for deployment in edge applications. However, such deployments are difficult to achieve because executing a DNN often requires more resources than an individual edge device provides. On the other hand, relying on model-level distribution methods to run a DNN across connected edge devices incurs costly communication overhead. To utilize available in-the-edge resources with less communication overhead, we propose edge-tailored models composed of nearly-independent narrow DNNs, whose inference is accelerated by small, cost-efficient RISC-based engines. We implement these engines on PYNQ boards as a platform that mimics the limited resources of edge devices. We construct the narrow DNNs based on the available resources of the PYNQ boards and allocate each narrow DNN to one engine, implemented in an FPGA. We compare the communication overhead of our implementation against state-of-the-art model-level distribution methods.
