Abstract

Deep neural networks (DNNs) have been increasingly used in recent years to achieve higher inference accuracy; however, deploying such deep networks in edge-computing environments is challenging. Current methods for accelerating CNN inference seek a trade-off between accuracy and latency under an assumed uniform data distribution, ignoring the distributions that arise in real-world deployments. To address this, we propose the Coarse-to-Fine (C2F) framework, which comprises a C2F model and a corresponding C2F inference architecture that together exploit these distributional differences in the edge environment. The C2F model is derived from various adaptations of Convolutional Neural Networks (CNNs): by deconstructing the original CNN into multiple smaller models, it trades a modest, acceptable increase in memory consumption for faster inference without sacrificing accuracy. The C2F architecture then deploys C2F models more effectively in complex edge environments, reducing both inference cost and memory consumption. Experiments on the CIFAR dataset with different backbone networks show that our C2F framework can simultaneously reduce latency and improve accuracy in complex edge environments.
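The abstract describes splitting one large CNN into a coarse model plus several fine models, so that only a small part of the network runs per input. As a minimal sketch of how such a cascade might be wired up (the two-stage routing rule, function names, and class grouping below are our assumptions for illustration, not the paper's actual implementation):

```python
import numpy as np

def c2f_infer(x, coarse_model, fine_models):
    """Hypothetical coarse-to-fine cascade inference.

    coarse_model(x) -> probabilities over superclasses (groups of classes).
    fine_models[g](x) -> probabilities over the concrete classes in group g.
    """
    # Stage 1: a small coarse model predicts which superclass x belongs to.
    group_probs = coarse_model(x)
    group = int(np.argmax(group_probs))

    # Stage 2: only the fine model for that group runs, refining the
    # prediction to a concrete class. The other fine models stay idle,
    # which is where the latency saving over one monolithic CNN comes from.
    fine_probs = fine_models[group](x)
    label = int(np.argmax(fine_probs))
    return group, label
```

Under a non-uniform real-world distribution, frequently seen groups could keep their fine model resident in memory while rare ones are loaded on demand, which is one plausible reading of the memory/latency trade-off the abstract mentions.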
