Abstract
Deep neural networks (DNNs) have been increasingly used in recent years to achieve higher inference accuracy; however, deploying deeper networks in edge-computing environments is challenging. Current methods for accelerating convolutional neural network (CNN) inference seek a trade-off between accuracy and latency under an assumed uniform input distribution, ignoring the skew of real-world data distributions. To address this, we propose the Coarse-to-Fine (C2F) framework, which comprises a C2F model and a corresponding C2F inference architecture that together better exploit distributional differences in the edge environment. The C2F model is derived from adaptations of existing CNNs: by deconstructing an original CNN into multiple smaller models, it trades a modest increase in memory consumption for faster inference without sacrificing accuracy. The C2F architecture deploys C2F models across complex edge environments more effectively, reducing both inference cost and memory consumption. Experiments on the CIFAR datasets with different backbone networks show that our C2F framework simultaneously reduces latency and improves accuracy in complex edge environments.
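The decomposition described above is commonly realized as a confidence-gated cascade: a cheap coarse model answers easy inputs, and only uncertain inputs are escalated to a larger fine model. The following is a minimal sketch of that control flow; the model stand-ins, the `predict`-style call signature, and the threshold value are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch of coarse-to-fine (C2F) cascade inference. We assume each
# decomposed sub-model is a callable returning a (label, confidence) pair;
# the real C2F models, their interfaces, and the threshold are assumptions.

def c2f_predict(x, coarse_model, fine_model, confidence_threshold=0.9):
    """Run the cheap coarse model first; fall back to the larger fine
    model only when the coarse prediction is not confident enough."""
    label, confidence = coarse_model(x)
    if confidence >= confidence_threshold:
        return label            # fast path: coarse model suffices
    return fine_model(x)[0]     # slow path: defer to the fine model

# Toy stand-ins for the decomposed sub-models (hypothetical):
def coarse(x):
    return ("cat", 0.95) if x == "easy" else ("cat", 0.55)

def fine(x):
    return ("dog", 0.99)

print(c2f_predict("easy", coarse, fine))  # coarse model is confident -> cat
print(c2f_predict("hard", coarse, fine))  # escalates to fine model -> dog
```

Because most real-world inputs take the fast path, average latency drops even though total memory (coarse plus fine models) is somewhat higher than a single network, matching the trade-off the abstract describes.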