Abstract

The booming of Convolutional Neural Networks (CNNs) has empowered lots of computer-vision applications. Due to its stringent requirement for computing resources, substantial research has been conducted on how to optimize its deployment and execution on resource-constrained devices. However, previous works have several weaknesses, including limited support for various CNN structures, fixed scheduling strategies, overlapped computations, high synchronization overheads, etc. In this article, we present DeepSlicing, a collaborative and adaptive inference system that adapts to various CNNs and supports customized flexible fine-grained scheduling. As a built-in functionality, DeepSlicing has supported typical CNNs including GoogLeNet, ResNet, etc. By partitioning both model and data, we also design an efficient scheduler, Proportional Synchronized Scheduler (PSS), which achieves the trade-off between computation and synchronization. Based on PyTorch, we have implemented DeepSlicing on the testbed with real-world edge settings that consists of 8 heterogeneous Raspberry Pi's. The results indicate that DeepSlicing with PSS outperforms the existing systems dramatically, e.g., the inference latency and memory footprint are reduced up to 5.79× and 14.72×, respectively.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.