Abstract

In recent years, deploying deep learning models on edge devices has become pervasive, driven by the increasing demand for intelligent edge computing solutions across various industries. From industrial automation to intelligent surveillance and healthcare, edge devices are being leveraged for real-time analytics and decision-making. Existing methods face two challenges when deploying machine learning models on edge devices. The first challenge is handling the execution order of operators with a simple strategy, which can lead to a potential waste of memory resources when dealing with directed acyclic graph structure models. The second challenge is that they usually process operators of a model one by one to optimize the inference latency, which may lead to the optimization problem getting trapped in local optima. We present MemoriaNova, comprising BTSearch and GenEFlow, to solve these two problems. BTSearch is a graph state backtracking algorithm with efficient pruning and hashing strategies designed to minimize memory overhead during inference and enlarge latency optimization search space. GenEFlow, based on genetic algorithms, integrates latency modeling and memory constraints to optimize distributed inference latency. This innovative approach considers a comprehensive search space for model partitioning, ensuring robust and adaptable solutions. We implement BTSearch and GenEFlow and test them on eleven deep-learning models with different structures and scales. The results show that BTSearch can reach 12% memory optimization compared with the widely used random execution strategy. At the same time, GenEFlow reduces inference latency by 33.9% in distributed systems with four-edge devices.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.