Abstract

The development of cloud and edge computing has given massive numbers of heterogeneous, resource-constrained devices easy access to artificial intelligence (AI) services. In particular, computation-intensive AI services can be orchestrated and deployed in the cloud or at the edge according to varying performance and cost requirements. Nonetheless, the improved accessibility of deep learning (DL) model variants and the evolution of computational intelligence paradigms pose great challenges for orchestrating large-scale DL inference services in the cloud-edge continuum. Because it focuses on either cloud-based or edge-based deployment, existing work on multi-variant service orchestration often explores only a limited solution space of deployment plans. To address this limitation, we first propose a novel Multi-Paradigm Deployment Model (MPDM) for service orchestration, which not only considers model variants but also allows multiple paradigms to co-exist in large-scale inference service deployment. Service deployment under MPDM is then formulated as a multi-objective optimization problem that seeks a better trade-off among system accuracy, service scale, and deployment cost. To solve this problem, we further propose a weighted-metric-based constructive heuristic algorithm (WCH), which efficiently obtains an approximately optimal Pareto frontier. Extensive experiments validate the effectiveness and efficiency of WCH and reveal the impact of both multi-paradigm deployment and the edge-cloud collaborative intelligence (ECCI) paradigm on large-scale DL serving systems.
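
To make the notion of a Pareto frontier over the three objectives concrete, the following minimal Python sketch filters a set of candidate deployment plans down to the non-dominated ones. It is an illustration only and is not the WCH algorithm: the `Plan` type, the plan names, and all numeric values are hypothetical, and the sketch simply assumes that accuracy and service scale are to be maximized while deployment cost is to be minimized.

```python
from typing import List, NamedTuple


class Plan(NamedTuple):
    """Hypothetical deployment plan; fields and values are illustrative only."""
    name: str
    accuracy: float  # system accuracy, higher is better
    scale: int       # service scale (e.g. devices served), higher is better
    cost: float      # deployment cost, lower is better


def dominates(a: Plan, b: Plan) -> bool:
    """Return True if plan a is no worse than b on every objective
    and strictly better on at least one."""
    no_worse = (a.accuracy >= b.accuracy
                and a.scale >= b.scale
                and a.cost <= b.cost)
    strictly_better = (a.accuracy > b.accuracy
                       or a.scale > b.scale
                       or a.cost < b.cost)
    return no_worse and strictly_better


def pareto_front(plans: List[Plan]) -> List[Plan]:
    """Keep only the plans that no other plan dominates (brute-force O(n^2))."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]


# Made-up candidate plans, not results from the paper.
plans = [
    Plan("cloud-large", 0.95, 100, 50.0),
    Plan("edge-small", 0.80, 300, 20.0),
    Plan("hybrid", 0.90, 250, 30.0),
    Plan("dominated", 0.79, 200, 25.0),
]

front = pareto_front(plans)
print([p.name for p in front])
```

In this toy data, the plan named "dominated" is removed because "edge-small" is at least as good on all three objectives. The remaining plans each represent a different trade-off, which is the kind of set a multi-objective method aims to approximate.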
