Abstract

Deploying end-to-end ML applications on edge resources has become a viable way to meet performance requirements and comply with data regulations. With a microservice architecture, these applications can scale dynamically, improving service availability under fluctuating workloads. However, orchestrating multiple end-to-end ML applications that share computing resources in heterogeneous edge environments poses numerous challenges. Prevalent orchestration tools and frameworks for edge ML serving provision resources inefficiently because of constrained capacity, diverse resource demands, and varying utilization patterns. In this work, we present a provisioning method that optimizes resource utilization for end-to-end ML applications on a heterogeneous edge. By profiling every microservice within an application, we estimate its scale and allocate it to a suitable hardware platform with sufficient resources, taking its runtime utilization pattern into account. We also provide several practical analyses of runtime monitoring metrics to detect and mitigate resource contention, guaranteeing performance. Experiments with three real-world ML applications demonstrate the practicality of our method on a heterogeneous edge cluster of Raspberry Pis and Jetson Developer Kits.

