Video streaming perception is critical for AI applications on resource-constrained devices (agents), which prefer to offload video streams to edge servers for real-time inference by deep neural networks (DNNs). Meanwhile, the multi-agent system (MAS) community is attempting to run DNNs on multiple cooperative agents to enable swarm intelligence tasks (e.g., drone swarm intelligence, self-driving fleet collaboration, and multi-agent robot cooperation). However, transferring video streaming perception capability from single-agent systems to MASs is extremely difficult because spontaneous competition among agents induces trade-offs among accuracy, consistency, and capacity, three critical but conflicting metrics. In this paper, we present the design and implementation of MASSIVE, an edge-assisted cooperative multi-agent video streaming perception system that simultaneously achieves all three goals. Our design builds on the performance characteristics of video streaming perception and on the insight that its offloading pattern is periodic. On this basis, we develop a Pareto improvement scheduler that eliminates spontaneous competition among agents, allowing multi-objective optimization to reach an ideal Pareto optimal state. Finally, we propose a virtual traffic shaper based on the mainstream 802.11 MAC protocol to ensure deterministic periodic video stream offloading over an uncertain wireless network. Our experiments demonstrate that MASSIVE achieves 122.7% accuracy and 1.8x capacity relative to the closest baseline on multiple real cooperative vision tasks, with even better consistency, and reaches an ideal Pareto optimal state in a wireless environment.