Abstract

In recent years, advances in deep learning (DL) technology have greatly improved artificial intelligence (AI)-related research and services. Among these, real-time object recognition using network cameras has become an important technology for various applications. A large number of network cameras are being deployed for real-time object detection using DL models at GPU-based edge servers. A significant challenge in widely deploying this type of system is achieving low-cost network deployment and low-latency data transmission. A promising option for efficiently accommodating numerous network cameras is the time- and wavelength-division multiplexed passive optical network (TWDM-PON), which has become prevalent in optical access network systems. The key challenge in a GPU-based inference system via TWDM-PON is to optimally allocate upstream wavelengths and bandwidths to enable real-time inference. To address this problem, this article proposes the concept of an inference system in which many cameras upload image data to a GPU-based edge server via TWDM-PON. A real-time resource allocation scheme for TWDM-PON is also proposed to guarantee low latency and time-synchronized data arrival at the edge. We formulate the wavelength and bandwidth allocation problem as a Boolean satisfiability (SAT) problem for fast computation. The performance of the proposed method is verified by computer simulation. The proposed scheme increases the batch size of data arriving at the edge server while ensuring low-latency transmission, which in turn greatly improves the computational efficiency of the GPU-based inference server.
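
The article itself presents the full SAT formulation; purely as an illustration of the general idea, the toy sketch below encodes a simplified wavelength-assignment problem as CNF clauses and checks satisfiability by brute force. All parameters (camera count, wavelength count, per-wavelength capacity) and the encoding choices are assumptions for this sketch, not values or constraints taken from the article.

```python
# Toy sketch (not the authors' formulation): encode a simplified
# wavelength-assignment problem as CNF and check it by brute force.
from itertools import combinations, product

NUM_CAMERAS = 4        # assumed
NUM_WAVELENGTHS = 2    # assumed
CAPACITY = 2           # assumed: max cameras sharing one upstream wavelength

# Boolean variable x[c][w] is True iff camera c transmits on wavelength w.
# Variables are numbered 1..NUM_CAMERAS*NUM_WAVELENGTHS for CNF literals.
def var(c, w):
    return c * NUM_WAVELENGTHS + w + 1

clauses = []

# Each camera is assigned at least one wavelength.
for c in range(NUM_CAMERAS):
    clauses.append([var(c, w) for w in range(NUM_WAVELENGTHS)])

# Each camera is assigned at most one wavelength.
for c in range(NUM_CAMERAS):
    for w1, w2 in combinations(range(NUM_WAVELENGTHS), 2):
        clauses.append([-var(c, w1), -var(c, w2)])

# At most CAPACITY cameras per wavelength: forbid every (CAPACITY+1)-subset.
for w in range(NUM_WAVELENGTHS):
    for subset in combinations(range(NUM_CAMERAS), CAPACITY + 1):
        clauses.append([-var(c, w) for c in subset])

def solve(clauses, num_vars):
    """Brute-force SAT check: try every truth assignment (toy sizes only)."""
    for bits in product([False, True], repeat=num_vars):
        assignment = {i + 1: bits[i] for i in range(num_vars)}
        if all(any(assignment[l] if l > 0 else not assignment[-l] for l in cl)
               for cl in clauses):
            return assignment
    return None

model = solve(clauses, NUM_CAMERAS * NUM_WAVELENGTHS)
if model:
    for c in range(NUM_CAMERAS):
        for w in range(NUM_WAVELENGTHS):
            if model[var(c, w)]:
                print(f"camera {c} -> wavelength {w}")
else:
    print("no feasible allocation")
```

In practice, a formulation like this would be handed to an off-the-shelf SAT solver rather than brute force, and would include the timing and bandwidth constraints needed for synchronized, low-latency arrival; those details are left to the full paper.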
