Abstract
While video sensing performed by resource-constrained pervasive devices is a key enabler of many machine intelligence applications, the high energy and bandwidth overheads of streaming video transmission continue to present formidable deployment challenges. Motivated by recent advances in deep learning models, this paper proposes the use of a generative network-based technique for resource-efficient streaming video compression and transmission. However, we empirically show that while such generative network-based models offer superior compression gains compared to H.265, additional DNN optimization mechanisms are needed to substantially reduce their encoder complexity. Our proposed optimized system, dubbed Pr-Ge-Ne, adopts a carefully pruned encoder-decoder DNN on the pervasive device to efficiently encode a latent vector representation of intra-frame relative motion, and then uses a generator network at the decoder to reconstruct the frames by overlaying such motion information to 'animate' an initial reference frame. Evaluations on three representative streaming video datasets show that Pr-Ge-Ne achieves around a \(6\)-\(10\) fold reduction in video transmission rates (with negligible impact on the accuracy of machine perception tasks) compared to H.265, while simultaneously reducing latency and energy overheads on a pervasive device by \(\sim\)90% and 15-50%, respectively.
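The following is a minimal, hypothetical PyTorch sketch of the encode-then-animate pipeline the abstract describes: an on-device encoder compresses each frame into a compact motion latent, and a receiver-side generator animates the reference frame from that latent. The module names, layer sizes, and latent dimension are illustrative assumptions and do not reflect the actual Pr-Ge-Ne architecture or its pruning scheme.

```python
# Hypothetical sketch of the encode/animate pipeline (not the Pr-Ge-Ne design).
import torch
import torch.nn as nn

LATENT_DIM = 32  # assumed size of the per-frame motion latent


class MotionEncoder(nn.Module):
    """On-device encoder: maps the current frame (plus the reference frame)
    to a compact latent motion vector that is transmitted instead of pixels."""

    def __init__(self, latent_dim=LATENT_DIM):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(6, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(32, latent_dim)

    def forward(self, reference, current):
        x = torch.cat([reference, current], dim=1)  # stack frames along channels
        return self.fc(self.features(x).flatten(1))


class Generator(nn.Module):
    """Receiver-side generator: 'animates' the reference frame using the
    transmitted motion latent to reconstruct the current frame."""

    def __init__(self, latent_dim=LATENT_DIM):
        super().__init__()
        self.latent_proj = nn.Linear(latent_dim, 16)
        self.decode = nn.Sequential(
            nn.Conv2d(3 + 16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, kernel_size=3, padding=1), nn.Sigmoid(),
        )

    def forward(self, reference, latent):
        b, _, h, w = reference.shape
        # Broadcast the motion latent spatially and fuse it with the reference frame.
        motion = self.latent_proj(latent).view(b, 16, 1, 1).expand(b, 16, h, w)
        return self.decode(torch.cat([reference, motion], dim=1))


if __name__ == "__main__":
    encoder, generator = MotionEncoder(), Generator()
    reference = torch.rand(1, 3, 64, 64)   # initial reference frame (sent once)
    current = torch.rand(1, 3, 64, 64)     # subsequent frame to be compressed
    latent = encoder(reference, current)   # only this small vector is transmitted
    reconstructed = generator(reference, latent)
    print(latent.shape, reconstructed.shape)
```

In such a scheme, only the low-dimensional latent (plus the one-time reference frame) crosses the network, which is the source of the bandwidth savings; the encoder pruning described in the abstract would then be applied to keep the device-side model lightweight.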