Abstract

Vision sensors in Internet of Things (IoT)-connected smart cities are a major driver of the exponential growth of video data, which makes its analysis and storage challenging. These sensors generate data continuously, around the clock, which requires huge storage resources and dedicated networks for sharing with data centers and, most importantly, makes browsing, retrieval, and event searching difficult and time-consuming. Video summarization (VS) is a promising direction toward solving these problems: it analyzes the visual content acquired from a vision sensor and prioritizes it based on events, saliency, the appearance of persons, and so on. However, the current VS literature still lacks focus on resource-constrained devices that can summarize data at the edge and upload it to data repositories efficiently for instant analysis. Therefore, in this article, we survey functional VS methods to understand their pros and cons for resource-constrained devices, with the aim of providing a compact tutorial for the research community in the field. Further, we present a novel saliency-aware VS framework for 5G-enabled IoT devices that keeps only the important data, thereby saving storage resources and providing representative data for immediate exploration. Addressing data privacy as a second objective, we intelligently encrypt the salient frames on the resource-constrained devices before transmission over the 5G network. Experimental results show that our framework offers the additional benefits of faster transmission (only 1.8 to 13.77 percent of the frames of a lengthy video are transmitted), reduced bandwidth consumption, and real-time processing compared to state-of-the-art methods in the field.
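To make the described pipeline concrete, the sketch below illustrates one way its steps could fit together on an edge device: score each incoming frame with a saliency model, keep only frames whose saliency exceeds a threshold, and encrypt the kept frames before they leave the device. This is an illustrative sketch, not the paper's implementation; the OpenCV spectral-residual saliency model, Fernet symmetric encryption, and the 0.12 threshold are all assumptions chosen for the example.

```python
# Illustrative sketch of a saliency-aware summarize-then-encrypt pipeline.
# NOT the paper's method: the saliency model, cipher, and threshold are
# assumptions. Requires opencv-contrib-python and cryptography.
import cv2
from cryptography.fernet import Fernet

key = Fernet.generate_key()   # in practice, a key pre-shared with the data center
cipher = Fernet(key)
saliency = cv2.saliency.StaticSaliencySpectralResidual_create()

def summarize_and_encrypt(video_path, threshold=0.12):
    """Yield encrypted JPEG bytes for frames whose mean saliency exceeds threshold."""
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        success, saliency_map = saliency.computeSaliency(frame)
        if success and saliency_map.mean() > threshold:
            _, jpeg = cv2.imencode(".jpg", frame)  # compress before encryption
            yield cipher.encrypt(jpeg.tobytes())   # ciphertext ready for upload
    cap.release()
```

Under this scheme, only the small fraction of frames judged salient is ever encoded, encrypted, and transmitted, which is the source of the storage and bandwidth savings the abstract reports.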
