Abstract
The ubiquitous camera networks in the city brain system are growing at a rapid pace, creating massive amounts of images and videos at a range of spatial-temporal scales and thereby forming the “biggest” big data. However, the sensing system often lags behind the construction of the fast-growing city brain system, in the sense that such exponentially growing data far exceed today’s sensing capabilities. Critical issues therefore arise regarding how to better leverage the existing city brain system and significantly improve city-scale performance in intelligent applications. To tackle these unprecedented challenges, we articulate a vision of a novel visual computing framework, termed digital retina, which aligns high-efficiency sensing models with the emerging Visual Coding for Machine (VCM) paradigm. In particular, digital retina may consist of video coding, feature coding, and model coding, as well as their joint optimization. The digital retina is biologically inspired, rooted in the widely accepted view that the retina encodes visual information for human perception, while downstream brain areas extract features to disentangle visual objects. Within the digital retina framework, three streams, i.e., the video stream, the feature stream, and the model stream, work collaboratively over the end-edge-cloud platform. Specifically, the compressed video stream serves human vision, the compact feature stream targets machine vision, and the model stream incrementally updates deep learning models to improve the performance of human and machine vision tasks. We have developed a prototype to demonstrate the technical advantages of digital retina, and extensive experiments have been conducted to validate that it effectively supports video big data analysis and retrieval in the intelligent city system.
In particular, a compression ratio of up to 7000× can be realized for visual data while maintaining performance competitive with that of the pristine signal in a series of visual analysis tasks.
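The stream roles and the ratio arithmetic above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `StreamKind`, `Packet`, `consumer_for`, and `compression_ratio` are hypothetical, the byte counts in the usage lines are made-up round numbers chosen only so the arithmetic is easy to follow, and the abstract does not state what the raw baseline for the 7000× figure is.

```python
from dataclasses import dataclass
from enum import Enum


class StreamKind(Enum):
    """The three streams named in the abstract (identifiers are hypothetical)."""
    VIDEO = "video"      # compressed video, intended for human vision
    FEATURE = "feature"  # compact features, intended for machine vision
    MODEL = "model"      # incremental deep learning model updates


@dataclass(frozen=True)
class Packet:
    kind: StreamKind
    payload_bytes: int


def consumer_for(packet: Packet) -> str:
    """Map a packet to the role the abstract assigns to its stream."""
    routes = {
        StreamKind.VIDEO: "human_vision",
        StreamKind.FEATURE: "machine_vision",
        StreamKind.MODEL: "model_update",
    }
    return routes[packet.kind]


def compression_ratio(raw_bytes: int, coded_bytes: int) -> float:
    """Ratio of uncompressed size to coded size; larger means more compact."""
    if coded_bytes <= 0:
        raise ValueError("coded_bytes must be positive")
    return raw_bytes / coded_bytes


# Assumed, illustrative sizes: 7 MB of raw visual data coded into 1 kB.
feature_packet = Packet(StreamKind.FEATURE, payload_bytes=1_000)
print(consumer_for(feature_packet))
print(compression_ratio(7_000_000, feature_packet.payload_bytes))
```

Under these assumed sizes the ratio works out to 7000, which is the order of magnitude the abstract reports as an upper bound; real ratios depend on the content, the codec, and the analysis task.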
More From: IEEE Transactions on Circuits and Systems for Video Technology