The ubiquitous camera networks in the city brain system grow at a rapid pace, creating massive amounts of images and videos at a range of spatial-temporal scales and thereby forming the “biggest” big data. However, the sensing system often lags behind the construction of the fast-growing city brain system, in the sense that such exponentially growing data far exceed today’s sensing capabilities. Therefore, critical issues arise regarding how to better leverage the existing city brain system and significantly improve the city-scale performance in intelligent applications. To tackle the unprecedented challenges, we articulate a vision towards a novel visual computing framework, termed as <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">digital retina</i> , which aligns high-efficiency sensing models with the emerging Visual Coding for Machine (VCM) paradigm. In particular, digital retina may consist of video coding, feature coding, model coding, as well as their joint optimization. The digital retina is biologically-inspired, rooted on the widely accepted view that the retina encodes the visual information for human perception, and extracts features by the brain downstream areas to disentangle the visual objects. Within the digital retina framework, three streams, i.e., video stream, feature stream, and model stream, work collaboratively over the end-edge-cloud platform. In particular, the compressed video stream serves for human vision, the compact feature stream targets for machine vision, and the model stream incrementally updates deep learning models to improve the performance of human/machine vision tasks. We have developed a prototype to demonstrate the technical advantages of digital retina, and extensive experiments have been conducted to validate that it is able to effectively support the video big data analysis and retrieval in the intelligent city system. In particular, up to <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$7000\times $ </tex-math></inline-formula> compression ratio could be realized for visual data compression while maintaining competitive performance with pristine signal in a series of visual analysis tasks.