Abstract
Silicon retinas, also known as Dynamic Vision Sensors (DVS) or event-based visual sensors, offer great advantages in terms of low power consumption, low bandwidth, wide dynamic range and very high temporal resolution. Owing to these advantages over conventional vision sensors, DVS devices are gaining more and more attention in applications such as drone surveillance, robotics, and high-speed motion photography. The output of such sensors is a sequence of events rather than a series of frames as in classical cameras. Estimating the data rate of the stream of events associated with such sensors is needed for the appropriate design of transmission systems involving them. In this work, we propose to consider information about the scene content and sensor speed to support such estimation, and we identify suitable metrics to quantify the complexity of the scene for this purpose. According to the results of this study, the event rate shows an exponential relationship with the metric associated with the complexity of the scene and a linear relationship with the speed of the sensor. Based on these results, we propose a two-parameter model for the dependency of the event rate on scene complexity and sensor speed. The model achieves a prediction accuracy of approximately 88.4% for the outdoor environment and an overall prediction accuracy of approximately 84%.
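The abstract reports that the event rate grows exponentially with a scene-complexity metric and linearly with sensor speed, and that a two-parameter model captures this dependency. A minimal sketch of one functional form consistent with those two relationships is shown below; the function name, the parameter names `a` and `b`, and the combined form `R(C, v) = a * v * exp(b * C)` are illustrative assumptions, not the paper's exact model.

```python
import math

def predicted_event_rate(complexity, speed, a, b):
    """Hypothetical two-parameter event-rate model.

    Sketched from the relationships stated in the abstract:
    exponential in the scene-complexity metric C (via parameter b)
    and linear in the sensor speed v (via parameter a).
    R(C, v) = a * v * exp(b * C)
    """
    return a * speed * math.exp(b * complexity)

# Example with purely illustrative parameter values and units:
# a scene of complexity 2.0 observed at sensor speed 1.5.
rate = predicted_event_rate(complexity=2.0, speed=1.5, a=1e3, b=0.5)
```

In practice, the two parameters would be fitted by regressing measured event rates against the chosen complexity metric and the measured sensor speed for a set of calibration scenes.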
Highlights
Conventional video cameras capture video, via temporal sampling, in a series of separate frames or images, whose raw pixel values are processed
In order to study and compare how well different scene complexity metrics correlate with the event rate, we kept the motion speed of the vision sensor similar for every type of scene, so that the event rate depended only on the scene complexity
The data rate output by the event-based vision sensors depends on the type of scene and motion speed
Summary
Conventional video cameras capture video, via temporal sampling, in a series of separate frames or images, whose raw pixel values are processed. In the conventional video sensing approach, the frames are typically acquired at a fixed frame rate, regardless of the scene content and complexity. This approach generates a substantial load in terms of energy consumption, data management and the transmission system. The mammalian brain, by contrast, only receives new information from the eyes when something in a scene changes. This significantly reduces the amount of information delivered to the biological brain, but is sufficient to identify the surrounding changes and conditions.