Abstract

Bio-inspired spike cameras, which produce spike streams with high temporal resolution, offer a new perspective on common challenges in depth estimation (e.g., high-speed motion blur). In this paper, we propose a novel problem setting, spike-based stereo depth estimation, and present the first attempt at an end-to-end transformer-based network that learns stereo depth estimation for spike cameras, named Spike-based Stereo Depth Estimation Transformer (SSDEFormer). We first build a hybrid camera platform and provide a new stereo depth estimation dataset (i.e., PKU-Spike-Stereo) with spatiotemporally synchronized labels. Then, we propose a novel spike representation to effectively exploit spatiotemporal information from spike streams. Finally, a transformer-based network is designed to generate dense depth maps without a fixed-disparity cost volume. Experiments show that our approach is effective on both synthetic and real-world datasets. The results verify that spike cameras can perform robust depth estimation even in fast-motion scenarios where conventional cameras and event cameras fail.
