Abstract

With the increasing popularity of applications such as autonomous driving, environment perception has become more and more important, and its most common representation is semantic reconstruction. Consequently, a growing number of researchers are trying to fuse information from multiple sensors to achieve better semantic reconstruction. However, most current estimation methods (a) are too bulky to run in real time, (b) fail to effectively exploit the information from multiple different sensors, or (c) fail to generate sufficient environment-perception information, such as semantic and depth information, under limited computing power. This paper therefore proposes a multi-modal joint estimation network for semantic reconstruction that addresses these problems. Our method takes an RGB image and a sparse depth map as input. By incorporating multi-scale information into the neural network, it outputs semantic segmentation and depth recovery results simultaneously while remaining lightweight and real-time, and then fuses both results in a point cloud to obtain better environment perception. Extensive experiments show that our method performs better than other methods in the same application scenario.
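To make the described setup concrete, below is a minimal sketch of a joint network that consumes an RGB image plus a sparse depth map and produces semantic segmentation and dense depth in one forward pass, with multi-scale skip connections. This is not the paper's architecture: the class name `JointSegDepthNet`, the early concatenation of RGB and sparse depth, the channel widths, and the U-Net-style layout are illustrative assumptions; the abstract only specifies the inputs, the two outputs, and the use of multi-scale information.

```python
# Hedged sketch, not the authors' method: a lightweight multi-task
# encoder-decoder that jointly predicts semantics and dense depth.
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class JointSegDepthNet(nn.Module):          # hypothetical name
    def __init__(self, num_classes=19):
        super().__init__()
        # Shared encoder over concatenated RGB (3 ch) + sparse depth (1 ch).
        self.enc1 = conv_block(4, 32)
        self.enc2 = conv_block(32, 64)
        self.enc3 = conv_block(64, 128)
        self.pool = nn.MaxPool2d(2)
        # Shared decoder with multi-scale skip connections.
        self.dec2 = conv_block(128 + 64, 64)
        self.dec1 = conv_block(64 + 32, 32)
        # Task-specific heads: per-pixel class logits and dense depth.
        self.seg_head = nn.Conv2d(32, num_classes, 1)
        self.depth_head = nn.Conv2d(32, 1, 1)

    def forward(self, rgb, sparse_depth):
        x = torch.cat([rgb, sparse_depth], dim=1)
        f1 = self.enc1(x)              # full resolution
        f2 = self.enc2(self.pool(f1))  # 1/2 resolution
        f3 = self.enc3(self.pool(f2))  # 1/4 resolution
        # Fuse multi-scale features on the way back up.
        u2 = F.interpolate(f3, scale_factor=2, mode="bilinear", align_corners=False)
        d2 = self.dec2(torch.cat([u2, f2], dim=1))
        u1 = F.interpolate(d2, scale_factor=2, mode="bilinear", align_corners=False)
        d1 = self.dec1(torch.cat([u1, f1], dim=1))
        # Depth is constrained to be non-negative; both outputs share features,
        # so one pass yields the semantics and the recovered depth that the
        # abstract says are later fused into a labeled point cloud.
        return self.seg_head(d1), F.relu(self.depth_head(d1))


if __name__ == "__main__":
    net = JointSegDepthNet(num_classes=19)
    rgb = torch.randn(1, 3, 256, 512)      # RGB image
    sparse = torch.zeros(1, 1, 256, 512)   # sparse depth (mostly empty)
    seg_logits, dense_depth = net(rgb, sparse)
    print(seg_logits.shape, dense_depth.shape)
```

The single shared encoder is what keeps such a design lightweight relative to running two separate networks; the actual paper may use a different fusion and backbone.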
