Abstract
Active interaction with environments is one of the striking characteristics of robotic active vision, allowing robots to move in order to facilitate visual tasks. Recently, several embodied AI platforms have been proposed as synthetic environments for studying robotic active vision without the need to interact with the real world. However, because these platforms rely on synthetic data, models trained on them suffer performance degradation when deployed in reality. In this letter, we propose a real 3D embodied dataset for robotic active visual learning. The dataset consists of real point cloud data densely collected in 7 real-world indoor scenes. With our embodied dataset, researchers can simulate the movements and interactions of robots in indoor environments and obtain real visual data, avoiding the performance degradation caused by synthetic data in reality. Furthermore, we propose a 3D divergence policy that guides robots to move and collect data to improve visual performance in novel environments. The policy is built on a simple observation: a good 3D detector should produce consistent 3D detection results for the same object when viewed from different viewpoints. Our policy therefore encourages the robot to explore areas where the detector generates differing 3D bounding boxes for the same object, helping the robot improve its visual performance in novel scenes.
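To make the divergence idea concrete, the following minimal sketch (not the authors' implementation) scores a region by how much the detector's 3D boxes for the same object disagree across viewpoints, and greedily picks the most inconsistent region to explore next. The box format, the use of axis-aligned IoU as the consistency measure, and the greedy region choice are all assumptions for illustration.

```python
# Minimal sketch of a divergence-style exploration score, assuming
# axis-aligned 3D boxes given as (x1, y1, z1, x2, y2, z2).
import numpy as np

def aabb_iou(box_a, box_b):
    """IoU of two axis-aligned 3D boxes."""
    lo = np.maximum(box_a[:3], box_b[:3])
    hi = np.minimum(box_a[3:], box_b[3:])
    inter = np.prod(np.clip(hi - lo, 0.0, None))
    vol_a = np.prod(box_a[3:] - box_a[:3])
    vol_b = np.prod(box_b[3:] - box_b[:3])
    return inter / (vol_a + vol_b - inter + 1e-9)

def divergence_score(boxes_per_view):
    """Mean pairwise disagreement (1 - IoU) of one object's boxes
    predicted from different viewpoints; high = inconsistent detector."""
    n = len(boxes_per_view)
    if n < 2:
        return 0.0
    scores = [1.0 - aabb_iou(boxes_per_view[i], boxes_per_view[j])
              for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(scores))

def pick_next_region(candidate_regions):
    """Greedy policy (an assumption, not necessarily the paper's rule):
    move toward the region whose objects are localized least consistently.
    `candidate_regions` maps region id -> list of per-object box lists,
    with one box per viewpoint for each object."""
    return max(candidate_regions,
               key=lambda r: np.mean([divergence_score(b)
                                      for b in candidate_regions[r]]))

# Hypothetical example: in "kitchen" the two viewpoints disagree strongly
# on one object's box, so that region is selected for further exploration.
region_boxes = {
    "kitchen": [[np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0]),
                 np.array([0.3, 0.3, 0.0, 1.3, 1.3, 1.0])]],
    "hallway": [[np.array([2.0, 2.0, 0.0, 3.0, 3.0, 1.0]),
                 np.array([2.05, 2.0, 0.0, 3.05, 3.0, 1.0])]],
}
print(pick_next_region(region_boxes))  # -> "kitchen" (larger divergence)
```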