Abstract

3D dense captioning generates descriptions for objects in 3D scenes represented as RGB-D scans and point clouds. However, when generating a description, existing methods select points randomly from a point cloud regardless of their importance, which degrades the quality of the description by removing important points or including low-value points. To solve this problem, we propose a recurrent point clouds selection (RPCS) method that mitigates descriptive deficiencies in 3D dense captioning by iteratively checking the caption results obtained from different point selections. Our method has two steps. In step 1, we randomly select points from the cloud and use the objectness score to evaluate the generated description. The objectness score indicates how close the proposed object is to the ground truth: the higher its positive value, the closer the proposed object is to the ground truth. In step 2, if the objectness score is lower than the threshold, step 1 is repeated to generate another group of points and evaluate the results. The loop stops once the objectness score no longer decreases. The termination conditions are configurable according to accuracy and processing-time requirements. As a result, our method reduces deficient descriptions and outperforms previous state-of-the-art methods by a large margin (6.58% to 35.70% improvement on CIDEr, BLEU-4, METEOR, and ROUGE).
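
The two-step loop described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the function name `select_points`, its parameters, and the `score_fn` callback (which stands in for captioning a point subset and computing its objectness score) are all hypothetical. The stopping rule shown here, which stops when the threshold is reached or a fixed budget of draws is used up, is one assumed reading of the configurable termination conditions.

```python
import random
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float, float]


def select_points(
    points: Sequence[Point],
    score_fn: Callable[[Sequence[Point]], float],
    num_samples: int,
    threshold: float,
    max_iters: int = 10,
    seed: int = 0,
) -> Tuple[List[Point], float, int]:
    """Redraw a random point subset until its score reaches `threshold`.

    `score_fn` is a hypothetical stand-in for "generate a caption from the
    subset and return its objectness score". Returns the best subset seen,
    its score, and the number of draws that were made.
    """
    rng = random.Random(seed)
    pool = list(points)
    best_subset: List[Point] = []
    best_score = float("-inf")
    draws = 0
    for draws in range(1, max_iters + 1):
        # Step 1: draw a random subset and score the result.
        subset = rng.sample(pool, num_samples)
        score = score_fn(subset)
        if score > best_score:
            best_subset, best_score = subset, score
        # Step 2: accept if the score reaches the threshold, else redraw.
        if score >= threshold:
            break
    return best_subset, best_score, draws
```

Keeping the best subset seen so far means the sketch still returns a usable selection when the draw budget runs out before the threshold is met; raising `max_iters` trades processing time for a better chance of reaching the threshold.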

