Abstract

Abstract. In the wake of the success of Deep Learning Networks (DLN) for image recognition, object detection, shape classification and semantic segmentation, this approach has proven to be both a major breakthrough and an excellent tool for point cloud classification. However, understanding of how different types of DLN achieve their results is still lacking. In several studies the output of the segmentation/classification process is compared against benchmarks, but the network is treated as a “black box” and the intermediate steps are not analysed in depth. Specifically, the following questions are discussed here: (1) what exactly does a DLN learn from a point cloud? (2) On the basis of what information does a DLN make its decisions? To conduct a quantitative investigation of DLN applied to point clouds, this paper examines the visual interpretability of their decision-making process. Firstly, we introduce a reconstruction network able to reconstruct and visualise the learned features, addressing question (1). Then, we propose 3DCAM to indicate the discriminative point cloud regions these networks use to identify a given category, addressing question (2). By answering these two questions, the paper offers some initial solutions for better understanding the application of DLN to point clouds.

Highlights

  • Inspired by human brains, Deep Learning (DL) is a subset of Machine Learning techniques that teaches computers to do what comes naturally to humans: learn from experience

  • While various kinds of Deep Learning Networks (DLN) are continuously developed and improved for both 2D image analysis and 3D point cloud classification, understanding of how these results are achieved has received little attention. This question has sparked the interest of various researchers, and in response several approaches are emerging that seek to understand DLNs through visualization techniques

  • We would like to emphasize that while 3D Classification Activation Map (3DCAM) is not a novel technique that we propose here, the observation that it can be applied to various 3D point cloud tasks, rather than only to images or to PointNet, is, to the best of our knowledge, unique to our work

Summary

INTRODUCTION

Deep Learning (DL) is a subset of Machine Learning techniques that teaches computers to do what comes naturally to humans: learn from experience. There is still limited understanding of how DLN and their intermediate layers achieve their final outputs in the domain of 3D point clouds. This type of three-dimensional spatial data has a totally different structure compared to images. The understanding of point cloud classification based on DLNs is an open issue, and current knowledge about it is deeply unsatisfactory. To provide a quantitative explanation of how these networks work, in this paper we investigate the visual interpretability of the decision-making process of these architectures and address two questions: (1) by means of reconstruction-based feature visualizations, we investigate what DLNs learn in their intermediate layers; this method also allows us to observe the evolution of features during training; (2) by means of a 3D-CAM attribution visualization, we observe what information from the point cloud drives the decision-making process in DLNs.
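To make the second idea concrete, the sketch below shows how a Class Activation Mapping (CAM) score can be computed per point for a PointNet-like classifier, where per-point features feed a global pooling layer followed by a linear classifier. This is a minimal illustration under those assumptions, not the authors' 3DCAM implementation; the function name point_cloud_cam, the tensor shapes, and the usage lines referring to network.per_point_features and network.fc are hypothetical.

```python
import torch

def point_cloud_cam(per_point_feats: torch.Tensor,
                    fc_weights: torch.Tensor,
                    class_idx: int) -> torch.Tensor:
    """Per-point class activation scores for a point cloud classifier.

    per_point_feats: (N, K) activations of the N points just before the
                     global pooling layer (K feature channels).
    fc_weights:      (C, K) weights of the final linear classifier (C classes).
    class_idx:       index of the class whose evidence we want to localise.
    Returns a length-N tensor of per-point relevance scores.
    """
    w = fc_weights[class_idx]            # (K,) classifier weights for the class
    scores = per_point_feats @ w         # (N,) weighted feature sum per point
    # Rescale to [0, 1] so the scores can be rendered as colours on the cloud.
    scores = scores - scores.min()
    scores = scores / (scores.max() + 1e-8)
    return scores

# Hypothetical usage: highlight the points that support the "chair" class.
# feats = network.per_point_features(points)          # (N, 1024)
# cam = point_cloud_cam(feats, network.fc.weight, class_idx=chair_id)
```

Colouring the input points by these scores indicates which regions of the shape push the classifier towards the chosen category, which is the role 3D-CAM plays in the paper.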

RELATED WORK
Feature Visualization
Attribution Visualization
METHOD
EXPERIMENTS
CONCLUSIONS
