Deep learning for 3-D point cloud perception has been a very active research topic in recent years. A current trend is to combine semantically strong and fine-grained information from different scales of intermediate representations to improve network generalization and robustness to scale variation. A key challenge is how to effectively allocate information across multiple scales. In this letter, we propose a module, named adaptive pyramid context fusion (APCF), that adaptively captures contextual information at multiple scales from a multiscale feature pyramid for the point cloud. The APCF module reweights and aggregates features from different levels of the feature pyramid via a softmax attention strategy; the allocation of information is performed adaptively level by level, first from bottom to top and then from top to bottom. To ensure both effectiveness and efficiency, we build a multiscale context-aware network, APCF-Net, by applying the proposed APCF module to the PointConv architecture. Experiments demonstrate that APCF-Net surpasses its vanilla counterpart by a large margin in both effectiveness and efficiency. In particular, APCF-Net outperforms state-of-the-art approaches on 3-D object classification and semantic segmentation tasks, achieving an overall accuracy of 93.3% on ModelNet40 and an mIoU of 63.1% on the ScanNet V2 online test.
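The abstract describes softmax-attention reweighting of pyramid levels applied first bottom-up and then top-down. The following is only a minimal sketch of that idea, assuming each pyramid level has already been resampled to a common point count and channel width; the class name, the per-level scalar attention logits, and the fusion order are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of softmax-attention fusion over pyramid levels (assumed design,
# not the paper's code). Each level is a (B, C, N) point-feature tensor that has
# already been aligned to the same number of points N and channels C.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PyramidContextFusion(nn.Module):
    def __init__(self, num_levels: int):
        super().__init__()
        # One learnable logit per (target level, source level) pair; softmax over
        # source levels yields the attention weights used to reweight features.
        self.logits = nn.Parameter(torch.zeros(num_levels, num_levels))

    def _fuse(self, feats, order):
        # Update levels one by one in the given order, each time aggregating a
        # softmax-weighted sum of all (possibly already updated) levels.
        fused = list(feats)
        for i in order:
            weights = F.softmax(self.logits[i], dim=0)
            fused[i] = sum(w * f for w, f in zip(weights, fused))
        return fused

    def forward(self, feats):
        # Bottom-up pass first, then top-down, mirroring the order described
        # in the abstract.
        n = len(feats)
        feats = self._fuse(feats, order=range(n))
        feats = self._fuse(feats, order=reversed(range(n)))
        return feats


if __name__ == "__main__":
    # Example: three pyramid levels, batch of 2, 64 channels, 1024 points each.
    levels = [torch.randn(2, 64, 1024) for _ in range(3)]
    out = PyramidContextFusion(num_levels=3)(levels)
    print([f.shape for f in out])
```

In practice the pyramid levels produced by a PointConv-style backbone have different resolutions and channel widths, so a real module would also need per-level projection and up/down-sampling before the weighted aggregation; the sketch omits these steps to keep the attention-based reweighting readable.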