Accurately delineating individual teeth in 3-dimensional tooth point clouds is an important orthodontic application. Learning-based segmentation methods rely on labeled datasets, which are typically limited in scale because annotating each tooth is labor-intensive. In this article, we propose a self-supervised pretraining framework, named Geo-Net, to boost segmentation performance by leveraging large-scale unlabeled data. The framework builds on scalable masked autoencoders and introduces 2 geometry-guided designs, a curvature-aware patching algorithm (CPA) and scale-aware reconstruction (SCR), to enhance masked pretraining for tooth point cloud segmentation. In particular, CPA assembles informative patches as the reconstruction units, guided by the estimated pointwise curvatures. To equip the pretrained encoder with scale-aware modeling capacity, SCR performs multiple reconstructions across shallow and deep layers. In vitro experiments reveal that, after pretraining with large-scale unlabeled data, the proposed Geo-Net can outperform its supervised counterparts in mean Intersection over Union (mIoU) given the same amount of annotated data. The code and data are available at https://github.com/yifliu3/Geo-Net.
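For reference, mIoU averages, over classes, the ratio of intersection to union between the predicted and ground-truth point labels. The following is a minimal sketch of that metric, not the evaluation code from the Geo-Net repository; the function name `mean_iou` and the convention of skipping classes that appear in neither the prediction nor the ground truth are assumptions made here for illustration.

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union for per-point integer labels.

    pred, target: equal-length sequences of class indices in [0, num_classes).
    Classes absent from both pred and target are skipped (an assumed
    convention; other implementations may count them as 0 or 1).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    ious = []
    for c in range(num_classes):
        # Points labeled c in both prediction and ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Points labeled c in either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union == 0:
            continue  # class c does not occur; leave it out of the mean
        ious.append(inter / union)

    if not ious:
        raise ValueError("no class in [0, num_classes) occurs in the inputs")
    return sum(ious) / len(ious)


# Toy example: six points, three classes (e.g., gingiva and two teeth).
target = [0, 0, 1, 1, 2, 2]
pred = [0, 0, 1, 2, 2, 2]
score = mean_iou(pred, target, num_classes=3)
```

In the toy example the per-class IoUs are 1, 1/2, and 2/3, so `score` equals 13/18, roughly 0.722.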