Boundary detection is a fundamental computer vision task that plays an important role in applications such as image deblurring, semantic segmentation, camouflaged object detection, and salient object detection. The thickness problem in boundary prediction is so prevalent in existing methods that non-maximum suppression (NMS) must be used as a post-processing step to thin the predicted boundaries. However, we argue that such additional post-processing breaks the end-to-end nature of detection, and the resulting predictions do not reflect the true boundary localization ability of the algorithm. Compared with a fully end-to-end approach, NMS is notably limited in rectifying erroneous predictions and in producing high-quality results close to the ground truth. Furthermore, NMS inevitably compromises the prediction of fragile boundaries. We therefore remove the NMS step from the boundary detection pipeline and identify two reasons for a detector's poor ability to localize crisp boundaries: boundary blurring and static cues. To address boundary blurring, we introduce an aliasing region loss (Lar) and a hollow region loss (Lhr) that mitigate false positives in the corresponding types of regions. To address static cues, we propose a dimension attention (DimAttn) module that enables multi-dimensional and multi-range cue interaction. Our method is plug-and-play and can be incorporated into any existing boundary detection model. Extensive experiments demonstrate that the boundaries predicted by the proposed framework are crisper and markedly superior in detecting fragile boundaries compared with NMS-based post-processing.
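The abstract does not define how the aliasing region loss (Lar) and hollow region loss (Lhr) are formulated, so the following is only an illustrative sketch of the general pattern they name: penalizing false-positive boundary responses inside a designated region mask. It is not the authors' formulation; the function name, the normalization, and the use of a dilated band around the ground-truth boundary as the region are all assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def region_false_positive_loss(pred, gt_boundary, region_mask, eps=1e-6):
    """Hypothetical region-restricted false-positive penalty (not the paper's Lar/Lhr).

    pred        : (B, 1, H, W) predicted boundary probabilities in [0, 1]
    gt_boundary : (B, 1, H, W) binary ground-truth boundary map
    region_mask : (B, 1, H, W) binary mask selecting the region of interest
    """
    # False positives: responses inside the region that are not true boundary pixels.
    fp_mask = region_mask * (1.0 - gt_boundary)
    fp_response = (pred * fp_mask).sum(dim=(1, 2, 3))
    # Normalize by region size so the penalty is comparable across images.
    return (fp_response / (fp_mask.sum(dim=(1, 2, 3)) + eps)).mean()

if __name__ == "__main__":
    B, H, W = 2, 64, 64
    pred = torch.rand(B, 1, H, W)
    gt = (torch.rand(B, 1, H, W) > 0.95).float()
    # Assumed example region: a dilated band around the true boundary.
    band = F.max_pool2d(gt, kernel_size=5, stride=1, padding=2)
    print(region_false_positive_loss(pred, gt, band).item())
```

The sketch only conveys the masking-and-normalizing pattern of suppressing spurious responses within a region; the paper's two losses presumably target different region types and may use a different penalty form.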