Abstract
Recent advances in deep convolutional neural networks have led to significant success in many computer vision tasks, including edge detection. However, existing edge detectors neglect the structural relationships among pixels, especially those among contour pixels. Inspired by human perception, this work points out the importance of learning structural relationships and proposes a novel real-time attention edge detection (AED) framework. First, an elaborately designed attention mask is employed to capture the structural relationships among pixels at edges. Second, in the decoding phase of our encoder–decoder model, a new module called dense upsampling group convolution is designed to tackle the problem of information loss due to strided downsampling. As a result, detailed structural information can be preserved even if it is damaged in the encoding phase. The proposed relationship learning module introduces negligible computational overhead, so the proposed AED meets the requirement of real-time execution with only 0.65M parameters. With the proposed model, an optimal dataset scale (ODS) F-score of 79.5 is obtained on the BSDS500 dataset at an inference speed of 105 frames per second, which is significantly faster than existing methods with comparable accuracy. In addition, state-of-the-art performance is achieved on the BSDS500 (81.6) and NYU Depth (77.0) datasets when using a heavier model.