Abstract

Automatically identifying and locating text in images and videos is an important task in computer vision. Traditional text detection methods perform poorly in complex scenes where the text does not occupy a regular rectangular area, such as curved text, widely spaced text, and text in special shapes; the main symptom is fragmentation of the detected text regions. This paper proposes a text detection method based on an edge attention mechanism that adapts better to such scenes. The method is built around an encoder-decoder structure. First, an edge attention module is designed, consisting of global attention and local attention. The global attention module perceives the features that distinguish text regions from non-text regions, while the local attention module learns text boundary information. Second, a multi-scale feature fusion process is designed to strengthen the edge information and key information of text regions. Finally, the model outputs probability maps and threshold maps, from which it generates high-precision binary maps of the text regions. Experimental verification on the public dataset shows that the proposed method significantly reduces fragmentation of the detected regions, improves the detection accuracy of text regions, and is more robust for scenes in which text regions are not regular rectangles.
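The abstract does not say how the probability map and threshold map are combined into a binary map. As an illustration only, the sketch below assumes a sigmoid-based soft binarization of the kind used in some segmentation-based text detectors, B = 1 / (1 + exp(-k(P - T))). The function names, the steepness factor `k`, and the 0.5 cutoff are hypothetical choices made for this example, not details taken from the paper.

```python
import math


def _sigmoid(x):
    # Numerically stable logistic function (avoids overflow for large |x|).
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def approximate_binary_map(prob_map, thresh_map, k=50.0):
    """Soft binarization, ASSUMED form: B = sigmoid(k * (P - T)) per pixel.

    prob_map and thresh_map are equally sized 2D lists of floats in [0, 1].
    k is a hypothetical steepness factor; larger k gives a sharper step.
    """
    if len(prob_map) != len(thresh_map):
        raise ValueError("maps must have the same number of rows")
    soft = []
    for p_row, t_row in zip(prob_map, thresh_map):
        if len(p_row) != len(t_row):
            raise ValueError("maps must have the same number of columns")
        soft.append([_sigmoid(k * (p - t)) for p, t in zip(p_row, t_row)])
    return soft


def harden(soft_map, cutoff=0.5):
    # Convert the soft map to a 0/1 mask; the 0.5 cutoff is an assumption.
    return [[1 if v > cutoff else 0 for v in row] for row in soft_map]


if __name__ == "__main__":
    prob = [[0.9, 0.2], [0.55, 0.5]]
    thresh = [[0.3, 0.3], [0.5, 0.5]]
    print(harden(approximate_binary_map(prob, thresh)))
```

A per-pixel threshold map lets the cutoff vary across the image, which is one plausible reason a model would predict it alongside the probability map rather than applying a single global threshold.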
