Abstract

In recent years, natural scene text detection has gained increasing attention because it plays an important role in many computer-related techniques. In this paper, we propose a text detection method consisting of two major steps: connected component (CC) extraction and non-text filtering. For CC extraction, we propose a multi-scale adaptive color clustering approach that can extract text from images of varying color complexity and is robust to contrast variation. For non-text filtering, we combine the text covariance descriptor (TCD) with the histogram of oriented gradients (HOG) to construct feature vectors, which are used to distinguish text from background at both the character level and the text line level. In addition, we apply a new text line generation strategy that combines refined and unrefined CCs, which can recover characters that were mistakenly eliminated and generate more complete text lines. Experiments are conducted on two publicly available datasets, ICDAR 2013 and ICDAR 2011, on which the method obtains F-measures of 0.76 and 0.75, respectively. Comparisons with several state-of-the-art text detection algorithms demonstrate that the proposed method achieves competitive performance.
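
For readers unfamiliar with the reported metric, the F-measure is conventionally the harmonic mean of precision and recall. The following is a minimal sketch under that assumption; the function name `f_measure` and the input values are hypothetical and are not taken from the paper, and the ICDAR benchmarks additionally define their own rules for matching detected boxes to ground truth, which this sketch does not cover.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Avoid division by zero when nothing was detected correctly.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical precision and recall, for illustration only.
score = f_measure(0.80, 0.70)
print(round(score, 3))
```

Because the harmonic mean penalizes imbalance, a detector cannot reach a high F-measure by maximizing precision or recall alone.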
