Abstract In this paper, we focus on text detection in natural scene images which is conducive to content-based wild image analysis and understanding. This task is still an open problem and usually includes two key issues: text candidate extraction and verification. For text candidate extraction, we introduce a color prior to guide the character candidate extraction by Maximally Stable Extremal Region (MSER). The principle of color prior acquirement is to obtain stroke-like textures with modified Stroke Width Transform (SWT), which is based on segmented edges. For text verification, the ideology of deep learning is adopted to distinguish text/non-text candidates. To improve classification accuracy, the results of specific task CNNs are fused. The proposed framework is evaluated on the ICDAR 2013 Robust Reading Competition database. It achieves F-score at 85.87%, which are superior over several state-of-the-art text detection methods.
Read full abstract