Abstract

Character segmentation plays an important role in optical character recognition (OCR). Due to the limitations of feature representation, traditional image analyzing based methods cannot well segment characters with connected or broken strokes, especially for the Chinese characters which usually have complex structures. To solve this issue, this paper proposes a novel segmentation model based on fully convolutional neural networks (FCN). The model first uses convolutional neural networks to extract spatial features, then shares them throughout the whole model. Two FCNs are used to extract character information to form a score map. Finally, character features are reused to adjust the accurate segmentation points in the score map. What’s more, to strengthen the ability of feature representation, a novel compound character feature which can well describe the characters’ outline is also proposed. The proposed method is validated on two datasets: GBSD and CASIA-HWDB-MT, against the methods proposed in the literature. Experimental results show that the proposed model outperforms state-of-the-art methods.

Highlights

  • Character segmentation is an important task in optical character recognition (OCR) process

  • In 2015, Long et al [20] firstly regarded the text detection problem as a semantic segmentation problem, and proposed using fully convolutional neural networks (FCN), which can classify each pixel in the image, taking consideration of characters’ detail

  • We propose a Chinese character segmentation method (CCSeg)

Read more

Summary

INTRODUCTION

Character segmentation is an important task in optical character recognition (OCR) process. Compared to segmentation method based on image analysis, these methods have a strong ability of feature learning These methods take segmentation as a detection task, usually used for text line segmentation [14], cannot segment Chinese characters well. In 2015, Long et al [20] firstly regarded the text detection problem as a semantic segmentation problem, and proposed using fully convolutional neural networks (FCN), which can classify each pixel in the image, taking consideration of characters’ detail This method still fails to obtain fine. We regard character segmentation as an independent task of semantic segmentation in images, and propose an improved full convolutional neural network model. Accurate segmentation points are marked in F with CMF and CNF

COMPOUND CHARACTER FEATURE EXTRACTION
CCSeg BASED ON COMPOUND CHARACTER FEATURE
TRAINING Training settings
Findings
CONCLUSION
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call