Abstract

This paper proposes a clustering-based method for word segmentation of handwritten Uyghur text images. First, each pre-processed text line image is projected in the vertical direction, which yields the initial candidate segmentation points and records the width of each blank space between connected components together with the length of each text region. A clustering algorithm then classifies the blank spaces into two categories, 'within word' gaps and 'between words' gaps, and a first merge is performed according to the clustering result. To correct the over-segmentation that remains, a threshold-based merging step that combines text region length and blank space length is applied to obtain the final segmentation points. Experimental results show that the method segments words effectively in handwritten text images.
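
The pipeline described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not name the clustering algorithm or the merging rule, so the sketch assumes a one-dimensional 2-means clustering of gap widths and a simple hypothetical rule for the second merge. All function names and the parameters `min_len` and `max_gap` are assumptions introduced here, and the input is assumed to be a binarized text line given as a list of rows with 1 for ink and 0 for background.

```python
def column_profile(image):
    """Vertical projection: number of ink pixels in each column."""
    return [sum(col) for col in zip(*image)]


def find_runs(profile):
    """Return (start, end) spans of consecutive non-empty columns, end exclusive."""
    runs, start = [], None
    for x, value in enumerate(profile):
        if value > 0 and start is None:
            start = x
        elif value == 0 and start is not None:
            runs.append((start, x))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def two_means_cut(gaps, iters=50):
    """Assumed clustering step: 1-D 2-means on gap widths.

    Returns the midpoint between the two centroids. Gaps wider than this
    value are treated as 'between words', the rest as 'within word'.
    """
    lo, hi = float(min(gaps)), float(max(gaps))
    if lo == hi:
        return hi  # only one distinct width: nothing is classified as a word gap
    for _ in range(iters):
        cut = (lo + hi) / 2
        small = [g for g in gaps if g <= cut]
        large = [g for g in gaps if g > cut]
        new_lo, new_hi = sum(small) / len(small), sum(large) / len(large)
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2


def segment_words(image):
    """First merge: join text regions separated by 'within word' gaps."""
    runs = find_runs(column_profile(image))
    if len(runs) < 2:
        return runs
    gaps = [runs[i + 1][0] - runs[i][1] for i in range(len(runs) - 1)]
    cut = two_means_cut(gaps)
    words = [runs[0]]
    for gap, (start, end) in zip(gaps, runs[1:]):
        if gap > cut:
            words.append((start, end))
        else:
            words[-1] = (words[-1][0], end)
    return words


def merge_short_pieces(words, min_len, max_gap):
    """Hypothetical second merge for over-segmentation.

    Joins two neighbours when either one is shorter than min_len and the
    blank space between them is narrower than max_gap. The abstract only
    says that text length and blank length are combined in a threshold,
    so this exact rule is an assumption.
    """
    if not words:
        return []
    merged = [words[0]]
    for start, end in words[1:]:
        prev_start, prev_end = merged[-1]
        gap = start - prev_end
        short = (end - start) < min_len or (prev_end - prev_start) < min_len
        if short and gap < max_gap:
            merged[-1] = (prev_start, end)
        else:
            merged.append((start, end))
    return merged
```

For a line whose text regions are separated by gaps of 1, 5 and 1 columns, `segment_words` places the cut at 3 columns and returns two word spans. In practice the values of `min_len` and `max_gap` would have to be tuned on real handwriting samples.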
