Abstract

The rich glyph styles of ancient Chinese characters and the complex layouts of ancient Chinese books lower segmentation accuracy, which in turn degrades retrieval and recognition results. To address this, an algorithm for layout image analysis of ancient Chinese books and for Chinese character image segmentation is proposed. Initial segmentation results are obtained by applying the projection method to the book layout, and connected component analysis of those results identifies the roughly divided blocks that are under-segmented or over-segmented. For under-segmentation caused by adhesive (touching) Chinese characters, an improved K-means clustering method segments the adhesive blocks into single-character images. For over-segmentation caused by the separation of character components, a method based on the interval-valued hesitant fuzzy set is proposed. This method analyzes the features of the connected components in a block and characterizes the over-segmented connected component. The hesitant fuzzy distances between each of the other connected components and the standard merge evaluation interval number are then calculated in sequence. The connected component with the smallest distance is merged first with the over-segmented connected component, and the process repeats until no over-segmented connected component remains in the block. In experiments, the segmentation accuracy was 89.94%.
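The initial projection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a binarized page stored as a list of rows with 1 for ink and 0 for background, and the function names `projection_profile`, `split_on_gaps`, and `segment_page` as well as the `min_ink` threshold are hypothetical. The page is first cut into text columns along empty pixel columns, and each text column is then cut into rough blocks along empty pixel rows.

```python
def projection_profile(binary_image, axis):
    """Count ink pixels per pixel column (axis=0) or per pixel row (axis=1)."""
    if axis == 0:
        return [sum(col) for col in zip(*binary_image)]
    return [sum(row) for row in binary_image]


def split_on_gaps(profile, min_ink=1):
    """Return (start, end) pairs, end exclusive, for runs with at least min_ink ink."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value >= min_ink and start is None:
            start = i                      # a run of ink begins
        elif value < min_ink and start is not None:
            runs.append((start, i))        # a gap closes the current run
            start = None
    if start is not None:                  # run reaches the image border
        runs.append((start, len(profile)))
    return runs


def segment_page(binary_image):
    """Rough blocks as (top, bottom, left, right), listed left to right, top to bottom."""
    blocks = []
    column_runs = split_on_gaps(projection_profile(binary_image, axis=0))
    for left, right in column_runs:
        column = [row[left:right] for row in binary_image]
        for top, bottom in split_on_gaps(projection_profile(column, axis=1)):
            blocks.append((top, bottom, left, right))
    return blocks


if __name__ == "__main__":
    page = [
        [1, 1, 0, 1],
        [0, 0, 0, 0],
        [1, 0, 0, 1],
    ]
    for block in segment_page(page):
        print(block)
```

Blocks produced this way are only rough: touching characters stay in one block (under-segmentation) and detached components of one character fall into separate blocks (over-segmentation), which is why the abstract follows the projection step with connected component analysis, clustering, and merging.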

Highlights

  • With continuous advances in ancient Chinese book research, computer technology is now widely used to address related problems

  • Because ancient Chinese books were handwritten with complex layouts and ancient Chinese characters have rich glyph styles, it is necessary to analyze the layout images of ancient Chinese books

  • Tian et al.: Ancient Chinese Character Image Segmentation Based on Interval-Valued Hesitant Fuzzy Set


Summary

Introduction

INDEX TERMS Ancient Chinese books, Chinese character segmentation, interval-valued hesitant fuzzy set, k-means clustering, layout analysis, layout image.
