Abstract

Ancient Chinese books exist in very large numbers, and digitizing them by manual input is laborious. To enable efficient text entry for such books, this paper proposes a digitization model based on a convolutional recurrent neural network (CRNN). We use the handwritten character data of the Chinese Ancient Handwritten Characters Database (CASIA-AHCDB) to generate a data set of vertical text images containing variable-length sequences of handwritten characters. After iterative training, the character-level accuracy (similarity measured by text edit distance) reached 97.54% on the training set and 96.58% on the test set. The method can be used to recognize and proofread the text of ancient books.
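As an illustration of an accuracy measure based on text edit distance, the following is a minimal sketch rather than the paper's exact metric. It assumes that similarity is one minus the Levenshtein distance divided by the length of the longer string; that normalization, and the function names `edit_distance` and `character_accuracy`, are assumptions made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def character_accuracy(predicted: str, reference: str) -> float:
    """Assumed similarity: 1 - distance / length of the longer string."""
    longest = max(len(predicted), len(reference))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - edit_distance(predicted, reference) / longest
```

For example, `character_accuracy("天地元黄", "天地玄黄")` gives 0.75 under this definition, since one of four characters is substituted.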
