Historical Corpora Correlation based on RNN and DCNN

Wei Lin, Zhaoquan Lin

doi:10.1088/1742-6596/1873/1/012048

Open Access

https://doi.org/10.1088/1742-6596/1873/1/012048

Journal: Journal of Physics: Conference Series
Publication Date: Apr 1, 2021
License type: cc-by

Affiliation: Fuzhou University

#Recurrent Neural Network #Deep Convolutional Network

Abstract

Correcting digitized historical corpora is a crucial task for historical research; however, scan quality, book layout, and visual similarity between characters can all degrade recognition quality. Optical character recognition (OCR) is at the forefront of digitization projects for cultural heritage preservation. Its main task is to convert characters from their visual form into their textual representation. In this paper, we propose a model combining a recurrent neural network (RNN) and a deep convolutional network (DCNN) to correct OCR transcription errors. Experiments on a German-language historical book corpus show that the model is robust in capturing diverse OCR transcription errors.

Full Text