A Deep Learning Based Offline Optical Character Recognition Model for Printed Ottoman Turkish

Ahmed Al-Khaffaf, Ümit Atila

doi:10.47577/technium.v18i.10252

Abstract

Developing efficient optical character recognition (OCR) systems for printed Ottoman text is difficult because existing OCR models built for Arabic have limitations that hinder their use on Ottoman documents. These models have been shown to perform poorly when used to recognize Ottoman text, and even when they are trained specifically on Ottoman text, the results remain insufficient. This study proposes a deep learning model for analyzing printed Ottoman Turkish documents in the Matbu font. Using an end-to-end trainable architecture that integrates convolutional neural networks (CNNs) with bidirectional long short-term memory (BiLSTM) units, it offers an efficient solution to the Ottoman OCR problem. Experimental results show that the proposed model achieved overall accuracy, sensitivity, and precision of 99.6%, 87.1%, and 93.3%, respectively, on the test dataset.
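
For readers unfamiliar with the three reported metrics, the following is a minimal sketch of their standard definitions in terms of confusion-matrix counts. The function names and the example counts are hypothetical and chosen only for illustration; the abstract does not state how the per-class values were averaged into the overall scores, so this should not be read as the authors' evaluation code.

```python
# Standard definitions of accuracy, sensitivity (recall), and precision
# from confusion-matrix counts. All names and counts here are illustrative
# assumptions, not values or code from the paper.

def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all decisions that were correct."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0

def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true instances of a class that were recognized."""
    denom = tp + fn
    return tp / denom if denom else 0.0

def precision(tp: int, fp: int) -> float:
    """Fraction of predictions for a class that were correct."""
    denom = tp + fp
    return tp / denom if denom else 0.0

# Hypothetical one-vs-rest counts for a single character class.
tp, fp, fn, tn = 8, 1, 2, 989
print(f"accuracy:    {accuracy(tp, fp, fn, tn):.3f}")
print(f"sensitivity: {sensitivity(tp, fn):.3f}")
print(f"precision:   {precision(tp, fp):.3f}")
```

When counts are taken per class in a one-vs-rest fashion, the true-negative count is usually large, so accuracy can sit well above sensitivity and precision, as in the hypothetical counts above.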

Full Text