Abstract

Optical character recognition (OCR) is a technology that converts different types of documents or images into searchable, editable, and analyzable data. Recognizing Sanskrit characters from images of text documents is among the most difficult OCR tasks because of similarities in the shapes of distinct letters, the complexity of the script, non-uniqueness in the representation, and the very large number of symbols. Sanskrit is written in the Devanagari script. A variety of approaches exist for recognizing characters in a scanned image, and current research initiatives highlight the importance of, and the need for, work on recognizing printed and handwritten documents written in Sanskrit. This paper reviews the state of the various scripts in use, from the medieval to the present era, explores the prospects for digital recognition of printed texts, and points towards future trends in developing restoration software for Sanskrit scripts. The challenge stems from the number of languages and the diversity of their scripts, and the scarcity of digitized linguistic resources makes the task harder still. The paper also highlights the characteristics and challenges of recognizing scripts of Sanskrit origin. At present, digital recognition is largely limited to simple numerals and isolated characters. In addition, this review examines OCR systems that perform word recognition and convert various types of Sanskrit documents or images into text using deep learning architectures, including convolutional neural networks (CNN) and bidirectional long short-term memory (BiLSTM) networks.
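To make the point about script complexity concrete, the following minimal sketch (not taken from the paper) uses Python's standard `unicodedata` module to break the Devanagari spelling of the word "Sanskrit" into its Unicode code points. The helper names `describe_devanagari` and `count_combining_marks` are hypothetical and chosen only for illustration. The sketch shows that a short written word already mixes base letters with combining signs, which is one reason isolated-character recognition does not carry over directly to whole words.

```python
import unicodedata

# "Sanskrit" written in Devanagari, spelled out as code points so the
# example does not depend on the file's encoding.
WORD = "\u0938\u0902\u0938\u094d\u0915\u0943\u0924"


def describe_devanagari(text):
    """Return (code point, Unicode name, general category) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"), unicodedata.category(ch))
        for ch in text
    ]


def count_combining_marks(text):
    """Count characters whose Unicode general category is a mark (Mn, Mc, Me)."""
    return sum(1 for ch in text if unicodedata.category(ch).startswith("M"))


if __name__ == "__main__":
    for code_point, name, category in describe_devanagari(WORD):
        print(code_point, category, name)
    print("combining marks:", count_combining_marks(WORD))
```

An OCR system has to recover both the base letters and the attached signs, and their visual arrangement on the page does not always follow the order in which they are stored in text.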

Highlights

  • Sanskrit was used to write the majority of India's finest literary works

  • The Vedas, which were written in Sanskrit, represent the spirit of Indian culture and history

  • The essential ideas of Buddhism were recorded in Sanskrit


Summary

Introduction

Sanskrit was used to write the majority of India's finest literary works. The Vedas, which were written in Sanskrit, represent the spirit of Indian culture and history. Early scientific and mathematical research written in Sanskrit is gaining prominence in many academic groups, and scientists from all across the world are devoting more and more effort to deciphering these historic texts. Sanskrit texts include a wealth of information about science, mathematics, Hindu mythology, Indian civilisation, and culture. A major barrier is the unavailability of correctly digitised and labelled versions of Sanskrit texts. It is critical to digitise such historical texts, which are valuable for study and represent a vital element of India's culture and tradition.

Bhavesh Kataria, Dr Harikrishna B. March-April 2019; 5(2): 1362-1383

