Abstract

Optical character recognition (OCR) systems convert printed or handwritten scripts into digital text formats such as ASCII or Unicode. Urdu-like script languages such as Urdu, Punjabi and Sindhi are widely spoken, especially in Asia. An enormous amount of printed and handwritten text in these languages exists and needs to be converted into computer-understandable formats for knowledge extraction. This study proposes an OCR system based on the deep extreme learning machine (DELM), a recently proposed variant of the extreme learning machine (ELM), to improve the character recognition rate for Urdu-like script languages. The proposed DELM-based character recognition model optimizes the OCR process by reducing the overhead of the pre-processing, segmentation and feature extraction layers. In evaluation, the proposed system achieved 98.75% training accuracy with an RMSE of 1.492 × 10⁻³ and 98.12% testing accuracy with an RMSE of 1.587 × 10⁻³, using six DELM hidden layers. The results show that the proposed system attains a higher recognition rate than any previously proposed OCR system for Urdu-like script languages. The technique is applicable to machine-printed text and partially useful for handwritten text as well. This study will aid the development of more accurate OCR software systems for Urdu-like scripts in the future.
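As a rough illustration of the two metrics reported above, the following minimal sketch computes classification accuracy and RMSE from one-hot targets and network outputs. It assumes RMSE is averaged over every output unit of every sample, which the abstract does not specify; the function names and the toy data are hypothetical and are not taken from the study.

```python
import math

def rmse(targets, outputs):
    """Root-mean-square error over all output units of all samples (assumed definition)."""
    squared = [(t - o) ** 2
               for t_row, o_row in zip(targets, outputs)
               for t, o in zip(t_row, o_row)]
    return math.sqrt(sum(squared) / len(squared))

def accuracy(targets, outputs):
    """Fraction of samples whose highest-scoring output unit matches the target class."""
    hits = sum(
        max(range(len(o_row)), key=o_row.__getitem__)
        == max(range(len(t_row)), key=t_row.__getitem__)
        for t_row, o_row in zip(targets, outputs)
    )
    return hits / len(targets)

# Hypothetical toy data: three samples, three character classes (one-hot targets).
targets = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
outputs = [[0.9, 0.1, 0.0], [0.2, 0.7, 0.1], [0.1, 0.6, 0.3]]

print(f"accuracy = {accuracy(targets, outputs):.4f}")
print(f"RMSE     = {rmse(targets, outputs):.4f}")
```

In this toy case the third sample is misclassified, so accuracy falls below 1 while the RMSE reflects how far all outputs sit from their targets; the two metrics measure different things and are reported side by side for that reason.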
