Read Textual features in Images and convert to Editable form by extended use of Artificial Neural Networks, Deep learning and Maximally Stable Extremal Region techniques

S Joseph Gladwin, C Vinoth Kumar

doi:10.1088/1742-6596/1921/1/012033

Abstract

Much information is now stored in images, and there is a need to convert it into an editable format. The objective of this paper is to develop a user-friendly tool that extracts text from video, or recognizes well-written handwriting in a scanned document, and converts it to an editable form. The proposed methodology is robust and performs well even in the presence of layout distortion. The approach uses Optical Character Recognition (OCR), Artificial Neural Networks (ANN), Deep Learning and Maximally Stable Extremal Regions (MSER) techniques for text detection. Text is extracted from video by first converting the video to frames and then applying the Structural Similarity Index Measure (SSIM). The prototype is developed in MATLAB and provided with a GUI, which is deployable on a workstation. The GUI is used to define the Region of Interest (ROI), specify the required text layout, highlight specific portions of the image, select the appropriate language, and export the text to a Word document or Notepad, where it can be edited.
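The abstract does not state how SSIM is applied to the extracted frames. One plausible reading, offered here only as an assumption, is that SSIM is used to skip frames that are nearly identical to the last processed frame, so that text recognition is not repeated on the same content. The sketch below illustrates that idea in Python rather than the authors' MATLAB. It uses a simplified single-window (global) SSIM on flattened grayscale frames, whereas standard SSIM is usually averaged over local windows. The function names `global_ssim` and `select_distinct_frames` and the threshold of 0.95 are hypothetical and do not come from the paper.

```python
def global_ssim(a, b, dynamic_range=255.0):
    """Single-window SSIM between two flattened grayscale frames.

    Simplification: statistics are taken over the whole frame instead of
    being averaged over local sliding windows.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("frames must be non-empty and of equal size")
    n = len(a)
    # Standard stabilising constants: C1 = (0.01 L)^2, C2 = (0.03 L)^2
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((x - mu_a) * (x - mu_a) for x in a) / n
    var_b = sum((y - mu_b) * (y - mu_b) for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a * mu_a + mu_b * mu_b + c1) * (var_a + var_b + c2)
    return numerator / denominator


def select_distinct_frames(frames, threshold=0.95):
    """Return indices of frames that differ enough from the last kept frame.

    Assumption (not from the paper): a frame is treated as a near-duplicate
    when its SSIM with the last kept frame is at or above `threshold`.
    """
    kept = []
    for index, frame in enumerate(frames):
        if not kept or global_ssim(frames[kept[-1]], frame) < threshold:
            kept.append(index)
    return kept


if __name__ == "__main__":
    # Four tiny synthetic 4x4 "frames", flattened to 16 pixels each.
    dark = [10] * 16
    bright = [200] * 16
    print(select_distinct_frames([dark, dark, bright, bright]))
```

Under this reading, only the frames returned by `select_distinct_frames` would be passed on to text detection and recognition. The threshold controls how aggressively repeated frames are discarded and would need tuning on real video.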

Full Text