Multi-Script Text Detection from Image Using FRCNN

Mukesh M Goswami, Suman Mitra, Tanvi Goswami, Nidhi J Dadiya

doi:10.1142/s2717554522500035

Abstract

Text in images is one of the most common cues for finding the information we are looking for. To retrieve text from an image, the first step is to detect where the text is. Text detection has a wide range of applications, such as translation, smart car driving systems, information retrieval, indexing of multimedia archives, sign board reading, and many others. Multilingual text adds further complexity to this computer vision problem. India is a multilingual country, so multi-script text can be found almost everywhere. Text in different scripts differs in format, strokes, width, and height, and universal features for such an environment are unknown and difficult to determine. Detecting multi-script text in images is therefore an important yet unsolved problem. In this work, we propose a Faster RCNN-based method for detecting English, Hindi, and Gujarati text in images. Faster RCNN is a state-of-the-art approach for object detection. Because it is designed for relatively large objects, whereas text regions are typically small, its parameters are tuned to meet the objective of multi-script text detection. Since no standard public dataset includes English, Gujarati, and Hindi text, the dataset is created by collecting images.
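The abstract does not say which parameters were tuned. As an illustration only, one setting that is commonly adjusted when applying Faster RCNN to small objects is the set of anchor box scales and aspect ratios used by the region proposal network. The sketch below is an assumption, not the authors' configuration: the function name `make_anchors` and the text-oriented scales and ratios are hypothetical, and only the default values (scales 128, 256, 512 with ratios 1:2, 1:1, 2:1) come from the original Faster RCNN design.

```python
import math


def make_anchors(scales, aspect_ratios):
    """Return (width, height) pairs, one per scale/ratio combination.

    Each anchor has area scale**2; the ratio is taken as width / height
    (a convention chosen for this sketch).
    """
    anchors = []
    for scale in scales:
        for ratio in aspect_ratios:
            width = scale * math.sqrt(ratio)
            height = scale / math.sqrt(ratio)
            anchors.append((width, height))
    return anchors


# Defaults from the original Faster RCNN design: large, roughly square boxes.
default_anchors = make_anchors([128, 256, 512], [0.5, 1.0, 2.0])

# Hypothetical values for text: smaller boxes, wider than they are tall.
# These numbers are illustrative and are not taken from the paper.
text_anchors = make_anchors([16, 32, 64], [1.0, 2.0, 4.0])

for width, height in text_anchors:
    print(f"{width:6.1f} x {height:5.1f}")
```

Smaller scales let proposals match short words, and ratios above 1 favor the wide, flat boxes typical of text lines. Suitable values would depend on the image resolution and the scripts in the dataset.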

Full Text