Abstract

In recent years, many real-world applications and institutions have generated huge amounts of unstructured data, such as images containing text, receipts, invoices, forms, statements, and contracts. The rich and detailed information carried by this text is of great significance in computer vision applications such as driverless cars, assistance for blind and visually impaired people, label and package detection, and automatic number plate recognition. Research effort and progress in this domain have grown recently because of its importance to data analysis and computer vision. Unstructured data poses diverse challenges, including image sensor noise, varying viewing angles, blur, lighting conditions, resolution, and non-planar objects. Our research objectives are i) to detect and recognize text in the data, ii) to handle the diversity and variability of text in natural scenes, iii) to explore various datasets, and iv) to deal with the various issues that arise in scene text detection. To tackle this problem, we propose a robust scene text detection and recognition method with adaptive text region representation, built on a deep learning pipeline that uses OpenCV with the EAST algorithm for detection and Tesseract for recognition. A recurrent neural network-based adaptive text region representation is proposed for text region refinement, in which a pair of boundary points is predicted at each time step until no new points are found. In this way, text regions in an image are detected and represented with an adaptive number of boundary points.
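The boundary-point refinement described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `collect_boundary_points`, `pairs_to_polygon`, and `make_toy_predictor` are hypothetical, the `max_steps` cap is an assumption, and a scripted toy predictor stands in for the recurrent network. It only shows the control flow of predicting one pair of boundary points per time step until no new pair is produced.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Point = Tuple[float, float]
PointPair = Tuple[Point, Point]  # (point on top edge, point on bottom edge)


def collect_boundary_points(
    predict_next_pair: Callable[[Sequence[PointPair]], Optional[PointPair]],
    max_steps: int = 32,  # assumed safety cap, not taken from the paper
) -> List[PointPair]:
    """Call the predictor once per time step until it returns None."""
    pairs: List[PointPair] = []
    for _ in range(max_steps):
        pair = predict_next_pair(pairs)
        if pair is None:  # stop signal: no new boundary points found
            break
        pairs.append(pair)
    return pairs


def pairs_to_polygon(pairs: Sequence[PointPair]) -> List[Point]:
    """Order points as top edge left-to-right, then bottom edge right-to-left."""
    top = [pair[0] for pair in pairs]
    bottom = [pair[1] for pair in pairs]
    return top + bottom[::-1]


def make_toy_predictor(scripted: Sequence[PointPair]):
    """Hypothetical stand-in for the recurrent model: replays scripted pairs."""
    def predict(previous: Sequence[PointPair]) -> Optional[PointPair]:
        step = len(previous)
        return scripted[step] if step < len(scripted) else None
    return predict


# A slightly curved text line described by three pairs of boundary points.
scripted = [((0, 0), (0, 10)), ((20, 2), (20, 12)), ((40, 0), (40, 10))]
pairs = collect_boundary_points(make_toy_predictor(scripted))
polygon = pairs_to_polygon(pairs)
```

Because the loop ends when the predictor signals that no points remain, a straight text line could be described with two pairs while a curved one would use more, which is what makes the number of boundary points adaptive.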
