A Combination of DWT CLAHE and Wiener Filter for Effective Scene to Text Conversion and Pronunciation

Saeed Mian Qaisar, Noofa Hammad, Raviha Khan

doi:10.1007/s42835-020-00461-2

Abstract

This work presents an effective system for converting scene images to text and pronouncing the result. The image is first enhanced with a combination of the Discrete Wavelet Transform (DWT), Contrast Limited Adaptive Histogram Equalization (CLAHE), a Wiener filter, and an adaptive weighted average. Maximally Stable Extremal Regions (MSERs) are then used to detect candidate text regions, and geometrical and contour-based approaches filter out the non-text MSERs. The remaining text candidates are grouped using connected components. In the next step, Optical Character Recognition (OCR) recognizes the text, and the Microsoft text-to-speech synthesizer pronounces it. The system's applicability is tested on the standard Robust Reading Competition dataset. The designed method achieves 93% precision in text segmentation and 89.9% precision in end-to-end recognition.
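To make the filtering and grouping stages more concrete, the following is a minimal sketch in plain Python of how candidate regions might be screened geometrically and then grouped into text lines. It is not the authors' implementation: the `Box` type, the function names, and every threshold (aspect ratio, extent, gap factor) are illustrative assumptions, and the regions are assumed to have already been produced by an MSER detector.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical candidate region: bounding box plus its pixel count."""
    x: int
    y: int
    w: int
    h: int
    area: int  # number of pixels inside the region, not w * h


def looks_like_text(box, min_aspect=0.1, max_aspect=3.0,
                    min_extent=0.2, max_extent=0.9):
    """Geometric screen; all thresholds are assumed values for illustration."""
    if box.w <= 0 or box.h <= 0:
        return False
    aspect = box.w / box.h                # width-to-height ratio
    extent = box.area / (box.w * box.h)   # fraction of the box that is filled
    return (min_aspect <= aspect <= max_aspect
            and min_extent <= extent <= max_extent)


def are_neighbours(a, b, gap_factor=0.5):
    """Two boxes are linked if they overlap vertically and sit close together."""
    vertical_overlap = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    if vertical_overlap <= 0:
        return False
    # Horizontal gap between the boxes (negative when they overlap).
    gap = max(a.x, b.x) - min(a.x + a.w, b.x + b.w)
    return gap <= gap_factor * max(a.h, b.h)


def group_candidates(boxes):
    """Group linked boxes as connected components, using union-find."""
    parent = list(range(len(boxes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if are_neighbours(boxes[i], boxes[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)
    # Order each group left to right, as the characters would be read.
    return [sorted(g, key=lambda b: b.x) for g in groups.values()]
```

Under these assumptions, a long thin region such as a horizontal rule fails the aspect-ratio test, while neighbouring character-sized regions on the same line end up in one group that could then be passed to an OCR engine.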

Full Text