Abstract

This paper proposes a method for extracting text regions from video images with accuracy high enough for reading by OCR. Previous studies separate text from its background by threshold-based binarization, relying on the fact that the text is brighter than the background. One way to determine the threshold is Shio's application of Otsu's method, which assumes that the intensities within each local block follow a two-class distribution of background and characters. However, when the background intensity varies widely, as it does in video images, these assumptions do not necessarily hold, and such methods fail to yield a good threshold. Moreover, in practice they cannot extract characters accurately enough for OCR, because the intensity around the characters is not necessarily high owing to shadowing, edge elimination, and signal conversion processing. This paper therefore proposes a method that extracts only the text portions by first extracting high‐reliability areas as text, robustly estimating the intensity distribution of the text from them, and then extending those areas based on the estimated distribution. Experimental results show that the proposed method extracts text portions with higher accuracy and better OCR readability than the conventional methods. © 2005 Wiley Periodicals, Inc. Syst Comp Jpn, 36(9): 87–96, 2005; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/scj.10148
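To make the contrast in the abstract concrete, the following is a minimal Python sketch of the two ideas it mentions: Otsu's threshold as the conventional baseline, and a seed-and-grow extraction. The abstract does not describe the paper's actual estimation procedure, so everything here is an illustrative assumption rather than the authors' algorithm: the function names `otsu_threshold` and `seed_and_grow`, the parameters `seed_min`, `k`, and `sigma_floor`, the use of a plain mean and standard deviation as the "estimated distribution", and 4-connected growing are all hypothetical stand-ins.

```python
from collections import deque
from statistics import mean, pstdev


def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    Pixels with intensity <= t form the dark class, the rest the bright class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count of the dark class
    sum0 = 0  # intensity sum of the dark class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def seed_and_grow(image, seed_min, k=3.0, sigma_floor=20.0):
    """Grow a text mask outward from very bright seed pixels.

    Assumption (not from the paper): seeds are pixels >= seed_min, the text
    intensity distribution is summarized by the seeds' mean and standard
    deviation, and a 4-connected neighbour joins the mask if its intensity
    is at least mean - k * max(std, sigma_floor).
    """
    h, w = len(image), len(image[0])
    mask = [[False] * w for _ in range(h)]
    seeds = [(r, c) for r in range(h) for c in range(w)
             if image[r][c] >= seed_min]
    if not seeds:
        return mask
    values = [image[r][c] for r, c in seeds]
    lower = mean(values) - k * max(pstdev(values), sigma_floor)
    for r, c in seeds:
        mask[r][c] = True
    queue = deque(seeds)
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < h and 0 <= nc < w and not mask[nr][nc]
                    and image[nr][nc] >= lower):
                mask[nr][nc] = True
                queue.append((nr, nc))
    return mask


# Toy 3x5 frame: a bright stroke with dimmer edges, plus one bright
# background pixel (bottom right) that is not connected to the stroke.
image = [
    [10, 10, 10, 10, 10],
    [10, 200, 250, 200, 10],
    [10, 10, 10, 10, 200],
]
flat = [p for row in image for p in row]
t = otsu_threshold(flat)
global_mask = [[p > t for p in row] for row in image]
grown_mask = seed_and_grow(image, seed_min=240)
```

In this toy frame a single global threshold also marks the isolated bright background pixel as text, whereas the grown mask keeps only the stroke and its dimmer edges, because growth starts from the high-confidence seed and spreads only through connected pixels. This illustrates the general motivation stated in the abstract, not the paper's reported results.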
