Abstract

Most conventional character extraction methods comprise three steps: binarization (background determination), region segmentation, and region identification. Incorrect binarization adversely affects the subsequent segmentation and identification results. This is a particular problem for color documents printed with regions of different background colors, because binarization does not yield an effective threshold and the subsequent segmentation and identification steps then fail. In addition, conventional region segmentation methods are time-consuming for large document images, and conventional region identification methods are applied bottom-up to the preceding segmentation results. This study presents an intelligent method that addresses these problems by integrating background determination, region segmentation, and region identification to extract characters from color documents with highlight regions. The results demonstrate that the proposed method is more effective and efficient than other methods in terms of binarization results, extraction results, and computational performance.
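
The thresholding problem described above can be illustrated with a toy example. The sketch below is not the method proposed in this study; it only shows, on a hypothetical one-row grayscale "document", why a single global threshold can misclassify a highlighted background as text, while thresholds computed per background region do not. The pixel values, the midpoint thresholding rule, and the function names `midpoint_threshold` and `binarize` are all assumptions made for illustration.

```python
def midpoint_threshold(pixels):
    """Return the midpoint between the darkest and brightest gray level."""
    return (min(pixels) + max(pixels)) / 2


def binarize(pixels, threshold):
    """Mark pixels darker than the threshold as text (1), others as background (0)."""
    return [1 if p < threshold else 0 for p in pixels]


# Hypothetical gray levels (0 = black, 255 = white); not data from the study.
white_region = [240, 240, 30, 240, 30, 240]   # bright background, dark text
highlight_region = [90, 90, 20, 90, 20, 90]   # darker highlighted background, dark text
row = white_region + highlight_region

# Ground truth: 1 marks a text pixel.
truth = [0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0]

# One threshold for the whole row: the highlighted background falls below it.
global_result = binarize(row, midpoint_threshold(row))

# One threshold per background region: each region is separated on its own scale.
regional_result = (
    binarize(white_region, midpoint_threshold(white_region))
    + binarize(highlight_region, midpoint_threshold(highlight_region))
)

print("global  :", global_result)
print("regional:", regional_result)
print("truth   :", truth)
```

With these assumed values, the global threshold labels the entire highlighted region as text, whereas the per-region thresholds reproduce the ground truth. Real documents would additionally require the background regions to be found first, which is the coupling between background determination and segmentation that the abstract refers to.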
