Abstract

Text extraction from an image is a challenging task in computer vision, and it plays an important role in providing useful and valuable information. This paper discusses various approaches, such as the Adaptive Local Connectivity Map (ALCM), Expectation Maximization (EM), Maximum Likelihood (ML), Markov Random Field (MRF), the Spiral Run Length Smearing Algorithm (SRLSA), and the Curvelet transform, for extracting text from scanned book covers, journals, multi-color documents, handwritten documents, ancient documents, and newspaper document images. Text line segmentation is a major component of document image analysis. Text in documents depends on various factors such as language, style, font, size, color, background, orientation, fluctuating text lines, and crossing or touching text lines. This paper compares the performance of several existing methods proposed by researchers for document text extraction on the basis of recall rate, precision rate, processing time, accuracy, etc.

Keywords: Optical Character Recognition (OCR), Morphological Component Analysis (MCA), Undecimated Wavelet Transform (UWT), Discrete Wavelet Transform (DWT), Connected Component Analysis (CCA), Adaptive Local Connectivity Map (ALCM), Expectation Maximization (EM), Maximum Likelihood (ML), Spiral Run Length Smearing Algorithm (SRLSA), Resolution Enhancement (RE), Markov Random Field (MRF), Maximum A-posteriori Probability (MAP), Block Energy Analysis (BEA), Support Vector Machine (SVM), Thin Line Coding (TLC), Constrained Run Length Algorithm (CRLA).
