Abstract

OCR (optical character recognition) is a technology commonly used for pattern recognition in artificial intelligence and computing. With OCR, scanned documents can be converted into editable documents, which can then be used in various research areas. In this paper, we present a character segmentation technique that can segment simple characters, skewed characters, and broken characters. Character segmentation is a very important phase in any OCR process because its output serves as the input to later phases such as character recognition. If the character segmentation phase fails, recognizing the corresponding character becomes very difficult or nearly impossible.

Highlights

  • OCR is a technology that enables us to convert different types of scanned documents into editable documents

  • The OCR process includes binarization, noise detection and removal, and feature extraction: after segmentation, features can be extracted for the corresponding characters using various feature extraction techniques

  • Character segmentation is the procedure that extracts the individual characters from the output of word segmentation


Summary

Introduction

OCR is a technology that enables us to convert different types of scanned documents into editable documents. It is part of an electronic document analysis system and is used to extract text from scanned images of typewritten, handwritten, or printed text. The OCR process starts from a scanned image and proceeds through several steps. One of these is feature extraction: after the segmentation process, features can be extracted for the corresponding characters using various feature extraction techniques.
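
To make the segmentation step concrete, here is a minimal sketch of a common baseline, vertical projection profiling, applied to a binarized image. It is not the technique proposed in this paper; the function names `binarize` and `segment_columns`, the fixed threshold of 128, and the list-of-lists image representation are assumptions made for illustration only.

```python
from typing import List, Tuple


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Mark pixels darker than the (assumed) threshold as ink (1), others as background (0)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def segment_columns(binary: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, that contain ink.

    Each range is a candidate character: a maximal run of columns whose
    vertical projection (ink pixels per column) is non-zero.
    """
    if not binary:
        return []
    width = len(binary[0])
    # Vertical projection profile: count of ink pixels in each column.
    profile = [sum(row[x] for row in binary) for x in range(width)]

    segments = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                      # a run of ink columns begins
        elif count == 0 and start is not None:
            segments.append((start, x))    # an empty column closes the run
            start = None
    if start is not None:
        segments.append((start, width))    # run reaches the right edge
    return segments


# Tiny grayscale word image: 0 is black ink, 255 is white background.
gray = [
    [255, 0, 0, 255, 255, 0, 255],
    [255, 0, 255, 255, 255, 0, 0],
]
print(segment_columns(binarize(gray)))
```

A baseline like this only separates characters that have an empty column between them. Skewed characters whose columns overlap would be merged into one segment, and a broken character with a gap would be split in two, which are the cases the technique in this paper is meant to handle.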

Character Segmentation
Skewed Character
Our Approach
Findings
Conclusion
