Abstract

Labor shortages have recently occurred in various industries, and responding to them requires improving a range of operations. In recent years, the Internet has developed rapidly, and information that used to be kept on paper is increasingly converted into data and managed on personal computers. Converting paper documents into digital data saves space, enables quick search, and supports security measures. However, the character recognition rate remains a challenge in this conversion. There are many kinds of characters in the world, and it is very difficult for any system to recognize them with 100% accuracy. One of the essential tools for data conversion is optical character recognition (OCR). Although OCR is widely used, it may misrecognize characters depending on the environment and equipment. High-precision recognition is possible for characters with good print quality, but the recognition rate is low for low-quality characters such as those in faxed and copied documents. Each misrecognition creates additional work and reduces work efficiency. Therefore, we carried out development with two research goals: improving the character recognition rate and improving business efficiency. In this research, we construct a simple misrecognition correction system using a database.

Highlights

  • Converting paper-printed characters to digital text data is still an important task that many companies are working on

  • Labor shortages have occurred in various industries

  • There are challenges in the character recognition rate when converting to digital data


Summary

Proposal of character correction method using database

For characters with good print quality, the OCR recognition rate can be close to 100%, but low-quality characters such as those in faxed and copied documents obtain a low recognition rate. This study proposes an additional approach that uses a database to correct the text of the first-stage OCR recognition result. The Internet has developed rapidly, and information has been converted into digital data. However, the character recognition rate remains a challenge in this conversion. This work targets improving the character recognition rate for a paper-printed order form sent by fax from overseas. If the printed document has fixed text items, a common knowledge base can be constructed by using a database. Using such common knowledge, the OCR result can be verified and corrected. SQLite is a database that is stored in a single file and has a straightforward structure.
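The database-based correction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the table name `items`, the helper functions `build_knowledge_base` and `correct_ocr_text`, the sample item names, and the similarity threshold are all hypothetical, and the string-similarity measure from Python's `difflib` stands in for whatever matching rule the paper actually uses. The sketch assumes only that the fixed text items of the form are stored in an SQLite table and that each OCR result is compared against them.

```python
import sqlite3
from difflib import SequenceMatcher


def build_knowledge_base(item_names):
    """Store the fixed item names of a form in an in-memory SQLite table.

    The schema (one table `items` with one column `name`) is an assumption
    made for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (name TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO items (name) VALUES (?)", [(n,) for n in item_names]
    )
    conn.commit()
    return conn


def correct_ocr_text(conn, ocr_text, threshold=0.6):
    """Return the stored item name most similar to `ocr_text`.

    If no stored name reaches `threshold` (a hypothetical cut-off),
    the OCR text is returned unchanged.
    """
    best_name, best_score = ocr_text, 0.0
    for (name,) in conn.execute("SELECT name FROM items"):
        # Case-insensitive similarity ratio in the range [0, 1].
        score = SequenceMatcher(None, ocr_text.upper(), name.upper()).ratio()
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else ocr_text


if __name__ == "__main__":
    # Hypothetical fixed items of an order form.
    kb = build_knowledge_base(["QUANTITY", "UNIT PRICE", "DELIVERY DATE"])
    # "1" misread for "I", a typical error on faxed documents.
    print(correct_ocr_text(kb, "QUANT1TY"))
```

Because the set of valid item names is small and fixed, a linear scan over the table is sufficient here; a larger knowledge base would likely need an index or a pre-filtering step before the similarity comparison.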

Introduction
Pattern Matching with Database
Output the item name that has the largest vote number
CONE BASE
Conclusions
Findings
Result and Consideration
