Abstract
Labor shortages have recently occurred in various industries, and responding to them requires improving many kinds of operations. In recent years the Internet has developed rapidly and information has increasingly been converted into data: information on paper is digitized and managed on personal computers, because converting paper documents into digital data saves space, enables quick search, and supports security measures. However, the character recognition rate remains a challenge when converting to digital data. There are many kinds of characters in the world, and it is very difficult for a system to recognize all of them with 100% accuracy. One of the essential tools for data conversion is optical character recognition (OCR). Although OCR is widely used, it may misrecognize characters depending on the environment and equipment. High-precision recognition is possible for characters with good print quality, but the recognition rate is low for low-quality characters such as those in faxed and copied documents. Misrecognition creates additional work and lowers work efficiency. We therefore carried out development with two research goals: improving the character recognition rate and improving business efficiency. In this research, we construct a simple misrecognition correction system using a database.
Highlights
Converting paper-printed characters to digital text data is still an important task that many companies are working on
Labor shortages have occurred in various industries
There are challenges in the character recognition rate when converting to digital data
Summary
For well-printed characters, the OCR recognition rate is close to 100%, but low-quality characters such as those in faxed and copied documents obtain a low recognition rate. This study proposes an additional, database-driven approach that corrects the first-stage OCR recognition result. The Internet has developed rapidly, and information has been converted into digital data. There are challenges in the character recognition rate when converting to digital data. This work targets improving the character recognition rate for a paper-printed order form sent by fax from overseas. If the printed document has fixed text items, a common knowledge base can be constructed using a database. Using such common knowledge, the OCR result can be verified and corrected. SQLite is a file-based database with a straightforward structure.
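The verification and correction step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the table name `known_terms`, the function names, the sample field labels, the similarity measure (Python's `difflib`), and the 0.8 cutoff are all assumptions chosen for illustration. The only ideas taken from the text are that fixed text items are stored in an SQLite database and that the first-stage OCR result is checked against them.

```python
import sqlite3
from difflib import get_close_matches


def build_knowledge_base(terms):
    """Store the fixed text items of the form in an in-memory SQLite table.

    The table name and schema are hypothetical; the source does not give one.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE known_terms (term TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO known_terms (term) VALUES (?)", [(t,) for t in terms]
    )
    conn.commit()
    return conn


def correct_field(conn, ocr_text, cutoff=0.8):
    """Verify an OCR result against the knowledge base and correct it if possible.

    An exact match is accepted as verified. Otherwise the closest known term
    is returned when its similarity ratio reaches `cutoff` (an assumed value).
    If nothing is close enough, the OCR text is returned unchanged.
    """
    row = conn.execute(
        "SELECT term FROM known_terms WHERE term = ?", (ocr_text,)
    ).fetchone()
    if row is not None:
        return row[0]

    candidates = [r[0] for r in conn.execute("SELECT term FROM known_terms")]
    matches = get_close_matches(ocr_text, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else ocr_text


if __name__ == "__main__":
    # Hypothetical fixed items on an order form.
    kb = build_knowledge_base(["Quantity", "Unit Price", "Order Number"])
    for raw in ["Quant1ty", "0rder Number", "Unit Price"]:
        print(raw, "->", correct_field(kb, raw))
```

Returning the OCR text unchanged when no close match exists keeps the sketch from overwriting free-form fields, such as names or amounts, that are not part of the fixed vocabulary.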