Abstract

Financial reimbursement is widely regarded as a cumbersome process of extracting information from paper invoices. Scanner aided intelligent reimbursement methods that work on scanned images have been developed to reduce manpower and processing time, but they lack flexibility as well as portability. As smart phones have become ubiquitous in daily life, it is natural to develop a smart phone aided intelligent reimbursement system (IRIS). However, it is difficult to recognize text reliably on the distorted and oblique invoice images that smart phones capture in natural scenes, and offset or misplaced text prevents output in a standard format. To solve these problems, we propose an effective smart phone aided intelligent reimbursement method based on deep learning. First, we preprocess the distorted images and propose a Hough transform accumulator (HTA) algorithm, which adds an accumulator to the Hough transform to achieve tilt correction and image recovery for the distorted image. Second, to remove redundant information from invoice images taken in natural scenes, we apply the You Only Look Once version 3 (YOLOv3) algorithm to accurately locate, segment, and crop the key information areas of the invoice image. Third, we adopt the connectionist text proposal network (CTPN) to detect the important text block areas in invoice images, and densely connected convolutional networks (DenseNets) to recognize the detected text. The connectionist temporal classification (CTC) algorithm is added to the DenseNet network to align the input and output sequences of the text, so that accurate optical character recognition (OCR) can be performed on the cropped block area images. Finally, we propose a new algorithm, Regular Matching and Recursive Segmentation (RMRS), which combines regular-expression matching with recursive segmentation to produce standard formatted output from misaligned or offset information. The average accuracy of optical character recognition over all block areas on the invoice is as high as 0.991, with a minimum of 0.962.
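
The CTC alignment step mentioned above can be illustrated with best-path (greedy) decoding, the simplest way to turn per-frame class scores into a character string. This is a minimal sketch and not the authors' implementation: the function name `ctc_greedy_decode`, the three-symbol alphabet, and the choice of index 0 for the blank symbol are all assumptions made for illustration.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Best-path CTC decoding of a sequence of per-frame class scores."""
    # 1. Take the highest-scoring class at every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # 2. Merge consecutive repeats of the same class.
    collapsed = [label for label, _ in groupby(best_path)]
    # 3. Drop blanks and map the remaining indices to characters.
    return "".join(alphabet[label] for label in collapsed if label != blank)


# Hypothetical alphabet: index 0 is the blank, then the digits "1" and "2".
alphabet = ["-", "1", "2"]
frames = [
    [0.1, 0.8, 0.1],    # "1"
    [0.2, 0.7, 0.1],    # "1" again (repeat, merged)
    [0.9, 0.05, 0.05],  # blank separates the two "1" characters
    [0.1, 0.8, 0.1],    # "1"
    [0.1, 0.2, 0.7],    # "2"
    [0.1, 0.1, 0.8],    # "2" again (repeat, merged)
    [0.9, 0.05, 0.05],  # blank
]
print(ctc_greedy_decode(frames, alphabet))  # prints 112
```

The blank symbol is what lets the decoder distinguish a repeated character ("11") from one character that spans several frames.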

Highlights

  • Deep learning aided intelligent processing techniques play an increasingly important role in invoice reimbursement systems [1]. (The associate editor coordinating the review of this manuscript and approving it for publication was Takuro Sato.)

  • For invoice images captured at an inclination angle by intelligent terminals in natural scenes, we propose a Hough Transform Accumulator (HTA) algorithm, which can automatically correct the inclination of invoice images with different sizes, pixel counts, resolutions, and backgrounds

  • To output misaligned or offset text in a standard format, we propose a novel algorithm that combines regular matching and recursive segmentation (RMRS) and formats the text information produced by optical character recognition (OCR)
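
The tilt-correction idea in the highlights can be sketched with a classic Hough accumulator. The summary does not give the details of HTA, so the code below is only the textbook voting scheme under stated assumptions: the input is a list of edge points `(x, y)` in ordinary mathematical axes (y pointing up), the dominant line is an invoice border or text baseline, and the function name `estimate_skew_degrees` and its parameters are hypothetical.

```python
import math
from collections import Counter


def estimate_skew_degrees(points, angle_range=45, angle_step=1.0, rho_step=1.0):
    """Estimate the tilt (degrees) of the dominant near-horizontal line.

    For each candidate tilt, every point votes for the quantized distance
    rho of the line through it with that tilt. The tilt whose accumulator
    holds the largest single peak is returned.
    """
    best_angle, best_votes = 0.0, -1
    steps = int(round(2 * angle_range / angle_step)) + 1
    for i in range(steps):
        skew = -angle_range + i * angle_step
        # A line tilted by `skew` has its normal at skew + 90 degrees.
        theta = math.radians(skew + 90.0)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        # Accumulator over rho for this angle: collinear points share a bin.
        accumulator = Counter(
            round((x * cos_t + y * sin_t) / rho_step) for x, y in points
        )
        votes = max(accumulator.values())
        if votes > best_votes:
            best_angle, best_votes = skew, votes
    return best_angle


# Synthetic edge points lying on a line tilted by 10 degrees.
slope = math.tan(math.radians(10))
edge_points = [(x, x * slope) for x in range(100)]
print(estimate_skew_degrees(edge_points))
```

Rotating the image by the negative of the returned angle would then level the invoice. In image coordinates, where y points down, the sign of the angle flips.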

Summary

INTRODUCTION

It is natural to introduce deep learning into the intelligent recognition of text information in invoice images [29]. A major challenge is to accurately segment [30] the invoice images captured by intelligent terminals and to extract the key fields that concern the financial department, together with the information corresponding to each field. After all segmented [40] regional images have been extracted, optical character recognition (OCR) [41] is carried out on the information in the invoice image, and the characters are output as text. The recognized text, however, may be misaligned or offset relative to the fields it belongs to. To solve this problem, we propose a novel algorithm that combines regular matching and recursive segmentation (RMRS) and outputs the text information after OCR in a standard format. The smart phone aided IRIS runs on the GPU with high processing speed, and the average OCR accuracy over all block areas of the invoice reaches 0.991, with a minimum of 0.962.
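
One way to picture the combination of regular matching and recursive segmentation is sketched below. It is an illustrative guess at the general idea, not the RMRS algorithm itself, whose details this summary does not give. The field names, the regular expressions in `FIELD_PATTERNS`, and the function `segment` are all hypothetical. The sketch finds the first field pattern that matches the OCR string, emits that field, and recurses on the text to its left and right.

```python
import re

# Hypothetical field patterns; a real invoice layout would need its own.
FIELD_PATTERNS = {
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "amount": re.compile(r"\d+\.\d{2}"),
    "invoice_no": re.compile(r"\d{8}"),
}


def segment(text, patterns=FIELD_PATTERNS):
    """Recursively split OCR text into (field, value) pairs."""
    text = text.strip()
    if not text:
        return []
    for name, pattern in patterns.items():
        match = pattern.search(text)
        if match:
            # Split around the match and process both remainders the same way.
            left, right = text[:match.start()], text[match.end():]
            return (
                segment(left, patterns)
                + [(name, match.group())]
                + segment(right, patterns)
            )
    # No pattern matched: keep the text so that nothing is silently lost.
    return [("unmatched", text)]


# OCR output in which the date and the amount have run together.
print(segment("12345678 2019-05-01128.50"))
# prints [('invoice_no', '12345678'), ('date', '2019-05-01'), ('amount', '128.50')]
```

Because every split is driven by a pattern rather than by position, fields that are shifted or fused in the OCR output still end up under the right key.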

SYSTEM ARCHITECTURE
EXPERIMENTS
CONCLUSION