Abstract

Invoice recognition has long been an active research direction in image recognition. Existing invoice recognition methods suffer from low recognition rates on structured invoices, slow recognition, and difficulty in mobile deployment. To address these issues, we propose a structured invoice recognition method based on the StrucTexT model. The method applies knowledge distillation: the teacher model, StrucTexT, guides the student model, StrucTexT_slim, which increases recognition speed and reduces model size without lowering the recognition rate. It thereby addresses the slow recognition and large model size that make traditional methods difficult to deploy on mobile devices. Experimental results show that the proposed model achieves an accuracy of over 94% on the public SROIE and FUNSD datasets and over 95% on a self-built structured invoice dataset. In addition, the method recognizes invoices 30% faster than other models (YOLOv4, LeNet-5, and Tesseract-OCR), and the model size is compressed by about 20%.
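To illustrate the teacher-student idea, the following is a minimal sketch of a commonly used knowledge distillation objective: a temperature-softened KL term between teacher and student outputs, combined with ordinary cross-entropy on the true label. The abstract does not state the loss actually used to train StrucTexT_slim, so this formulation, the function names `softmax` and `distillation_loss`, and the parameters `temperature` and `alpha` are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Hypothetical distillation objective (not taken from the paper).

    alpha weights the soft-target term; (1 - alpha) weights the
    hard-label cross-entropy term.
    """
    # Soft targets: KL(teacher || student) at the given temperature.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * (math.log(pt) - math.log(ps))
             for pt, ps in zip(p_teacher, p_student) if pt > 0)

    # Hard target: cross-entropy of the student against the true label.
    ce = -math.log(softmax(student_logits)[label])

    # The T^2 factor keeps the soft-term gradient scale comparable
    # across temperatures.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


if __name__ == "__main__":
    teacher = [1.0, 2.0, 4.0]
    student = [0.5, 1.5, 2.0]
    print(distillation_loss(student, teacher, label=2))
```

In this sketch, a student whose logits match the teacher's incurs no soft-target penalty, so only the cross-entropy term remains.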
