UIT-MLReceipts: A Multilingual Benchmark for Detecting and Recognizing Key Information in Receipts

Khang Tan Tran Minh Nguyen

doi:10.21553/rev-jec.330

Abstract

The Fourth Industrial Revolution (Industry 4.0) has opened up considerable development potential in Vietnam. Within this movement, digitization is necessary to transform many traditional economic sectors, and it provides valuable digital data for automation applications and decision-making processes. In the retail industry in particular, data has long played a vital role. Hence, digitizing documents such as receipts can help businesses with management and enterprise development. Nevertheless, the digital transformation process remains slow because of a shortage of clean datasets for this type of document. This paper introduces a new dataset, UIT-MLReceipts, for extracting key information from receipts. The task comprises two sub-tasks: Receipt Text Detection (RTD) and Receipt Text Recognition (RTR). For RTD, we thoroughly evaluate four state-of-the-art detection methods on our dataset: Faster R-CNN, YOLOv3, YOLOF, and Faster R-CNN with Precise RoI-Pooling. For RTR, we experiment with two text recognition baselines: RobustScanner and SATRN. Experimental results indicate that Faster R-CNN with Precise RoI-Pooling outperforms the other detectors, achieving the best mean Average Precision (mAP) of 51.6% on the RTD task. On the RTR task, SATRN performs better than RobustScanner.
