Abstract
Cross-border logistics has grown rapidly in recent years. Cross-border courier orders come in many formats with diverse layouts, and there is no standardized way of writing addresses and other information on them. Current automated recognition models struggle with such images. In this paper, we present SwFB, an end-to-end trainable neural network model based on feature enhancement that converts raw images directly into structured text information. Its feature enhancement module, Co-G-Ma, is built from a convolutional neural network (CNN), a gated recurrent unit (GRU), and multi-head attention. We collected real cross-border logistics courier order images from a postal company in Zhejiang province, China, to build our dataset, COFIE, and conducted a series of experiments to explore the impact of hyperparameters on the extraction of key field text. We also compared our model with other models on the publicly available CORD and SROIE datasets. The experimental results demonstrate that our model achieves advanced performance in extracting visual text information and generalizes well.