MT-YOLOv5: Mobile terminal table detection model based on YOLOv5

Zixin Ning, Yanqin Yang, Jing Yang, Xinjiao Wu

doi:10.1088/1742-6596/1978/1/012010

Open Access

Abstract

Table detection is an important task in optical character recognition (OCR). Table detection for desktop applications now largely meets commercial requirements. As digitization advances, demand for table detection from individual users has gradually increased, and there is an urgent need for a table detection method that can be deployed on handheld devices. This paper proposes a mobile terminal table detection model based on YOLOv5. First, we use YOLOv5 as the main framework of the model. To address connection redundancy in the YOLOv5 backbone, we replace that backbone with the comparably strong but lighter MobileNetV2 while retaining the YOLOv5 multi-scale detection head. In addition, we use deformable convolution to compensate for the limited non-linear modeling capacity of the lightweight model. The model is evaluated on the ICDAR 2019 dataset. The results show that, compared with the baseline model, it halves the number of parameters and increases detection speed by 47%. The model also reaches 35.25 FPS on ordinary Android phones.
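As a rough illustration of why a MobileNetV2-style backbone is lighter, the sketch below compares the weight count of a standard convolution with that of a depthwise separable convolution, the building block MobileNetV2 relies on. The layer size (128 input channels, 256 output channels, 3 x 3 kernel) is a hypothetical example and is not taken from the paper, and the helper names are illustrative only. The last function simply converts the reported 35.25 FPS into a per-frame latency.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = c_in * k * k      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def frame_latency_ms(fps):
    """Average time per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


if __name__ == "__main__":
    # Hypothetical layer size, chosen only for illustration (not from the paper).
    c_in, c_out, k = 128, 256, 3
    standard = conv_params(c_in, c_out, k)
    separable = depthwise_separable_params(c_in, c_out, k)
    print("standard conv weights:      ", standard)
    print("depthwise separable weights:", separable)
    print("reduction factor:            %.2f" % (standard / separable))
    # 35.25 FPS is the Android figure reported in the abstract.
    print("latency at 35.25 FPS:        %.2f ms" % frame_latency_ms(35.25))
```

The per-layer reduction shown here is larger than the overall halving reported in the abstract, which is expected: the detection head is retained, and MobileNetV2 blocks also include expansion layers that this sketch does not model.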

Full Text