Abstract

Problems such as complex image backgrounds, low image quality, diverse text forms, and similar or common character layouts in different script categories in natural scenes pose great challenges to scene script identification. This paper proposes a new Res2Net-based improved script identification method, namely FAS-Res2Net. In the feature extraction part, the feature pyramid network (FPN) module is introduced, which is beneficial to aggregate the geometric feature information extracted by the shallow network and the semantic feature information extracted by the deep network. Integrating the Adaptive Spatial Feature Fusion (ASFF) module is beneficial to obtain local feature information for optimal weight fusion. In addition, the global feature information of the image is extracted by introducing the swin transformer coding block, which makes the extracted feature information more abundant. In the classification part, the convolutional classifier is used to replace the traditional Linear classification, and the classification confidence of each category is output, which improves the identification efficiency. The improved algorithm achieved identification rates of 94.7% and 96.0% on public script identification datasets SIW-13 and CVSI-2015, respectively, which verified the superiority of the method.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.