Abstract

Signboards are important landmarks that identify the services available to a local community. Sighted people can easily infer the meaning of a signboard from its distinctive shape; visually impaired people, who need an assistive system to guide them to destinations or to help them understand their surroundings, cannot. Designing accurate assistive systems remains a challenge: computer vision struggles to recognize signboards because their designs vary widely and combine text with images, and the datasets available for training strong models are scarce. In this paper, we propose a novel framework that automatically detects and recognizes signboard logos, and we use Google Street View to collect a dataset of signboard images from Taiwan's streets. The framework is built on a domain adaptation scheme that not only reduces the discrepancy between the source and target datasets but also transfers the source features most relevant to the target dataset. We further add non-local blocks and attention mechanisms, which we call deep attention networks, to the model. Extensive experiments on both our dataset and public datasets demonstrate the effectiveness of the proposed method, which outperforms state-of-the-art methods across all evaluation metrics.
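The abstract does not give implementation details for the deep attention network. As an illustration only, the sketch below shows a standard embedded-Gaussian non-local block (Wang et al., 2018), the building block the abstract names, written in PyTorch under our own assumptions: the class name `NonLocalBlock`, the `reduction` parameter, and the toy tensor shapes are illustrative choices, not the paper's specification.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NonLocalBlock(nn.Module):
    """Minimal embedded-Gaussian non-local block (Wang et al., 2018).

    Each spatial position attends to every other position of the feature
    map, so distant text and logo regions of a signboard can exchange
    context. This is a generic sketch, not the paper's exact module.
    """
    def __init__(self, in_channels, reduction=2):
        super().__init__()
        inter = max(in_channels // reduction, 1)
        self.theta = nn.Conv2d(in_channels, inter, kernel_size=1)  # query projection
        self.phi = nn.Conv2d(in_channels, inter, kernel_size=1)    # key projection
        self.g = nn.Conv2d(in_channels, inter, kernel_size=1)      # value projection
        self.out = nn.Conv2d(inter, in_channels, kernel_size=1)    # restore channel count

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)  # (b, hw, c')
        k = self.phi(x).flatten(2)                    # (b, c', hw)
        v = self.g(x).flatten(2).transpose(1, 2)      # (b, hw, c')
        attn = F.softmax(q @ k, dim=-1)               # (b, hw, hw) attention over positions
        y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)                        # residual connection

# Usage: drop the block onto a backbone feature map.
feats = torch.randn(2, 256, 14, 14)
print(NonLocalBlock(256)(feats).shape)  # torch.Size([2, 256, 14, 14])
```

The residual connection lets the block be inserted into a pretrained backbone without disturbing its initial behavior, which is why non-local blocks are typically added this way.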
