Abstract
Text is one of humanity's most important inventions and has played a key role in human life since the Dark Ages. Text in an image is closely related to the scene or product it depicts and is widely used in vision-based applications. In this paper, we address the problem of visual understanding with text. The main focus is on combining textual and visual cues in a deep neural network. First, the text in the image is recognized and classified. We then combine the attended word embedding with the visual feature vector, and the combined representation is optimized by a CNN for fine-grained image classification. We carried out experiments on a dataset of soft drinks in Pakistan. The results show that the system achieves significant performance, which can potentially benefit real-world applications such as product search.
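The fusion step described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the dot-product attention, the function names (`softmax`, `attend_words`, `fuse`), and the assumption that word embeddings and the visual feature vector share the same dimension are all hypothetical choices made for illustration. In the paper the fused vector is further optimized by a CNN, which is omitted here.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend_words(visual, word_embeddings):
    """Weight each word embedding by its similarity to the visual vector.

    Assumption: dot-product attention, with word embeddings and the
    visual vector in the same dimension. The paper does not specify this.
    """
    scores = [sum(v * w for v, w in zip(visual, emb)) for emb in word_embeddings]
    weights = softmax(scores)
    dim = len(word_embeddings[0])
    attended = [
        sum(wt * emb[i] for wt, emb in zip(weights, word_embeddings))
        for i in range(dim)
    ]
    return attended, weights

def fuse(visual, word_embeddings):
    """Concatenate the visual vector with the attended word embedding."""
    attended, _ = attend_words(visual, word_embeddings)
    return list(visual) + attended

# Toy example: one visual vector and two recognized words.
visual = [1.0, 0.0]
words = [[1.0, 0.0], [0.0, 1.0]]
fused = fuse(visual, words)
```

In this toy example the first word is more similar to the visual vector, so it receives the larger attention weight, and the fused vector has twice the dimension of the visual vector.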
Highlights
Fine-grained image classification [1, 28] is an emerging real-world problem that has received great attention from research communities around the globe.
Fine-grained image classification [1] is being applied to natural scene classification.
Yingying Zhu et al. [14] discuss in detail the recent advances and future trends in scene text detection and recognition.
Summary
Fine-grained image classification [1, 28] is an emerging real-world problem that has received great attention from research communities around the globe. Applying fine-grained image classification [28] techniques to a soft drink dataset has received limited attention from researchers, even though it has exciting applications in restaurants and shops, where orders could be placed automatically once a specific brand of soft drink is about to go out of stock. Yao et al. [11] present a unified framework for detecting and recognizing text in images that handles text of different orientations. They also provide a dictionary search method to correct recognition errors. The works of Shivakumara et al. [12] and C. Yao et al. [13] recognize the significance of multi-oriented text detection and recognition for the research community. Yingying Zhu et al. [14] discuss in detail the recent advances and future trends in scene text detection and recognition.
More From: International Journal of Advanced Network, Monitoring and Controls