Using Text and Visual Cues for Fine-Grained Classification

Zaryab Shaker, Muhammad Adeel Ahmed Tahir, Xiao Feng

doi:10.21307/ijanmc-2021-026

Open Access

https://doi.org/10.21307/ijanmc-2021-026

Abstract

Text is an important invention of humanity and has played a key role in human life ever since the dark ages. Text in an image is closely related to the scene or product it appears on and is widely used in vision-based applications. In this paper, we address the problem of visual understanding with text. The main focus is combining textual cues and visual cues in a deep neural network. First, the text is recognized and classified from the image. Then we combine the attended word embedding and the visual feature vector, which are optimized by a CNN for fine-grained image classification. We carried out the experiments on a soft drink dataset in Pakistan. The results show that the system achieves significant performance, which can be potentially beneficial for real-world applications, e.g. product search.
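As a rough illustration of the fusion step described in the abstract, the following minimal sketch weights word embeddings by their relevance to a visual feature vector and then joins the two. It is not the authors' implementation: the bilinear scoring matrix `W`, the concatenation in `fuse`, the toy vectors, and all function names are assumptions made for this example, and the CNN optimization step is omitted.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend_words(visual, word_embeddings, W):
    """Return the attention-weighted word embedding and the weights.

    Assumption: relevance is scored bilinearly as w^T (W v), where W is a
    hypothetical projection with one row per word-embedding dimension.
    """
    projected = [dot(row, visual) for row in W]           # W v
    scores = [dot(w, projected) for w in word_embeddings]  # w^T W v
    weights = softmax(scores)
    dim = len(word_embeddings[0])
    attended = [
        sum(weights[i] * word_embeddings[i][d] for i in range(len(word_embeddings)))
        for d in range(dim)
    ]
    return attended, weights

def fuse(visual, attended):
    """Assumption: fusion by simple concatenation of the two vectors."""
    return list(visual) + list(attended)

# Toy inputs (made up): a 3-d visual feature and two 2-d word embeddings
# standing in for words recognized on a product label.
visual = [0.2, 0.9, 0.1]
words = {"cola": [1.0, 0.0], "fresh": [0.0, 1.0]}
W = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]

attended, weights = attend_words(visual, list(words.values()), W)
fused = fuse(visual, attended)
```

In a full system, a vector like `fused` would be passed to a trainable classification layer, and `W` would be learned jointly with the rest of the network rather than fixed by hand.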

Highlights

Fine-grained image classification [1, 28] is an emerging real-world problem that has received great attention from research communities around the globe.
Fine-grained image classification [1] is being applied to natural scene classification.
Yingying Zhu et al. [14] discuss in detail the recent advances and future trends in scene text detection and recognition.

Summary

INTRODUCTION

Fine-grained image classification [1, 28] is an emerging real-world problem that has received great attention from research communities around the globe. Applying fine-grained image classification [28] techniques to a soft drink dataset has received limited attention from researchers, even though it has exciting applications in restaurants and shops, where orders could be placed automatically once a specific brand of soft drink is about to run out of stock. Yao et al. [11] present a unified framework for detecting and recognizing text in images that handles text of different orientations. They also provide a dictionary search method to correct recognition errors. The works of Shivakumara et al. [12] and C. Yao et al. [13] recognize the significance of multi-oriented text detection and recognition to the research community. Yingying Zhu et al. [14] discuss in detail the recent advances and future trends in scene text detection and recognition.

Text detection and recognition

Fine-Grained Classification

Attention Mechanism

Multimodal Fusion

PROPOSED METHODOLOGY

Dataset

Implementation Details

CONCLUSION

Journal: International Journal of Advanced Network, Monitoring and Controls
Publication Date: Jan 1, 2021
Citations: 2
License type: CC BY 4.0

Similar Papers

The application of two-level attention models in deep convolutional neural network for fine-grained image classification
Tianjun Xiao ... Yichong Xu
01 Jun 2015

Integrating Scene Text and Visual Appearance for Fine-Grained Image Classification
Xiang Bai ... Mingkun Yang
IEEE Access | VOL. 6
01 Jan 2018

Con-Text: Text Detection for Fine-Grained Object Classification
Sezer Karaoglu ... Jan C Van Gemert
IEEE Transactions on Image Processing | VOL. 26
24 May 2017

Text Semantic Steganalysis Based on Word Embedding
Xin Zuo ... Huanhuan Hu
01 Jan 2018
