Abstract
Recent progress in image understanding and AIC (Automatic Image Captioning) has led many researchers to use AI (Artificial Intelligence) models to assist blind people. AIC combines principles from computer vision and NLP (Natural Language Processing) to generate natural-language descriptions of an observed image. This work presents a new deep-learning-based assistive technology that helps blind people distinguish food items when shopping for groceries online. The proposed AIC model involves four steps: data collection, selection of non-captioned images, extraction of appearance and texture features, and generation of automatic image captions. First, the data is collected from two public sources, and the non-captioned images are selected using ARO (Adaptive Rain Optimization). Next, appearance features are extracted using the SDM (Spatial Derivative and Multi-scale) approach, and texture features are extracted using WPLBP (Weighted Patch Local Binary Pattern). Finally, the captions are generated automatically using ECANN (Extended Convolutional Atom Neural Network). The ECANN model combines the CNN (Convolutional Neural Network) and LSTM (Long Short-Term Memory) architectures to implement a caption-reuse system that selects the most accurate caption. The loss in the ECANN architecture is minimized using the AAS (Adaptive Atom Search) optimization algorithm. The model is implemented in Python and evaluated on two grocery datasets, Freiburg Groceries and the Grocery Store Dataset. The proposed ECANN model achieved 99.46% accuracy on the Grocery Store Dataset and 99.32% accuracy on the Freiburg Groceries dataset. Its performance is compared with that of existing models to verify its advantage over them.
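To make the texture-feature step more concrete, the sketch below computes a plain 8-neighbour local binary pattern (LBP) histogram with NumPy. It is only an illustration of the basic LBP idea that WPLBP builds on: the patch weighting of WPLBP is not described in this abstract and is not reproduced here, and the function names `lbp_codes` and `lbp_histogram` are hypothetical.

```python
import numpy as np


def lbp_codes(gray):
    """Basic 8-neighbour LBP codes for the interior pixels of a 2-D array.

    Plain LBP only; this is NOT the weighted-patch variant (WPLBP).
    """
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    # Neighbour offsets (dy, dx), clockwise starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        # Shifted view of the image aligned with the interior pixels.
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes |= (neighbour >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes


def lbp_histogram(gray):
    """Normalised 256-bin histogram of LBP codes, usable as a texture descriptor."""
    codes = lbp_codes(gray)
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()
```

A histogram like this could be computed for each image (at least 3 x 3 pixels) and passed, together with appearance features, to a downstream captioning model; how the paper combines the two feature types is not stated in the abstract.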