Abstract

Deriving a meaningful caption for a traffic sign image is a challenging task that has not been addressed in prior literature. It is gaining traction with the introduction of new tools and techniques in the fields of Natural Language Processing and Computer Vision. In this article we discuss several concepts, ranging from using YUV color-space images to train an artificial neural network, to a single, merged neural architecture that functionally integrates image-feature information with textual knowledge. To the best of our knowledge, we are the first to apply image captioning to traffic signs. Our model was evaluated on the GTSRB dataset and obtained a BLEU-1 score of 0.91. While many image captioning methods achieve good results by stacking a feature model on top of a sequential model, our traffic sign captioning work achieved better results by defining two concurrent models, one for feature extraction and one for sequence processing, and fusing them into a single architecture. Because it uses pre-trained models and an LSTM-based network, our architecture trains faster and is capable of learning long-term dependencies.
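
To illustrate the merge-style design the abstract describes, here is a minimal sketch in Keras: one branch consumes a pre-extracted image-feature vector, a parallel branch runs an LSTM over the partial caption, and the two are fused before predicting the next word. The layer sizes, vocabulary size, caption length, and feature dimension below are illustrative assumptions, not the authors' reported configuration.

```python
# Minimal merge-architecture captioner sketch (assumed hyperparameters).
from tensorflow.keras.layers import Input, Dense, Embedding, LSTM, Dropout, add
from tensorflow.keras.models import Model

vocab_size = 5000   # assumed vocabulary size
max_len = 20        # assumed maximum caption length
feat_dim = 2048     # assumed pre-trained CNN feature dimension

# Image-feature branch: projects extracted CNN features into a joint space.
img_in = Input(shape=(feat_dim,))
img_vec = Dropout(0.5)(img_in)
img_vec = Dense(256, activation="relu")(img_vec)

# Sequence branch: embeds the partial caption and runs an LSTM over it.
seq_in = Input(shape=(max_len,))
seq_vec = Embedding(vocab_size, 256, mask_zero=True)(seq_in)
seq_vec = Dropout(0.5)(seq_vec)
seq_vec = LSTM(256)(seq_vec)

# Merge: fuse the two concurrent branches and predict the next word.
merged = add([img_vec, seq_vec])
merged = Dense(256, activation="relu")(merged)
out = Dense(vocab_size, activation="softmax")(merged)

model = Model(inputs=[img_in, seq_in], outputs=out)
model.compile(loss="categorical_crossentropy", optimizer="adam")
```

At inference time such a model is called repeatedly: the caption is seeded with a start token, the model predicts the next word given the image features and the words so far, and the loop stops at an end token or at `max_len`. Keeping the image branch separate from the sequence branch is what distinguishes this merge design from architectures that inject image features directly into the language model's state.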
