Abstract

Image captioning aims to automatically generate descriptive text for an image. In this paper, we obtain a more effective model than the baseline by replacing its pre-trained feature extractor, VGG, with DenseNet, and its LSTM decoder with a GRU. Experimental results show that the modified model trains in less time than the original model while maintaining its captioning performance.
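
One plausible contributor to the shorter training time is that a GRU layer has three gate blocks where an LSTM layer has four. The abstract does not state this reasoning, so the following is only an illustrative sketch. It assumes the textbook single-bias formulation of both cells, and the layer sizes (512 inputs, 512 hidden units) are hypothetical values, not figures from the paper.

```python
def gate_block_params(input_size: int, hidden_size: int) -> int:
    """Parameters in one gate block: input weights, recurrent weights, one bias vector."""
    return hidden_size * (input_size + hidden_size) + hidden_size


def lstm_params(input_size: int, hidden_size: int) -> int:
    """An LSTM layer has four blocks: input, forget, and output gates plus the cell candidate."""
    return 4 * gate_block_params(input_size, hidden_size)


def gru_params(input_size: int, hidden_size: int) -> int:
    """A GRU layer has three blocks: update and reset gates plus the candidate state."""
    return 3 * gate_block_params(input_size, hidden_size)


if __name__ == "__main__":
    # Hypothetical sizes chosen for illustration only; the paper's sizes are not given here.
    input_size, hidden_size = 512, 512
    lstm = lstm_params(input_size, hidden_size)
    gru = gru_params(input_size, hidden_size)
    print(f"LSTM parameters: {lstm}")
    print(f"GRU parameters:  {gru}")
    print(f"GRU / LSTM ratio: {gru / lstm:.2f}")
```

Under these assumptions the recurrent layer of the GRU holds three quarters of the parameters of the LSTM at the same sizes. Some libraries keep two bias vectors per block, which changes the absolute counts but not the 3:4 ratio. The feature-extractor swap (VGG to DenseNet) affects training cost separately and is not modelled here.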
