Abstract

In recent years, researchers in computer vision and natural language processing have concentrated on automatically generating natural language descriptions for videos. Although video comprehension has many applications, such as video retrieval and indexing, video captioning remains difficult because video content is complex and diverse. The relationship between video content and natural language sentences is still not fully understood, and several approaches for improved video analysis are being developed. Because of their strong performance and fast computation, deep learning approaches are increasingly being applied to video processing. This research focuses on the end-to-end structure of a deep learning based encoder-decoder network for generating natural language descriptions of video sequences. We explore a CNN-RNN model paired with beam search to generate captions for the MSVD dataset, and compare the results of beam search with those of greedy search. The captions generated by this model are generally grammatically incorrect. Our paper focuses on correcting those grammatical errors using an encoder-decoder model. Grammatical errors include spelling mistakes, incorrect use of articles, prepositions, pronouns, nouns, etc., as well as poor sentence construction. Using beam search with k=3, the captions generated by our algorithm achieve a BLEU score of 0.72. After the generated captions are passed through a grammar error correction mechanism, the BLEU score improves to 0.76, a relative increase of 5.55%. The BLEU score drops as the value of k decreases, but the time taken to generate captions decreases as well.

Index Terms: Video captioning, end-to-end structure, MSVD dataset, encoder-decoder model, beam search, grammar correction
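
The difference between greedy search and beam search mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's CNN-RNN decoder: the function `toy_step`, its probability table, and the `<s>`/`</s>` markers are hypothetical stand-ins for a trained model's next-word distribution. Greedy search is treated here as the special case k=1.

```python
import math
from typing import Callable, Dict, List, Tuple

def beam_search(step_fn: Callable[[List[str]], Dict[str, float]],
                k: int, max_len: int,
                start: str = "<s>", end: str = "</s>") -> Tuple[List[str], float]:
    """Return the best (tokens, log-probability) found with beam width k."""
    beams = [([start], 0.0)]
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            if tokens[-1] == end:
                # Finished hypotheses are carried over unchanged.
                candidates.append((tokens, score))
                continue
            for tok, p in step_fn(tokens).items():
                candidates.append((tokens + [tok], score + math.log(p)))
        # Keep only the k highest-scoring hypotheses.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:k]
        if all(tokens[-1] == end for tokens, _ in beams):
            break
    return beams[0]

# Hypothetical next-word probabilities, keyed by the previous word (assumption).
TOY_TABLE = {
    "<s>": {"a": 0.6, "the": 0.4},
    "a": {"man": 0.55, "dog": 0.45},
    "the": {"man": 0.9, "dog": 0.1},
    "man": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}

def toy_step(tokens: List[str]) -> Dict[str, float]:
    return TOY_TABLE[tokens[-1]]

greedy_tokens, greedy_score = beam_search(toy_step, k=1, max_len=5)
beam_tokens, beam_score = beam_search(toy_step, k=3, max_len=5)
print(" ".join(greedy_tokens), round(math.exp(greedy_score), 2))
print(" ".join(beam_tokens), round(math.exp(beam_score), 2))
```

In this toy table, greedy search commits to the locally best first word and ends with a lower-probability sentence, while k=3 keeps the alternative alive long enough to find a better one. A larger k expands more hypotheses per step, which is consistent with the abstract's observation that smaller k is faster but scores lower.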
