Abstract

With the continuous development of artificial intelligence, cross-field applications in image recognition, natural language processing, and semantic recognition have gradually emerged. Video subtitle generation is a typical cross-field application scenario: by recognizing image semantics and constructing descriptions in natural language, it allows users to understand the content of images intuitively and quickly. In addition, once the content is expressed in text form, it becomes available for textual retrieval and other text-based image applications. This paper proposes applying generative adversarial networks to subtitle generation. By placing greater emphasis on generation diversity, the differentiation of the generated descriptions can be further improved, making the generated image and video subtitles closer to human expression habits. Specifically, the idea is to use a generative adversarial network to generate subtitles for the images in a video and to apply diversified generative models for difference optimization, thereby obtaining better generative performance. It is hoped that this research can support the application of diversified video subtitle generation.
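To make the adversarial setup concrete, the following is a minimal PyTorch sketch of the general idea described above, not the paper's actual implementation: an LSTM generator conditioned on frame features produces token sequences, and a discriminator scores (feature, caption) pairs. All module names, dimensions, and the stand-in data are assumptions for illustration.

import torch
import torch.nn as nn

VOCAB, FEAT, EMB, HID = 1000, 512, 128, 256  # assumed vocabulary and layer sizes

class Generator(nn.Module):
    """Produces caption logits conditioned on a frame feature vector."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, EMB)
        self.init_h = nn.Linear(FEAT, HID)   # map image feature to initial LSTM state
        self.lstm = nn.LSTM(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, feats, tokens):
        # feats: (B, FEAT) frame features; tokens: (B, T) caption prefix
        h0 = torch.tanh(self.init_h(feats)).unsqueeze(0)
        c0 = torch.zeros_like(h0)
        hidden, _ = self.lstm(self.embed(tokens), (h0, c0))
        return self.out(hidden)              # (B, T, VOCAB) next-token logits

class Discriminator(nn.Module):
    """Scores how well a caption matches the given frame features."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, EMB)
        self.lstm = nn.LSTM(EMB, HID, batch_first=True)
        self.score = nn.Linear(HID + FEAT, 1)

    def forward(self, feats, tokens):
        _, (h, _) = self.lstm(self.embed(tokens))
        return self.score(torch.cat([h[-1], feats], dim=1))  # (B, 1) real/fake logit

if __name__ == "__main__":
    G, D = Generator(), Discriminator()
    bce = nn.BCEWithLogitsLoss()
    feats = torch.randn(4, FEAT)                 # stand-in frame features
    real = torch.randint(0, VOCAB, (4, 12))      # stand-in reference captions
    # Sample fake captions greedily; real systems use policy gradients or
    # Gumbel-softmax to pass gradients through the discrete sampling step.
    fake = G(feats, real).argmax(-1)
    d_loss = bce(D(feats, real), torch.ones(4, 1)) + \
             bce(D(feats, fake), torch.zeros(4, 1))
    print(f"discriminator loss: {d_loss.item():.3f}")

In a full system of this kind, the generator would typically be pre-trained with maximum likelihood on reference captions and then fine-tuned adversarially, with the discriminator's feedback rewarding diverse, human-like descriptions rather than a single safe caption.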
