Abstract

Creating impressive video content such as movies and advertisements is an important yet challenging business task that requires both creativity and extensive experience. Even professionals cannot always evoke the impressions and emotions they aim for, and many video advertisements are created and then disappear without making a strong impact on viewers. This paper presents a large-scale dataset of 14,490 television (TV) advertisements. The impression scores of each video, such as the recognition rate and the interestingness rate, are derived from questionnaires answered by 620 participants. We also present a baseline for predicting the impression effects of TV advertisements from visual and audio information, metadata such as broadcasting pattern and business category, the popularity of the cast, and text information, including text appearing in the video and narration in the audio. We predict four viewer impressions: 1) how well participants remember the video afterward, 2) how much they feel like buying the product or service, 3) how interested they become in the product or service, and 4) how much they like the content of the advertisement itself. By combining images, audio, metadata, cast data, and text data, our baseline method predicts these impressions with a correlation of 0.69-0.82, much better than using a single-modality feature such as visual or audio data alone. This paper also presents possible applications, such as estimating the importance score of each key frame, which offers insight into how to make advertisement content more impressive.
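
As a rough illustration of the two ideas the abstract combines (fusing features from several modalities and scoring predictions by correlation), the sketch below concatenates per-modality feature vectors and computes a Pearson correlation between predicted and questionnaire-derived scores. The modality keys, the function names, the choice of simple concatenation, and the choice of Pearson correlation are all assumptions made for illustration; the abstract does not specify the fusion scheme or the exact correlation measure.

```python
import math

# Hypothetical modality keys, mirroring the list in the abstract
# (images, audio, metadata, cast data, text). Not the paper's identifiers.
MODALITIES = ("visual", "audio", "metadata", "cast", "text")


def fuse_features(features_by_modality):
    """Concatenate per-modality feature vectors in a fixed order.

    Plain concatenation is an assumption; the paper's baseline may fuse
    modalities differently. Missing modalities contribute no features.
    """
    fused = []
    for name in MODALITIES:
        fused.extend(features_by_modality.get(name, []))
    return fused


def pearson(predicted, observed):
    """Pearson correlation between predicted and observed impression scores."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_o)
```

In such a setup, a regressor would be trained on the fused vectors once per impression (recognition, purchase intent, interest, and liking), and `pearson` would be applied to its held-out predictions for each impression separately.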
