Summarization of Video using Audio

Pratiksha Yadav, Sahil Mandore, Nimish Rajgure, Rohan Thoke, Prof. Pallavi Bhaskre

doi:10.22214/ijraset.2024.58785

Abstract

Online education has emerged as an effective means of delivering quality education to students. Its popularity has grown because of high-quality visual and graphical content delivered by subject matter experts and the convenience of learning anytime and anywhere. However, students may face time constraints that prevent them from fully engaging with the course content. To address this issue, video transcript summarizers have gained popularity. These tools extract the most important topics from a video, allowing students to grasp the essence of a class without watching the entire video. Our system is a Python module built with txtai.pipeline that summarizes online class videos. We use Whisper, a general-purpose speech recognition model trained on a large dataset of diverse audio, to transcribe the audio track. The system takes the URL of a video as input and summarizes the transcript with two approaches: TF-IDF and Gensim. Because judging summary quality is subjective, we evaluate the summaries with two prominent measures: cosine similarity and ROUGE score. The former does not require a human-generated reference summary, while the latter does. Our results show that the efficiency obtained using cosine similarity is greater than 90% for both TF-IDF and Gensim, while the efficiency obtained using ROUGE score is between 40% and 50%.
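The TF-IDF summarization and the two evaluation measures named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: it assumes a transcript is already available as plain text (the Whisper transcription and Gensim steps are omitted), it uses a naive sentence splitter, and the function names `tfidf_summarize`, `cosine_similarity`, and `rouge1_recall` are hypothetical. ROUGE is reduced here to unigram recall (ROUGE-1) for brevity.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Lowercase word tokens; no stop-word removal or stemming (assumption)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def tfidf_summarize(text, num_sentences=2):
    """Pick the highest-scoring sentences and return them in original order."""
    sentences = split_sentences(text)
    tokenized = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Document frequency: number of sentences in which each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term, count in tf.items():
            # Smoothed IDF so terms found in every sentence keep a small weight.
            idf = math.log((1 + n) / (1 + df[term])) + 1
            score += (count / len(tokens)) * idf
        scores.append(score)
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:num_sentences]
    return " ".join(sentences[i] for i in sorted(top))


def cosine_similarity(text_a, text_b):
    """Cosine similarity of raw term-count vectors; needs no reference summary."""
    a, b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rouge1_recall(candidate, reference):
    """Share of reference unigrams that also appear in the candidate summary."""
    cand, ref = Counter(tokenize(candidate)), Counter(tokenize(reference))
    overlap = sum(min(cand[t], ref[t]) for t in ref)
    total = sum(ref.values())
    return overlap / total if total else 0.0


# Hypothetical transcript standing in for the output of a speech recognizer.
transcript = (
    "Today we study binary search trees. "
    "A binary search tree keeps smaller keys in the left subtree. "
    "Please remember to submit the homework. "
    "Searching a balanced tree takes logarithmic time."
)
summary = tfidf_summarize(transcript, num_sentences=2)
print(summary)
print(cosine_similarity(summary, transcript))
```

Comparing the summary against the source transcript with `cosine_similarity` mirrors the reference-free evaluation described above, whereas `rouge1_recall` needs a human-written summary as its second argument.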

Full Text