Towards a Framework for Acquisition and Analysis of Speeches to Identify Suspicious Contents through Machine Learning

Md Rashadur Rahman, Mohammad Ashfak Habib, A S M Kayes, Mohammad Shamsul Arefin, Md Billal Hossain, Abd E.I.-Baset Hassanien

doi:10.1155/2020/5639787

Abstract

Speech is the most prominent form of human communication and interaction. It plays an indispensable role in expressing emotions, motivating, guiding, and cheering. An ill-intentioned speech can mislead people, societies, and even a nation. A misguided speech can trigger social controversy and can result in violent activities. So many speeches are delivered around the world every day that inspecting them manually is impractical. To prevent vicious actions resulting from misguided speech, an automatic system that can efficiently detect suspicious speech has become imperative. In this study, we present a framework for acquiring speech along with the location of the speaker and converting the speech into text. We then propose a system based on long short-term memory (LSTM), a variant of the recurrent neural network (RNN), to classify speeches as suspicious or nonsuspicious. We consider speeches in the Bangla language and have developed our own dataset of about 5000 suspicious and nonsuspicious samples for training and validating our model. To evaluate the effectiveness of the system, we compare its accuracy with that of other machine learning algorithms: logistic regression, SVM, KNN, Naive Bayes, and decision tree. The experimental results show that our proposed deep learning-based model achieves the highest accuracy among the algorithms compared.
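To make the classification step described above more concrete, the following is a minimal sketch of how an LSTM reads a token sequence and emits a probability of the "suspicious" class. It is not the authors' model: the class name `TinyLSTMClassifier`, the layer sizes, the toy vocabulary, and the random, untrained weights are all assumptions made for illustration, and only the forward pass is shown.

```python
import math
import random

GATES = ("input", "forget", "output", "candidate")

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_matrix(rows, cols, rng):
    # Small random weights; a real model would learn these by training.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

class TinyLSTMClassifier:
    """Hypothetical single-layer LSTM with a sigmoid output (forward pass only)."""

    def __init__(self, vocab_size, embed_dim=8, hidden_dim=16, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.embed = make_matrix(vocab_size, embed_dim, rng)
        # One weight matrix per gate, applied to the concatenation [x_t ; h_{t-1}].
        self.W = {g: make_matrix(hidden_dim, embed_dim + hidden_dim, rng) for g in GATES}
        self.b = {g: [0.0] * hidden_dim for g in GATES}
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]
        self.b_out = 0.0

    def step(self, token_id, h, c):
        z = self.embed[token_id] + h  # list concatenation of x_t and h_{t-1}
        pre = {
            g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])]
            for g in GATES
        }
        i = [sigmoid(v) for v in pre["input"]]
        f = [sigmoid(v) for v in pre["forget"]]
        o = [sigmoid(v) for v in pre["output"]]
        cand = [math.tanh(v) for v in pre["candidate"]]
        # Cell state: keep part of the old memory, add part of the new candidate.
        c_new = [ft * ct + it * gt for ft, ct, it, gt in zip(f, c, i, cand)]
        h_new = [ot * math.tanh(ct) for ot, ct in zip(o, c_new)]
        return h_new, c_new

    def predict_proba(self, token_ids):
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for token_id in token_ids:
            h, c = self.step(token_id, h, c)
        logit = sum(w * x for w, x in zip(self.w_out, h)) + self.b_out
        return sigmoid(logit)  # interpreted as P(suspicious)

# Toy vocabulary (an assumption); real input would be transcribed Bangla text.
vocab = {"<unk>": 0, "hello": 1, "world": 2}

def encode(text):
    return [vocab.get(word, 0) for word in text.lower().split()]

model = TinyLSTMClassifier(vocab_size=len(vocab))
probability = model.predict_proba(encode("hello world"))
```

Because the weights here are untrained, `probability` carries no meaning beyond showing the data flow; in practice such a model would be trained on labelled transcripts, typically with a deep learning library rather than hand-written loops.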

Highlights

Speech has been the most widely used medium for conveying information among people all over the world since the dawn of civilization
We present a framework for acquiring speech along with the location of the speaker and converting the speech into text, and we propose a system based on long short-term memory (LSTM), a variant of the recurrent neural network (RNN), to classify speeches as suspicious or nonsuspicious
Our model was compared with models based on other machine learning algorithms: Naive Bayes, support vector machine (SVM), decision tree, k-nearest neighbor, and logistic regression
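As an illustration of how one of the baselines named above can be scored by accuracy, here is a minimal sketch of a multinomial Naive Bayes text classifier with Laplace smoothing. The class `NaiveBayesText`, the helper `accuracy`, and the four English example sentences are assumptions made for illustration; they are not taken from the authors' Bangla dataset or code, and scoring on the training sentences is done only to keep the example short.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesText:
    """Hypothetical multinomial Naive Bayes over whitespace-separated tokens."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in doc.lower().split():
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

def accuracy(model, docs, labels):
    correct = sum(model.predict(d) == y for d, y in zip(docs, labels))
    return correct / len(labels)

# Invented placeholder sentences, not samples from the paper's dataset.
train_docs = [
    "attack the city tonight",
    "destroy the bridge",
    "enjoy the festival tonight",
    "welcome to the city festival",
]
train_labels = ["suspicious", "suspicious", "nonsuspicious", "nonsuspicious"]

baseline = NaiveBayesText().fit(train_docs, train_labels)
train_accuracy = accuracy(baseline, train_docs, train_labels)
```

A fair comparison would compute `accuracy` for every model on the same held-out validation split rather than on the training sentences used here.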

Summary

Introduction

Speech has been the most widely used medium for conveying information among people all over the world since the dawn of civilization. It is a very dynamic way to shift the mindset of a large number of people or to reinforce their confidence in the speaker [1]. It plays a significant role in persuading an audience toward a specific agenda [2, 3]. People freely express their emotions, thoughts, anger, and grudges through speeches. This freedom of expression is often misused by certain people in society, which causes social controversies [8, 9]. It threatens the lives and livelihoods of a country’s people and undermines

Methods

Results

Conclusion