Abstract

In recent years, the application of speech emotion recognition (SER) to the supervision of Internet public opinion has received increasing attention. This study proposes a new SER algorithm for analyzing public opinion information on network platforms. First, we extract different spectral features from speech signals and combine them into frame-level speech features. Then, we select the conditional deep belief network (CDBN), which can learn sequential features, as the final classification model. We apply particle swarm optimization (PSO) and a genetic algorithm (GA) during the fine-tuning stage of the CDBN to obtain more suitable weights for the whole network, and propose the PSO-GA-CDBN (PGCDBN) model. Compared with the traditional back propagation (BP) algorithm, our training method accelerates the convergence of the network and improves its robustness and recognition performance. In our experiments, we used the Chinese Academy of Sciences’ Institute of Automation (CASIA) Chinese emotional corpus and self-collected Chinese speech datasets, which were collected from Sina Weibo, TikTok, and other online social media platforms. Compared with popular emotion classifiers such as the support vector machine (SVM), deep residual network (ResNet), long short-term memory (LSTM) neural network, and DBN, our proposed PGCDBN achieves the best recognition results on both datasets. In addition, we apply a bidirectional LSTM before the PGCDBN to further process the extracted speech features; its output represents the speech signal more expressively. The average recognition accuracy of this hybrid deep learning model on the two datasets is 98.67%, which suggests that it can be used for the supervision of netizens’ opinions.
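The abstract does not say exactly how PSO and GA are combined during fine-tuning, so the following is only a minimal sketch of one common hybrid, not the authors' procedure. It assumes a toy quadratic loss (`sphere`) in place of the CDBN's fine-tuning error, and every name and hyperparameter (`pso_ga_minimize`, `inertia`, `c1`, `c2`, `mutation_rate`) is hypothetical. Each iteration runs a standard PSO velocity update, then replaces the worse half of the swarm with crossover-and-mutation children of the better half.

```python
import random


def sphere(w):
    """Toy loss standing in for a network's fine-tuning error (assumption)."""
    return sum(x * x for x in w)


def pso_ga_minimize(loss, dim, n_particles=20, n_iters=100, seed=0,
                    inertia=0.7, c1=1.5, c2=1.5, mutation_rate=0.1):
    """Hypothetical PSO-GA hybrid: PSO moves particles, GA refreshes the worst half."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_loss = [loss(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_loss[i])
    gbest, gbest_loss = pbest[g][:], pbest_loss[g]

    for _ in range(n_iters):
        # PSO step: pull each particle toward its own best and the swarm's best.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]

        # GA step: rank by current loss, then overwrite the worse half with
        # uniform-crossover children of two random parents from the better half.
        order = sorted(range(n_particles), key=lambda i: loss(pos[i]))
        half = n_particles // 2
        for i in order[half:]:
            a = pos[order[rng.randrange(half)]]
            b = pos[order[rng.randrange(half)]]
            child = [a[d] if rng.random() < 0.5 else b[d] for d in range(dim)]
            # Gaussian mutation applied independently to each coordinate.
            child = [x + rng.gauss(0, 0.1) if rng.random() < mutation_rate else x
                     for x in child]
            pos[i] = child
            vel[i] = [0.0] * dim

        # Update personal and global bests after both steps.
        for i in range(n_particles):
            cur = loss(pos[i])
            if cur < pbest_loss[i]:
                pbest[i], pbest_loss[i] = pos[i][:], cur
                if cur < gbest_loss:
                    gbest, gbest_loss = pos[i][:], cur
    return gbest, gbest_loss


best_w, best_loss = pso_ga_minimize(sphere, dim=5)
```

In a real fine-tuning setting, `loss` would evaluate the network's classification error for a flattened weight vector. The GA step here is one plausible way to keep the swarm diverse; other hybrids apply crossover to velocities or alternate the two optimizers.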

Highlights

  • Major disaster events often leave an indelible impact on society

  • In this paper, we focus on the selection of classifiers in speech emotion recognition, and propose a new method of speech emotion recognition based on a conditional DBN (CDBN) classifier

  • The algorithm proposed in this paper is based on the bidirectional long short-term memory (LSTM) and particle swarm optimization (PSO)-genetic algorithm (GA) conditional deep belief network (PGCDBN) concept, which is used to identify emotions corresponding to the network public opinion by analyzing speech


Summary

Introduction

With advances in communication equipment, participation in social networks has increased significantly, and the channels for disseminating news and information about the global situation, as well as rumors and misinformation, have multiplied. If not stopped in time, the spread of rumors and misinformation is likely to cause social unrest. Traditional machine learning methods were often used in SER: researchers extracted hand-designed acoustic features from speech signals and selected appropriate machine learning models as classifiers [3]–[6]. However, such methods often cannot extract deeper features.

