A Speech Emotion Recognition Model Based on Multi-Level Local Binary and Local Ternary Patterns

Yesim Ulgen Sonmez, Asaf Varol

doi:10.1109/access.2020.3031763

Abstract

Interpreting a speech signal is quite challenging because it consists of different frequencies and features that vary according to emotions. Although different algorithms are being developed in the speech emotion recognition (SER) domain, the success rates vary according to the spoken languages, emotions, and databases. In this study, a new lightweight effective SER method has been developed that has low computational complexity. This method, called 1BTPDN, is applied on RAVDESS, EMO-DB, SAVEE, and EMOVO databases. First, low-pass filter coefficients are obtained by applying a one-dimensional discrete wavelet transform on the raw audio data. The features are extracted by applying textural analysis methods, a one-dimensional local binary pattern, and a one-dimensional local ternary pattern to each filter. Using neighborhood component analysis, the most dominant 1024 features are selected from 7680 features while the other features are discarded. These 1024 features are selected as the input of the classifier which is a third-degree polynomial kernel-based support vector machine. The success rates of the 1BTPDN reached 95.16%, 89.16%, 76.67%, and 74.31% in the RAVDESS, EMO-DB, SAVEE, and EMOVO databases, respectively. The recognition rates are higher compared to many textural, acoustic, and deep learning state-of-the-art SER methods.
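The feature-extraction stage described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not state the wavelet, the number of decomposition levels, the neighbourhood size, or the ternary threshold, so the Haar filter, `levels=9`, `half_window=4`, and `threshold=0.05` are all assumptions. They were chosen only because 10 signals (raw plus 9 low-pass levels) times 3 histograms times 256 bins gives the 7680 features mentioned above. Feature selection and the polynomial-kernel SVM are omitted.

```python
import math


def haar_lowpass(signal):
    """One level of Haar DWT low-pass (approximation) coefficients.
    The Haar wavelet is an assumption; the abstract names no wavelet."""
    scale = math.sqrt(2.0)
    return [(signal[i] + signal[i + 1]) / scale
            for i in range(0, len(signal) - 1, 2)]


def lbp_ltp_histograms(signal, half_window=4, threshold=0.05):
    """Concatenate three histograms: 1D-LBP, LTP-upper, LTP-lower.
    With half_window=4 each pattern has 8 bits, so each histogram has 256 bins.
    half_window and threshold are hypothetical parameter choices."""
    n_bins = 2 ** (2 * half_window)
    lbp = [0] * n_bins
    upper = [0] * n_bins
    lower = [0] * n_bins
    for c in range(half_window, len(signal) - half_window):
        center = signal[c]
        neighbours = (signal[c - half_window:c]
                      + signal[c + 1:c + half_window + 1])
        b = u = l = 0
        for bit, value in enumerate(neighbours):
            diff = value - center
            if diff >= 0:              # binary pattern: neighbour >= centre
                b |= 1 << bit
            if diff >= threshold:      # ternary +1 goes to the upper pattern
                u |= 1 << bit
            elif diff <= -threshold:   # ternary -1 goes to the lower pattern
                l |= 1 << bit
        lbp[b] += 1
        upper[u] += 1
        lower[l] += 1
    return lbp + upper + lower


def extract_features(signal, levels=9):
    """Histograms of the raw signal plus `levels` successive low-pass signals."""
    features = []
    current = list(signal)
    for level in range(levels + 1):
        features.extend(lbp_ltp_histograms(current))
        if level < levels:
            current = haar_lowpass(current)
    return features


# Synthetic stand-in for an audio clip (not real speech data).
signal = [math.sin(0.05 * i) for i in range(4096)]
features = extract_features(signal)
print(len(features))  # 7680
```

In a full pipeline, vectors like `features` would be computed per utterance, reduced to the 1024 most informative dimensions, and passed to the classifier.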

Highlights

Speech processing methods are used in human-computer interaction (HCI) applications such as security applications, computer education applications, vehicle card systems, automatic translation systems, call center applications, psychosis monitoring and diagnosis of neuropsychological disorders, voice message sorting, telecommunication, assistive technologies, and audio mining [1].
We propose a novel SER method, a speech emotion recognition model based on multi-level local binary and local ternary patterns, abbreviated as 1BTPDN.
A novel text-independent and speaker-independent SER method, called 1BTPDN, has been developed as a lightweight method that solves a nonpolynomial problem by extracting handcrafted features.

Summary

Introduction

Speech processing methods are used in human-computer interaction (HCI) applications such as security applications, computer education applications, vehicle card systems, automatic translation systems, call center applications, psychosis monitoring and diagnosis of neuropsychological disorders, voice message sorting, telecommunication, assistive technologies, and audio mining [1]. These methods are also used in digital forensics, games, robots, and the legal evaluation of an individual’s psychological integrity [2]. LBP and LTP are used on two-dimensional (2D) images for texture segmentation and feature detection in image processing. Their computational and programming simplicity makes them suitable for real-time applications [38].
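To illustrate the computational simplicity mentioned above, the basic 2D LBP code of a single 3×3 image patch can be computed with a handful of comparisons. This is a generic sketch of the standard operator, not code from the paper. The function name `lbp_code_3x3`, the clockwise bit ordering, and the sample patch are illustrative assumptions.

```python
def lbp_code_3x3(patch):
    """Basic 8-neighbour LBP code for the centre pixel of a 3x3 patch.
    Neighbours are visited clockwise from the top-left corner; this
    ordering is a convention chosen for the example."""
    center = patch[1][1]
    order = [(0, 0), (0, 1), (0, 2), (1, 2),
             (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (row, col) in enumerate(order):
        if patch[row][col] >= center:  # neighbour at least as bright as centre
            code |= 1 << bit
    return code


# Hypothetical grey-level patch, centre value 6.
patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
print(lbp_code_3x3(patch))  # 241
```

A texture descriptor is then the histogram of these codes over all pixels. The ternary variant replaces the single comparison with a three-way comparison against a tolerance band around the centre value.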

Journal: IEEE Access
Publication Date: Jan 1, 2020
Citations: 70
License type: CC BY 4.0
