A Hybrid Meta-Heuristic Feature Selection Method for Identification of Indian Spoken Languages From Audio Signals

Aankit Das, Samarpan Guha, Norazak Senu, Ali Ahmadian, Pawan Kumar Singh, Ram Sarkar

doi:10.1109/access.2020.3028241

Abstract

With recent advances in machine learning and artificial intelligence, applications based on spoken language identification have a growing impact on the day-to-day lives of ordinary people. Western countries have enjoyed spoken language recognition-based applications for some time; however, such applications have not gained much popularity in multilingual countries like India owing to various complexities. In this paper, we address this issue by attempting to identify different Indian languages using well-known features such as Mel-Frequency Cepstral Coefficient (MFCC), Linear Prediction Coefficient (LPC), Discrete Wavelet Transform (DWT), and Gammatone Frequency Cepstral Coefficient (GFCC), as well as deep learning architecture-based features such as i-vector and x-vector, all extracted from the audio signals. A comparison of the initial results shows that the combination of MFCC and LPC performs best. We then develop a new nature-inspired feature selection (FS) algorithm that hybridizes the Binary Bat Algorithm (BBA) with Late Acceptance Hill-Climbing (LAHC) to select the optimal subset of these feature vectors, which reduces model complexity and helps the model train faster. Using a Random Forest (RF) classifier, we achieve an accuracy of 92.35% on the Indic TTS database developed by IIT-Madras, and an accuracy of 100% on the Indic Speech database developed by the Speech and Vision Laboratory (SVL), IIIT-Hyderabad. The proposed algorithm is also found to outperform many standard meta-heuristic FS algorithms. The source code of this work is available at: https://github.com/CodeChef97dotcom/Feature-Selection
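The local-search half of the hybrid described in the abstract can be illustrated with a minimal sketch of Late Acceptance Hill-Climbing over binary feature masks. This is not the authors' implementation (their code is at the repository linked above). The function names, the one-bit-flip neighbourhood, the history length, and especially `toy_fitness` are assumptions made for illustration; in the paper the fitness would come from a classifier evaluated on the selected MFCC and LPC features.

```python
import random


def toy_fitness(mask):
    """Hypothetical stand-in for classifier accuracy.

    Rewards selecting the (assumed) informative features 0, 3 and 5 and
    applies a small penalty per selected feature to favour compact subsets.
    """
    informative = {0, 3, 5}
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in informative)
    return hits - 0.1 * sum(mask)


def lahc_feature_selection(fitness, n_features, history_len=20,
                           max_iters=500, seed=0):
    """Late Acceptance Hill-Climbing over binary masks (maximisation)."""
    rng = random.Random(seed)
    current = [rng.randint(0, 1) for _ in range(n_features)]
    current_fit = fitness(current)
    best, best_fit = current[:], current_fit
    # The history list remembers the fitness held `history_len` steps ago.
    history = [current_fit] * history_len

    for it in range(max_iters):
        # Neighbour: flip one randomly chosen bit of the current mask.
        candidate = current[:]
        flip = rng.randrange(n_features)
        candidate[flip] = 1 - candidate[flip]
        cand_fit = fitness(candidate)

        slot = it % history_len
        # Late acceptance: accept if the candidate is no worse than the
        # current solution OR no worse than the remembered older fitness.
        if cand_fit >= current_fit or cand_fit >= history[slot]:
            current, current_fit = candidate, cand_fit
        history[slot] = current_fit

        if current_fit > best_fit:
            best, best_fit = current[:], current_fit

    return best, best_fit


best_mask, best_score = lahc_feature_selection(toy_fitness, n_features=8)
print("selected feature indices:", [i for i, b in enumerate(best_mask) if b])
print("fitness:", best_score)
```

Comparing against an older fitness value, rather than only the current one, lets the search accept some worsening moves early on, which is the property that makes LAHC a plausible companion for a population-based method such as BBA.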

Highlights

Speech is one of the most innate human capabilities
We explore a new approach to developing a feature selection (FS) algorithm using a hybrid of the Binary Bat Algorithm (BBA) and the Late Acceptance Hill-Climbing (LAHC) algorithm for classifying Indian languages based on their Mel-Frequency Cepstral Coefficient (MFCC) and Linear Prediction Coefficient (LPC) features
We have performed experiments on the database of 7 Indic languages [51], developed by the Speech and Vision Laboratory (SVL) at IIIT-Hyderabad. This database consists of 1000 utterances for each of the 7 languages, and each sentence is available as a separate audio clip in the database
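For the BBA named in the highlights, the sketch below shows how a single bat's binary position might be updated, following one commonly cited formulation of the binary bat algorithm (a frequency-scaled velocity update followed by a V-shaped transfer function that gives a bit-flip probability). The paper's exact variant may differ. The function names, the frequency range, and the omission of the loudness and pulse-rate steps are assumptions made to keep the example short.

```python
import math
import random


def v_transfer(velocity):
    """V-shaped transfer function: maps a velocity to a flip probability in [0, 1)."""
    return abs((2 / math.pi) * math.atan((math.pi / 2) * velocity))


def bba_move(position, velocity, best, rng, f_min=0.0, f_max=2.0):
    """One binary-bat move for a single bat (loudness/pulse rate omitted).

    position, best: lists of 0/1 feature-selection bits.
    velocity: list of floats, one per feature.
    """
    # Draw a random frequency for this bat in [f_min, f_max).
    freq = f_min + (f_max - f_min) * rng.random()
    # Velocity grows along dimensions where the bat differs from the best bat.
    new_velocity = [v + (x - b) * freq
                    for v, x, b in zip(velocity, position, best)]
    # Flip each bit with probability given by the transfer function.
    new_position = [1 - x if rng.random() < v_transfer(v) else x
                    for x, v in zip(position, new_velocity)]
    return new_position, new_velocity


rng = random.Random(42)
bat = [1, 0, 1, 1, 0, 0]
global_best = [1, 1, 0, 1, 0, 0]
moved, vel = bba_move(bat, [0.0] * 6, global_best, rng)
print("new position:", moved)
print("new velocity:", vel)
```

In this formulation a bit can only flip where the velocity is non-zero, so a bat that already matches the best solution and has zero velocity stays put. In a hybrid scheme, a local search such as LAHC would then be one way to refine the positions the bats reach.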

Summary

Introduction

Speech is one of the most innate human capabilities. When we speak with one another, we use not just words but also the associated emotions and sentiments to convey meaning and get our opinions across. There are many features associated with spoken language that allow us to deliver information that... Spoken language involves the actual use of speech or related utterances that convey meaning in order to share thoughts or other information. Processing of spoken languages involves human-computer interaction (HCI), which has improved significantly over the last decade. Automatic language identification plays a vital role in a wide range of services. Almost everyone now carries a smartphone, which makes life much easier. People can control their daily activities like calling someone, turning on

The associate editor coordinating the review of this manuscript and approving it for publication was K.



Journal: IEEE Access: Practical Innovations, Open Solutions
Publication Date: Jan 1, 2020
Citations: 82
License type: CC BY 4.0
