Abstract

Speaker recognition systems learn a model of a speaker's voice from processed audio recordings. A speech signal is a time-varying signal whose frequency content changes continuously. Because many attributes of speech are uncertain, traditional speech recognition techniques such as zero-crossing analysis and the Fourier transform are not up to the task. This work pursues two aims. The first is speaker identification that is resistant to noise. Whereas most prior solutions have relied on either modified mel-frequency cepstral coefficients (MFCCs) or a fundamental-frequency feature coefficient, this proposal integrates both modifications with a new cepstral component. To construct the feature matrix, feature extraction techniques are applied to 250 speech samples. The matrix is used to train each algorithm, and each algorithm is then evaluated on a subset of the data (30% of the feature matrix). Speaker recognition models with improved accuracy are developed by studying the algorithms in depth. Two metrics are computed for each algorithm: recognition accuracy and the time required to achieve that accuracy. Compared with previous research, the Feed Forward Neural Network-based Particle Swarm Optimization method performs better. This model correctly identifies 96% of the inputs with less processing time. The findings suggest that optimization with an advanced particle swarm optimization (PSO) method is most likely responsible for the higher accuracy seen in speaker identification.
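
The evaluation procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that particle swarm optimization is used to search the weights of a small one-hidden-layer feed-forward network, that the 30% subset is a held-out test split, and that a synthetic 250-row matrix can stand in for the real cepstral and fundamental-frequency features. All names, layer sizes, and PSO hyperparameters (`pso_train`, `N_HIDDEN`, `inertia`, `c1`, `c2`) are hypothetical choices made for illustration.

```python
import time
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the feature matrix (assumption): 250 rows, one per recording.
# Real rows would hold the extracted cepstral and fundamental-frequency features.
N_SAMPLES, N_FEATURES, N_SPEAKERS, N_HIDDEN = 250, 13, 5, 8
labels = rng.integers(0, N_SPEAKERS, size=N_SAMPLES)
centers = rng.normal(0.0, 3.0, size=(N_SPEAKERS, N_FEATURES))
features = centers[labels] + rng.normal(0.0, 1.0, size=(N_SAMPLES, N_FEATURES))

# Hold out 30% of the rows for evaluation and train on the remaining 70%.
order = rng.permutation(N_SAMPLES)
n_test = int(round(0.3 * N_SAMPLES))
test_idx, train_idx = order[:n_test], order[n_test:]

# Total number of network parameters, stored as one flat vector per particle.
DIM = N_FEATURES * N_HIDDEN + N_HIDDEN + N_HIDDEN * N_SPEAKERS + N_SPEAKERS

def forward(weights, x):
    """One-hidden-layer feed-forward network; returns per-speaker scores."""
    i = 0
    w1 = weights[i:i + N_FEATURES * N_HIDDEN].reshape(N_FEATURES, N_HIDDEN)
    i += N_FEATURES * N_HIDDEN
    b1 = weights[i:i + N_HIDDEN]
    i += N_HIDDEN
    w2 = weights[i:i + N_HIDDEN * N_SPEAKERS].reshape(N_HIDDEN, N_SPEAKERS)
    i += N_HIDDEN * N_SPEAKERS
    b2 = weights[i:]
    return np.tanh(x @ w1 + b1) @ w2 + b2

def loss(weights, x, y):
    """Mean cross-entropy, computed with a numerically stable log-softmax."""
    logits = forward(weights, x)
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(y)), y].mean())

def accuracy(weights, x, y):
    """Fraction of rows whose highest-scoring speaker matches the label."""
    return float((forward(weights, x).argmax(axis=1) == y).mean())

def pso_train(x, y, n_particles=30, n_iters=100, inertia=0.7, c1=1.5, c2=1.5):
    """Global-best PSO over the flat weight vector; returns (weights, loss)."""
    pos = rng.normal(0.0, 0.5, size=(n_particles, DIM))
    vel = np.zeros_like(pos)
    best_pos = pos.copy()                        # each particle's best position
    best_val = np.array([loss(p, x, y) for p in pos])
    g = best_pos[best_val.argmin()].copy()       # swarm-wide best position
    g_val = float(best_val.min())
    for _ in range(n_iters):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Velocity update: inertia + pull toward personal best + pull toward global best.
        vel = inertia * vel + c1 * r1 * (best_pos - pos) + c2 * r2 * (g - pos)
        pos = pos + vel
        vals = np.array([loss(p, x, y) for p in pos])
        improved = vals < best_val
        best_pos[improved] = pos[improved]
        best_val[improved] = vals[improved]
        if best_val.min() < g_val:
            g_val = float(best_val.min())
            g = best_pos[best_val.argmin()].copy()
    return g, g_val

# Report the two metrics named in the abstract: accuracy and time taken.
start = time.perf_counter()
weights, train_loss = pso_train(features[train_idx], labels[train_idx])
elapsed = time.perf_counter() - start
test_acc = accuracy(weights, features[test_idx], labels[test_idx])
print(f"held-out accuracy: {test_acc:.2f}, training time: {elapsed:.2f} s")
```

The numbers printed by this sketch depend on the synthetic data and say nothing about the 96% figure reported above. The sketch only shows how accuracy and training time can be measured side by side for one algorithm, so that the same harness could be reused to compare several.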
