Abstract

This paper performs closed-set and open-set speaker identification on two different databases. Features are extracted from the amplitude distribution of four transforms: the discrete Fourier transform (DFT), discrete Hartley transform (DHT), discrete cosine transform (DCT), and discrete sine transform (DST). Two similarity measures, Euclidean distance (ED) and Manhattan distance (MD), are used for matching. Performance is compared, using the best value obtained for each transform, with respect to the following parameters: length of the speech sample, similarity measure score, size of the feature vector, false acceptance rate (FAR) and false rejection rate (FRR) performance, and data acquisition system. Among the transforms, DFT gives the best result, 99.06%, for a feature vector of size 32. Among the similarity measures, Manhattan distance outscores Euclidean distance by 54 to 15 across the results for all three lengths of speech sample. The best genuine acceptance rate (GAR) is 90.65% at a threshold of 94.11%, obtained for DFT with MD as the similarity measure.
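The matching pipeline summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the amplitude distribution is turned into a feature vector, so the equal-width banding and normalization in `dft_amplitude_features` are assumptions, as are all function names, the synthetic tones, and the threshold value. The DFT is written out directly with the standard library for self-containment; a practical version would more likely use an FFT routine.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitudes of the non-redundant half of the DFT of a real signal."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]


def dft_amplitude_features(signal, n_bins=32):
    """Assumed feature: mean DFT magnitude in n_bins bands, normalized to sum 1."""
    mags = dft_magnitudes(signal)
    m = len(mags)
    features = []
    for b in range(n_bins):
        band = mags[b * m // n_bins:(b + 1) * m // n_bins]
        features.append(sum(band) / len(band))
    total = sum(features)
    return [f / total for f in features] if total > 0 else features


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def identify(probe, references, distance=manhattan_distance, threshold=None):
    """Return (speaker, score) for the nearest reference.

    With threshold=None this is closed-set identification. With a threshold,
    a probe whose best score exceeds it is rejected (open-set): speaker is None.
    """
    scores = {name: distance(probe, ref) for name, ref in references.items()}
    best = min(scores, key=scores.get)
    if threshold is not None and scores[best] > threshold:
        return None, scores[best]
    return best, scores[best]


def tone(cycles, n=256, amplitude=1.0):
    """Hypothetical stand-in for a speech sample: a single sinusoid."""
    return [amplitude * math.sin(2 * math.pi * cycles * i / n) for i in range(n)]


if __name__ == "__main__":
    references = {
        "speaker_a": dft_amplitude_features(tone(10)),
        "speaker_b": dft_amplitude_features(tone(50)),
    }
    # Probe resembles speaker_a with a weak extra component.
    mixed = [x + y for x, y in zip(tone(10), tone(80, amplitude=0.1))]
    probe = dft_amplitude_features(mixed)
    print(identify(probe, references, distance=euclidean_distance))
    print(identify(probe, references, distance=manhattan_distance))
    # Open-set case: an unknown speaker should be rejected by the threshold.
    print(identify(dft_amplitude_features(tone(100)), references, threshold=0.5))
```

Swapping `dft_magnitudes` for a DHT, DCT, or DST routine would leave the rest of the sketch unchanged, which is one way the four transforms could be compared on equal terms. In the open-set case, moving the threshold trades false acceptances against false rejections.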
