Abstract
The huge amount of multimedia content accumulated daily demands effective retrieval approaches. In this context, speaker recognition methods capable of automatically identifying a person by their voice are of great relevance. This paper presents a novel speaker recognition approach modelled as a retrieval task and using a recent unsupervised learning method. The proposed approach uses MFCC features and a Vector Quantization model to compute distances among audio objects. Next, a rank-based unsupervised learning method is used to improve the effectiveness of the retrieval results. Several experiments were conducted on three public datasets under different settings, including background noise from diverse sources. Experimental results demonstrate that the proposed approach can achieve very high effectiveness. In addition, the unsupervised learning procedure yielded effectiveness gains of up to +27%.
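
To make the distance computation concrete, the following is a minimal sketch of how a Vector Quantization model can score a query against enrolled speakers in a retrieval setting. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the codebook size, the plain k-means training, the average-distortion score, and all function names are hypothetical choices. Synthetic 13-dimensional vectors stand in for real MFCC frames, and the rank-based unsupervised learning step is not shown.

```python
import math
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_codebook(frames, k=4, iters=10, seed=0):
    """Fit a small VQ codebook with plain k-means (assumed training scheme)."""
    rng = random.Random(seed)
    # Initialise the codewords from k distinct frames.
    codebook = [list(f) for f in rng.sample(frames, k)]
    for _ in range(iters):
        cells = [[] for _ in range(k)]
        for f in frames:
            nearest = min(range(k), key=lambda i: sq_dist(f, codebook[i]))
            cells[nearest].append(f)
        for j, members in enumerate(cells):
            if members:  # keep the old codeword if its cell is empty
                codebook[j] = [sum(col) / len(members) for col in zip(*members)]
    return codebook

def vq_distortion(frames, codebook):
    """Average distance from each frame to its nearest codeword."""
    total = sum(math.sqrt(min(sq_dist(f, c) for c in codebook)) for f in frames)
    return total / len(frames)

def rank_by_distortion(query_frames, codebooks):
    """Return gallery indices ordered from lowest to highest distortion."""
    scores = [vq_distortion(query_frames, cb) for cb in codebooks]
    return sorted(range(len(codebooks)), key=lambda i: scores[i])

def synthetic_frames(mean, n, dim=13, seed=0):
    """Stand-in for MFCC frames: Gaussian vectors around a speaker-specific mean."""
    rng = random.Random(seed)
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]

# Two hypothetical enrolled speakers, each represented by one codebook.
gallery = [
    train_codebook(synthetic_frames(0.0, 100, seed=1)),
    train_codebook(synthetic_frames(5.0, 100, seed=2)),
]
# A query drawn from the same distribution as the second speaker.
query = synthetic_frames(5.0, 30, seed=3)
ranking = rank_by_distortion(query, gallery)
print(ranking)
```

In this sketch the ranked list produced by `rank_by_distortion` plays the role of the initial retrieval result. A rank-based unsupervised learning method, as described in the abstract, would then take such ranked lists as input and refine them.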