This article reviews research on the use of deep learning in speaker identification. It examines the problems of voice identification and highlights the relevance of, and the need for, effective methods in this area. The evolution of speaker identification techniques, from simple pattern matching to complex neural architectures, is traced to show the technological advances in the field. Modern methods for speaker identification and the prospects for developing such systems are considered. The authors set two aims: to carry out a comparative analysis of deep learning and traditional methods, and to review the current state of the technology. First, they highlight the key differences and advantages of deep learning compared with traditional approaches to speaker identification, describe the challenges of deep learning methods, such as the need for large datasets and computational resources, and analyse how the research community addresses these issues. Then they provide a comprehensive overview of the deep learning methods currently used for speaker identification, including the latest breakthroughs and innovations in neural network architectures, training techniques, and feature extraction methods. Finally, they explore the potential of unsupervised and semi-supervised learning paradigms to further enhance speaker identification systems, offering insights into future research in this field.
Keywords: deep learning, speaker identification, neural networks, recurrent layers, convolutional layers, voice recognition technologies.