Singer Identification by Vocal Parts Detection and Singer Classification Using LSTM Neural Networks

Seyed Kooshan, Hashemi Fard, Rahil Mahdian Toroghi

doi:10.1109/pria.2019.8786009

Abstract

Singer identification is a major field of research in audio signal processing. It has attracted researchers' interest in two main branches: 1) detecting the singing parts of polyphonic music, and 2) identifying the singers. Here, we tackle both problems at the same time. Methods such as GMM, SVM, and HMM have already been applied to the singer identification problem. In this paper, we propose a singer recognition system that incorporates deep learning and feed-forward neural networks, which, to the best of our knowledge, have not previously been used for this purpose. The preprocessing stage involves studying a large set of audio features in order to select the most effective subset for the recognition stage. Our work is divided into multiple stages. First, the vocal frames of all music clips are detected using an LSTM network, which performs well on time-series data such as audio signals. Then, an MLP network is used to classify the singers' gender and is compared with an SVM classifier on the same task. Finally, another LSTM network is used to identify each singer and is compared with an MLP network on the same task. At every step, different classifiers are examined and their results compared; these results confirm the efficacy of our method relative to the state of the art.
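
The staged design described in the abstract can be sketched as a simple data flow. The following is only an illustrative skeleton, not the authors' implementation: the names `identify_singer`, `SingerPrediction`, and `Frame` are hypothetical, the three classifiers are passed in as plain callables standing in for the trained LSTM, MLP, or SVM models, and the abstract does not say whether the gender decision feeds into the singer-ID stage, so the two are kept independent here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

# One feature vector per audio frame (hypothetical layout, not from the paper).
Frame = Sequence[float]


@dataclass
class SingerPrediction:
    vocal_frame_count: int
    gender: Optional[str]
    singer_id: Optional[str]


def identify_singer(
    frames: List[Frame],
    is_vocal: Callable[[List[Frame]], List[bool]],
    classify_gender: Callable[[List[Frame]], str],
    classify_singer: Callable[[List[Frame]], str],
) -> SingerPrediction:
    """Run the three stages in order on the frames of one clip."""
    # Stage 1: per-frame vocal / non-vocal decision (an LSTM in the paper).
    mask = is_vocal(frames)
    if len(mask) != len(frames):
        raise ValueError("vocal detector must return one flag per frame")
    vocal_frames = [f for f, keep in zip(frames, mask) if keep]

    # With no vocal frames there is nothing left to classify.
    if not vocal_frames:
        return SingerPrediction(0, None, None)

    # Stage 2: gender classification (MLP compared with SVM in the paper).
    gender = classify_gender(vocal_frames)
    # Stage 3: singer identification (LSTM compared with MLP in the paper).
    singer = classify_singer(vocal_frames)
    return SingerPrediction(len(vocal_frames), gender, singer)
```

Passing the models in as callables keeps the skeleton independent of any particular framework, so each stage can be swapped (for example, MLP versus SVM in stage 2) without touching the surrounding flow.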

Full Text