Abstract

After entering the digital era, digital music technology has prompted the rise of Internet companies. In the process, Internet music appears to have made some breakthroughs in business models; essentially, however, it has not changed the way music content reaches users. In the past, various traditional and shallow machine learning techniques were used to extract features from musical signals and classify them, but such techniques were costly and time-consuming. In this study, we use a novel deep convolutional neural network (CNN) to extract multiple features from music signals and classify them. First, the harmonic/percussive sound separation (HPSS) algorithm is used to separate the original music signal into harmonic and percussive components, and the spectrograms of these components, together with the original spectrogram, are used as inputs to the CNN. Then, the network structure of the CNN is designed, and the effect of different parameters on the recognition rate is investigated. This approach could fundamentally change the way music content reaches music users and is a disruptive technology application for the industry. Experimental results show that the proposed method achieves a recognition rate of about 73% on the GTZAN dataset with no data expansion.

Highlights

  • Digital music refers to music that has taken shape along with the development of the Internet and is stored and streamed in digital format

  • In December 2019, Beijing promulgated the “Implementation Opinions on Promoting the Prosperous Development of Beijing’s Music Industry,” which corresponded to the formation and development of the digital music industry. The capital city of Beijing gathers the best resources of music enterprises, talents, technology, scientific research, and education in the country. The promulgation and implementation of this opinion are of great strategic significance for China’s digital music industry

  • The harmonic/percussive sound separation (HPSS) algorithm is used to separate the music tracks, and the original tracks are separated into harmonic and percussive sources; the short-time Fourier transform is applied to these two sources and the original tracks, and the transformed spectra are input into the convolutional neural network (CNN) for learning, training, and prediction, and the final output result is the final recognition rate


Introduction

Digital music refers to music that has taken shape along with the development of the Internet and is stored and streamed in digital format. The key to the music style classification problem is the feature extraction of musical information. Feature extraction and classification are usually two separate processing stages in recognition tasks, but this study integrates these two stages to better achieve the interaction between the information. In prior work, a convolutional deep belief network (CDBN) with two hidden layers was trained in an unsupervised manner to activate the hidden layers and generate meaningful features from the preprocessed spectrum; compared with standard Mel-frequency cepstral coefficient (MFCC) features, these deep-learned features achieve higher accuracy. In another approach, the music data are transformed into MFCC feature vectors and fed into a CNN with three hidden layers that automatically extracts image features for classification, which shows that the CNN has a strong ability to capture changing image information. Other research has used brain-decoded focus dynamics to examine how numerous properties of sound affect focus levels in diverse tasks.

Related Work
Experimental Results and Analysis
Conclusion