Abstract

This chapter explores classification algorithms on complex data sets with a large number of features. The data set used in this project is the GTZAN data set, which has 60 quantified features extracted from 10 genres of music. The chapter addresses Music Information Retrieval (MIR) and sheds light on the features involved in audio signal processing, their importance, and ways to model them. Many studies have approached this classification task with complex machine learning models, but the absence of any project that highlights the use of well-known statistical classification methods encouraged us to take this path. By the end of this chapter, the reader will have a detailed understanding of K-nearest neighbors, Fisher linear discriminant analysis, quadratic discriminant analysis, and feedforward neural networks. The chapter also contrasts the results obtained by fitting these classification models to the data set. MIR is one of the fastest-developing fields, with many applications in music informatics; many audio streaming applications use classification techniques extensively to provide their users with the best recommendations and to quantify their subjective music taste. Another popular application of audio signal processing is in the fast-growing podcast and audiobook industry, where use cases include recommending podcasts and building a contextual understanding of books in order to generate useful summaries and narrate them through digital assistants. It is therefore worthwhile to know which techniques are best suited to the job.
