Abstract

The music industry invests huge amounts of money each year in artists and their songs, with the ultimate goal of producing a hit song. Since its creation in 1958, the Billboard weekly Hot 100 chart has been one of the most iconic and reliable sources of hit songs. Using Spotify's Web API and the Genius API, with their massive collections of songs, we gathered the high-level audio features, lyrics, and some temporal features for all the songs that made it onto the Hot 100 chart in the period 1958-2020. Using these features, we analyze the time-varying preferences for what is considered a hit song with a One-Class SVM and conclude that most hit songs are very similar in terms of their high-level audio features and lyric word embeddings. Then, to further support our results and hypothesis, we attempt to build a multi-class classifier using algorithms such as Random Forest, KNN, logistic regression, and Support Vector Machines (SVC with RBF kernel) to predict the position/popularity of a hit song on the Billboard chart. Finally, we discuss why these features may or may not be enough to build a hit song classifier and outline future work toward a better approach to this problem.
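To make the multi-class setup concrete, the following is a minimal sketch of one of the listed algorithms (KNN) predicting a coarse chart-position class from audio features. It is an illustration only, not the authors' pipeline: the position bins (`top10`, `top50`, `top100`), the two-feature vectors, and the toy training data are all assumptions introduced here, and the function names are hypothetical.

```python
import math
from collections import Counter


def position_to_class(position):
    """Map a chart position (1-100) to a coarse class label.

    The bin boundaries are an assumption made for this sketch.
    """
    if position <= 10:
        return "top10"
    if position <= 50:
        return "top50"
    return "top100"


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k nearest training points.

    Distance is plain Euclidean distance over the feature vectors.
    """
    nearest = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy, hand-made feature vectors (e.g. danceability, energy), NOT real data.
train_X = [
    (0.90, 0.90), (0.80, 0.85), (0.85, 0.80),  # songs that peaked near the top
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.15),  # songs that peaked near the bottom
]
train_positions = [3, 7, 9, 88, 95, 72]
train_y = [position_to_class(p) for p in train_positions]

predicted = knn_predict(train_X, train_y, query=(0.88, 0.90), k=3)
```

In this toy data the classes are well separated by construction. If hit songs are in fact very similar in their high-level features, as the abstract concludes, real feature vectors for different classes would overlap heavily and a neighbor vote like this would be far less reliable.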
