Abstract

In this study, a hybrid scheme that combines a Gaussian mixture model (GMM) with the k-means approach, called GMM-kmeans, is proposed for automatic audio segmentation (AAS) of popular music. The structure of a popular song generally consists of verse, chorus, and non-repetitive segments (such as the intro, bridge, and outro). The combined GMM-kmeans scheme, which comprises two main algorithms, GMMAAS and SFS, efficiently divides a song into these three parts. In GMM-kmeans, the GMM classifier first recognizes the vocal segments and then calculates the section boundaries between them and the non-repetitive sections. The vocal segments extracted by the GMM, which contain only the verse and chorus sections, are then analyzed by the k-means clustering algorithm, which further discriminates the verse sections from the chorus sections. In the k-means classification of verse and chorus, the developed switching frame search (SFS) algorithm, which uses a verse group-of-frames (Verse-GoF) and a chorus group-of-frames (Chorus-GoF), accurately estimates the boundary separating the verse and chorus sections. Experimental results obtained from a data set of numerous Chinese popular songs show the superiority of both the proposed GMMAAS and SFS.
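The two-stage flow described above can be illustrated with a minimal sketch. It is not the GMMAAS or SFS algorithm, whose details the abstract does not give. It assumes one scalar feature per frame, takes the vocal/non-vocal decision as a precomputed mask (a stand-in for the GMM classifier), clusters the vocal frames with a plain 2-means, and locates the verse/chorus switch by a majority vote over fixed-size groups of frames. All function names and the `gof` parameter are hypothetical.

```python
def two_means(values, iters=20):
    """Plain k-means with k=2 on scalar features; returns a 0/1 label per value."""
    centers = [min(values), max(values)]  # deterministic init at the extremes
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to the nearer center.
        labels = [0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
                  for x in values]
        # Move each center to the mean of its members.
        for c in (0, 1):
            members = [x for x, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels


def group_vote(labels, gof=5):
    """Majority label of each consecutive group of `gof` frames (hypothetical GoF size)."""
    votes = []
    for start in range(0, len(labels), gof):
        group = labels[start:start + gof]
        votes.append(1 if sum(group) * 2 > len(group) else 0)
    return votes


def segment(frames, vocal_mask, gof=5):
    """Return frame indices where the group-level cluster label switches.

    `vocal_mask` stands in for the output of the GMM stage: True marks a
    frame classified as vocal (verse or chorus), False a non-repetitive frame.
    Assumes at least one vocal frame.
    """
    vocal_idx = [i for i, is_vocal in enumerate(vocal_mask) if is_vocal]
    labels = two_means([frames[i] for i in vocal_idx])
    votes = group_vote(labels, gof)
    # A switch between adjacent groups marks a candidate section boundary.
    switches = [g * gof for g in range(1, len(votes)) if votes[g] != votes[g - 1]]
    return [vocal_idx[s] for s in switches]
```

For example, with a synthetic 60-frame signal made of 10 non-vocal frames, 20 verse-like frames at 1.0 (one outlier at 5.2), 20 chorus-like frames at 5.0, and 10 non-vocal frames, `segment` returns `[30]`, the first chorus frame. The group vote absorbs the single outlier, which is the reason for deciding on groups of frames rather than on individual frames.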
