Abstract
The present paper proposes a new approach for detecting music boundaries, such as the boundary between music pieces or the boundary between a music piece and a speech section for automatic segmentation of musical video data and retrieval of a designated music piece. The proposed approach is able to capture each music piece using acoustic similarity defined for short-term segments in the music piece. The short segmental acoustic similarity is obtained by means of a new algorithm called segmental continuous dynamic programming, or segmental CDP. The location of each music piece and its music boundaries are then identified by referring to multiple similar segments and their location information, avoiding oversegmentation within a music piece. The performance of the proposed method is evaluated for music boundary detection using actual music datasets. The present paper demonstrates that the proposed method enables accurate detection of music boundaries for both the evaluation data and a real broadcasted music program.
Highlights
Hard discs have recently come into widespread use, and the medium of the home video recorder is changing from sequential videotape to media such as random accessible hard discs or DVDs
The present paper proposes a new method for identifying the location of each music piece and detecting the boundaries between music pieces avoiding oversegmentations within a music piece for automatic segmentation of video data
Our preliminary experiments showed that the Gaussian mixture models (GMM), which is a typical method of discrimination between music and voice, could not detect music boundaries in continuous music pieces
Summary
Hard discs have recently come into widespread use, and the medium of the home video recorder is changing from sequential videotape to media such as random accessible hard discs or DVDs. The present paper proposes a new method for identifying the location of each music piece and detecting the boundaries between music pieces avoiding oversegmentations within a music piece for automatic segmentation of video data. A number of studies [4–9] have proposed methods for acoustic segmentation that is primarily based upon the similarity and dissimilarity of local feature vectors The performance in these studies was evaluated based on the correct discrimination ratio of frames [7–9] and not on the correct discrimination ratio of music boundaries. The present paper describes an approach for detecting music boundaries, with the goal of automatic segmentation of video data such as musical programs. The feasibility of the proposed method is verified by experiments on music boundary detection using open music datasets supplied by the RWC project [11], and by applying the method to an actual broadcasted music program
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have