Music source separation (MSS) is the process of splitting a musical piece into individual tracks, one per component. It combines acoustics and machine learning to extract useful data from music, which supports a variety of music information retrieval tasks. In the past decade, many methods have been applied to MSS with varying levels of success. This research explores the use of dynamic time warping (DTW) for MSS tasks in the time domain. DTW is an algorithm that temporally aligns two time series in order to measure their similarity. It does so by stretching or compressing the sequences along the time axis so that the Euclidean distance between them is minimized. This makes DTW a distinctive method for MSS: it operates entirely in the time domain and can classify sounds without differences in timing interfering with the comparison. The research focuses only on the separation of transient, percussive sounds. Measurements taken with a drum kit and a selection of digital drum sounds served as the foundation for tests of the algorithm. The results illustrate the potential of DTW in time-domain MSS applications.
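As a rough illustration of the alignment idea described above, the following is a minimal sketch of the textbook DTW recurrence in Python. It is not the implementation used in this research: the function names `dtw_distance` and `classify`, the use of absolute difference as the per-sample cost (the one-dimensional case of Euclidean distance), and the toy "kick" and "snare" templates are all assumptions made for the example.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two 1-D sequences.

    Local cost is the absolute difference between samples
    (an assumption; this is Euclidean distance in one dimension).
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments:
            # advance in a only, advance in b only, or advance in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(segment, templates):
    """Return the label of the template with the smallest DTW cost to segment.

    `templates` is a hypothetical dict mapping a label to a reference sequence.
    """
    return min(templates, key=lambda label: dtw_distance(segment, templates[label]))


# Toy data, invented for illustration only.
templates = {
    "kick": [0, 3, 1, 0],
    "snare": [0, 1, -1, 1, -1, 0],
}
stretched_kick = [0, 0, 3, 3, 1, 0]  # same shape as "kick", played more slowly
label = classify(stretched_kick, templates)
```

In this sketch the stretched segment aligns with the "kick" template at zero cost because DTW may repeat samples on either side, which is the property that lets timing differences drop out of the comparison. The full cost table grows with the product of the two sequence lengths, so a practical system working on audio-rate signals would likely need a windowing constraint or downsampling.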