Abstract

Tabla is a sophisticated, centuries-old percussion tradition from North India based on timbral sequences. We model these sequences in a predictive framework using Variable-length Markov Models (VLMMs). Using a database containing nearly 30,000 strokes in 35 compositions, we show that VLMMs have high predictive accuracy, with an average perplexity of 1.80 and a median perplexity of 1.19 on a task with 42 distinct symbols. We extend this basic framework with several new smoothing techniques that determine how predictions from models of different orders are integrated, and then with parallel representations of the sequence, a technique known as Multiple Viewpoint modelling. Finally, we address the problem of recognizing strokes from audio. In this hidden setting, the identity of the previous stroke is not revealed at each time step, so a Variable-length Hidden Markov Model (VLHMM) is used to determine the next-symbol distribution from which the perplexity is computed. We detail how the forward probabilities can be computed efficiently for the VLHMM by traversing a prediction suffix tree (PST) used to represent the sequences. Using a VLHMM with a maximum order of 3, we obtain an average perplexity of 2.31 and a median of 1.16 on a nine-target task. To the best of our knowledge, this is the first use of Variable-length Hidden Markov Models for music modelling or prediction.
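To make the perplexity figures above concrete, the following is a minimal sketch of how a variable-order next-symbol predictor can be scored. It is not the smoothing scheme of this work: it simply backs off to the longest context seen in training and applies add-one smoothing, which is an assumption made for brevity. The function names (`train_counts`, `predict`, `perplexity`) and the three-stroke toy sequence are hypothetical and purely illustrative.

```python
import math
from collections import Counter, defaultdict


def train_counts(sequence, max_order):
    """Count next-symbol occurrences for every context of length 0..max_order."""
    counts = defaultdict(Counter)
    for i, symbol in enumerate(sequence):
        for order in range(max_order + 1):
            if i - order < 0:
                break
            context = tuple(sequence[i - order:i])
            counts[context][symbol] += 1
    return counts


def predict(counts, history, symbol, alphabet_size, max_order):
    """Probability of `symbol` given `history`.

    Assumption: use the longest context seen in training, with add-one
    smoothing. This stands in for the smoothing techniques of the paper.
    """
    for order in range(min(max_order, len(history)), -1, -1):
        context = tuple(history[len(history) - order:])
        if context in counts:
            c = counts[context]
            return (c[symbol] + 1) / (sum(c.values()) + alphabet_size)
    # No context at all was seen (untrained model): fall back to uniform.
    return 1.0 / alphabet_size


def perplexity(counts, sequence, alphabet_size, max_order):
    """Perplexity = 2 ** (average negative log2 probability of the true symbols)."""
    log_sum = 0.0
    for i, symbol in enumerate(sequence):
        p = predict(counts, sequence[:i], symbol, alphabet_size, max_order)
        log_sum += math.log2(p)
    return 2 ** (-log_sum / len(sequence))


if __name__ == "__main__":
    # Toy stroke sequence (illustrative only, not taken from the database).
    train = ["dha", "ge", "na"] * 20
    test = ["dha", "ge", "na"] * 5
    model = train_counts(train, max_order=2)
    print(perplexity(model, test, alphabet_size=3, max_order=2))
```

A perplexity of 1 would mean the model always assigns probability 1 to the true next stroke, while a model that guesses uniformly over an alphabet of size K has perplexity K. On that scale, an average of 1.80 over 42 symbols indicates highly predictable sequences.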
