Abstract

Audio measurements are fundamental to daily life and have been used to perform a variety of tasks, including the classification of human sounds (e.g., talking or coughing) and environmental acoustic monitoring. Audio visualization methods have been introduced to represent both the time- and frequency-domain information of a recording. In this paper we introduce Chaos Game Representation (CGR) to investigate possible recurring local and global patterns within audio measurements, both to supplement current audio visualization methods and for possible use in the training and evaluation of learning algorithms. A major challenge in applying CGR within audio space is quantization. Here, we leverage non-uniform μ-law quantization (μ = 255) as the basis for the first quantized audio CGR. We propose a 256-nodal arrangement of the quantized states from an audio measurement for playing the Chaos Game, generating visualizations that capture both local and global sequential time-series information. CGRs were generated for 287,756 individual audio measurements (pure sinusoids, linear and quadratic chirps, and ambient audio measurements from DCASE2018). A typology of visually observable patterns is discussed, describing the relationship between the time-series audio signals and their resulting CGR visualizations. These images may be leveraged for training image-based classifiers on audio classification tasks.
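The pipeline described above (μ-law quantization to 256 states, then the Chaos Game over a 256-node arrangement) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a signal normalized to [-1, 1] and, for concreteness, places the 256 nodes on the unit circle, since the abstract does not specify the geometric arrangement.

```python
import numpy as np

def mu_law_quantize(x, mu=255):
    """Non-uniform mu-law companding followed by 8-bit quantization.

    x: float array in [-1, 1]; returns integer symbols in {0, ..., 255}.
    """
    compressed = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
    # Map the companded range [-1, 1] onto the 256 quantized states.
    return np.clip(((compressed + 1) / 2 * 256).astype(int), 0, 255)

def chaos_game(symbols, n_nodes=256):
    """Play the Chaos Game: for each symbol, move halfway from the
    current position toward that symbol's node.

    Node placement on the unit circle is an illustrative assumption.
    """
    angles = 2 * np.pi * np.arange(n_nodes) / n_nodes
    nodes = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    points = np.empty((len(symbols), 2))
    pos = np.zeros(2)  # start at the origin
    for i, s in enumerate(symbols):
        pos = (pos + nodes[s]) / 2.0
        points[i] = pos
    return points

# Example: CGR point cloud for a pure 440 Hz sinusoid.
t = np.linspace(0, 1, 8000, endpoint=False)
signal = np.sin(2 * np.pi * 440 * t)
pts = chaos_game(mu_law_quantize(signal))
```

Scatter-plotting (or 2-D histogramming) `pts` yields the CGR image; recurring local patterns in the symbol sequence appear as self-similar structure in the point cloud.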
