Full-Band Quasi-Harmonic Analysis and Synthesis of Musical Instrument Sounds with Adaptive Sinusoids

Marcelo Caetano, Yannis Stylianou, Athanasios Mouchtaris, George Kafentzis

doi:10.3390/app6050127

Abstract

Sinusoids are widely used to represent the oscillatory modes of musical instrument sounds in both analysis and synthesis. However, musical instrument sounds feature transients and instrumental noise that are poorly modeled with quasi-stationary sinusoids, requiring spectral decomposition and further dedicated modeling. In this work, we propose a full-band representation that fits sinusoids across the entire spectrum. We use the extended adaptive Quasi-Harmonic Model (eaQHM) to iteratively estimate amplitude- and frequency-modulated (AM–FM) sinusoids able to capture challenging features such as sharp attacks, transients, and instrumental noise. We use the signal-to-reconstruction-error ratio (SRER) as the objective measure for the analysis and synthesis of 89 musical instrument sounds from different instrumental families. We compare against quasi-stationary sinusoids and exponentially damped sinusoids. First, we show that the SRER increases with adaptation in eaQHM. Then, we show that full-band modeling with eaQHM captures partials at the higher frequency end of the spectrum that are neglected by spectral decomposition. Finally, we demonstrate that a frame size equal to three periods of the fundamental frequency results in the highest SRER with AM–FM sinusoids from eaQHM. A listening test confirmed that the musical instrument sounds resynthesized from full-band analysis with eaQHM are virtually perceptually indistinguishable from the original recordings.
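The two quantities the abstract relies on, the SRER and a frame spanning three fundamental periods, can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the commonly used RMS-ratio form of the SRER, 20·log10(RMS(x) / RMS(x − x̂)) in dB, and the function names `rms`, `srer_db`, and `frame_length` are hypothetical.

```python
import math

def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def srer_db(original, reconstruction):
    """SRER in dB, assuming the form 20*log10(RMS(x) / RMS(x - x_hat))."""
    error = [x - y for x, y in zip(original, reconstruction)]
    error_rms = rms(error)
    if error_rms == 0.0:
        return math.inf  # perfect reconstruction leaves no error energy
    return 20.0 * math.log10(rms(original) / error_rms)

def frame_length(f0_hz, sample_rate_hz, periods=3):
    """Number of samples spanning `periods` periods of the fundamental."""
    return int(round(periods * sample_rate_hz / f0_hz))

# Toy check: the reconstruction error is 10% of the signal amplitude
x = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(4410)]
x_hat = [0.9 * s for s in x]
print(srer_db(x, x_hat))         # close to 20 dB
print(frame_length(440, 44100))  # 301
```

A higher SRER means a smaller reconstruction error relative to the signal, which is why the abstract reports SRER gains as evidence that adaptation improves the fit.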

Highlights

Sinusoidal models are widely used in the analysis [1,2], synthesis [2,3], and transformation [4,5] of musical instrument sounds.
Iteration 0 corresponds to the Quasi-Harmonic Model (QHM) initialized with the full-band harmonic template. Figure 4 demonstrates that the adaptation of the sinusoids by the extended adaptive Quasi-Harmonic Model (eaQHM) increases the signal-to-reconstruction-error ratio (SRER) when compared to QHM.
We proposed the full-band quasi-harmonic modeling of musical instrument sounds with adaptive sinusoids.

Summary

Introduction

Sinusoidal models are widely used in the analysis [1,2], synthesis [2,3], and transformation [4,5] of musical instrument sounds. The musical instrument sound is modeled by a waveform consisting of a sum of time-varying sinusoids parameterized by their amplitudes, frequencies, and phases [1,2,3]. Sinusoidal analysis consists of the estimation of parameters, synthesis comprises techniques to retrieve a waveform from the analysis parameters, and transformations are performed as changes of the parameter values. The time-varying sinusoids, called partials, represent how the oscillatory modes of the musical instrument change with time, resulting in a flexible representation with perceptually meaningful parameters. The parameters completely describe each partial, which can be manipulated independently.
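The synthesis and transformation steps described above can be illustrated with a short Python sketch. It is a generic additive-synthesis example, not the eaQHM algorithm: each partial is assumed to carry per-sample amplitude and frequency tracks plus an initial phase, and the names `synthesize_partial` and `synthesize`, the dictionary layout, and the toy parameter values are all hypothetical.

```python
import math

def synthesize_partial(amplitudes, frequencies_hz, initial_phase, sample_rate_hz):
    """Render one partial from per-sample amplitude and frequency tracks."""
    samples = []
    phase = initial_phase
    for a, f in zip(amplitudes, frequencies_hz):
        samples.append(a * math.cos(phase))
        # Accumulate the instantaneous phase from the frequency track
        phase += 2.0 * math.pi * f / sample_rate_hz
    return samples

def synthesize(partials, sample_rate_hz):
    """Sum all partials sample by sample to retrieve the waveform."""
    rendered = [
        synthesize_partial(p["amp"], p["freq"], p["phase"], sample_rate_hz)
        for p in partials
    ]
    return [sum(column) for column in zip(*rendered)]

# Hypothetical toy sound: three harmonics of 220 Hz with decaying amplitudes
sample_rate = 8000
n = 800
partials = []
for k in (1, 2, 3):
    partials.append({
        "amp": [math.exp(-3.0 * k * i / n) / k for i in range(n)],
        "freq": [220.0 * k] * n,
        "phase": 0.0,
    })
signal = synthesize(partials, sample_rate)

# A transformation is a change of parameter values: mute the second partial only
partials[1]["amp"] = [0.0] * n
transformed = synthesize(partials, sample_rate)
```

Because each partial is fully described by its own parameters, muting the second partial leaves the other two untouched, which is the independence property the paragraph refers to.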

Objectives

Methods

Results

Conclusion