Abstract

This paper surveys the uses of non-speech voice as an interaction modality within sonic applications. Three main contexts of use are identified: sound retrieval, sound synthesis and control, and sound design. The paper gives an overview of the choices and techniques concerning the style of interaction, the selection of vocal features, and their mapping to sound features or controls. A comprehensive collection of examples illustrates the use of non-speech voice in actual tools for sonic interaction. While voice-based techniques are already being used proficiently in sound retrieval and sound synthesis, their use in sound design is still in an exploratory phase. An example of the creation of a voice-driven sound design tool is also illustrated.

Highlights

  • The voice is the primary communication channel among humans

  • Such definition represents an issue in the first place: A semi-automatic classification, performed for instance by machine learning algorithms relying on generic audio features, is likely to produce different categories than a classification built on perceptual basis [4]

  • In a sound synthesizer controlled via singing voice [31], descriptors are extracted from the vocal signal via Short-time Fourier Transform (STFT) and classified into four groups according to their use for control: Excitation (F0 and energy), Vocal Tract, Voice Quality and Context
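
As a rough illustration of the Excitation group mentioned above, the following minimal sketch computes a per-frame energy and a crude F0 estimate from STFT magnitudes. It is not the method of [31]: the function names, the frame and hop sizes, and the use of the strongest spectral bin as an F0 estimate are all assumptions made for illustration. Peak picking is not a robust pitch estimator for real voices, and the naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def stft_magnitudes(signal, frame_size=256, hop=128):
    """Return one magnitude spectrum per Hann-windowed frame (hypothetical helper)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    spectra = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_size)]
        # Naive DFT over the non-negative frequency bins (slow, but self-contained).
        spectrum = [
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                    for n in range(frame_size)))
            for k in range(frame_size // 2 + 1)
        ]
        spectra.append(spectrum)
    return spectra


def excitation_descriptors(signal, sample_rate, frame_size=256, hop=128):
    """Per frame, return (f0_hz, energy).

    Assumption: F0 is approximated by the frequency of the strongest
    non-DC bin, which only works for simple, strongly periodic inputs.
    """
    descriptors = []
    for spectrum in stft_magnitudes(signal, frame_size, hop):
        energy = sum(m * m for m in spectrum)
        peak_bin = max(range(1, len(spectrum)), key=lambda k: spectrum[k])
        f0_hz = peak_bin * sample_rate / frame_size
        descriptors.append((f0_hz, energy))
    return descriptors


# Synthetic stand-in for a sustained vocal tone: a 250 Hz sine at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 250 * n / sample_rate) for n in range(1024)]
features = excitation_descriptors(tone, sample_rate)
```

Each entry of `features` pairs an F0 estimate with a frame energy. In a voice-driven synthesizer, such values would then be mapped to synthesis parameters; that mapping is outside the scope of this sketch.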

Summary

Introduction

The voice is the primary communication channel among humans. While speech is considered the most important form of voice communication, non-speech voice is also a means to convey a wide array of information. Mimicking and imitating sounds are typical actions that are intuitively performed by means of non-speech voice. They require no production or recollection of verbal information and, provided that adequate techniques to match the voice to the sounds are made available, vocal imitation is a potentially effective and immediate retrieval strategy. While a structured use of non-speech voice in such a context is still missing, partly due to the lack of an engineered approach to the discipline, past and present research focuses on exploiting non-speech voice to perform fast prototyping in sonic interaction and to facilitate the communication of audio concepts.

Motivations and related work
Sound retrieval
Category-dependent feature selection
Vocal query strategies
Matching strategies
Examples
Query by whistling
Query by beatboxing
Generic sound retrieval from voice imitation queries
Sound synthesis and control
Vocal features
Roles of the voice
Mapping strategies
Extending voice-driven synthesis to audio mosaicing
Auracle
Voice-controlled plucked bass guitar
Singing-driven interfaces for sound synthesizers
Making music through real-time voice timbre analysis
A voice interface for sound generators
Billaboop
Pitch-based commercial applications
The singing tree
Wahwactor
Synthassist
Intuitive sound design using vocal mimicking
Vocalization for sound design
Vocal sketching: a prototype tool for designing multimodal interaction
Using vocal sketching for designing sonic interactions
VOGST project
VocalSketch: vocally imitating audio concepts
SkAT-VG project
Conclusions