Abstract

Micro-task crowdsourcing has emerged as a powerful approach for rapidly collecting user input from a large pool of participants at low cost. While previous studies have investigated the acceptability of the crowdsourcing paradigm for obtaining reliable perceptual scores of audio or video quality, this work examines the suitability of crowdsourcing for collecting voice likability ratings. Voice likability, or voice pleasantness, can be viewed as a social characteristic of the speaker that can shape the listener's attitudes and decisions towards the speaker and their message. The collection of valid voice likability labels is crucial for the successful automatic prediction of likability from speech features. This work presents different auditory tests that collect likability ratings for a common set of 30 voices. These tests are based on direct scaling and on paired comparisons, and were conducted both in the laboratory under controlled conditions (the typical approach) and via crowdsourcing using micro-tasks. Design considerations are proposed for adapting the laboratory listening tests to a mobile-based crowdsourcing platform. The likability scores obtained by the different test approaches are highly correlated. This outcome motivates the use of crowdsourcing for future listening tests, reducing the costs involved in engaging participants and administering the test on-site.
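
As a concrete illustration of how the two test formats might be compared, the minimal Python sketch below uses hypothetical data and helper names (the paper does not specify this exact analysis): it converts paired-comparison preference counts into scale values with a Bradley-Terry model and checks their agreement with direct-scaling means via Spearman's rank correlation.

    import numpy as np
    from scipy.stats import spearmanr

    def bradley_terry(wins, n_iter=200, tol=1e-9):
        # wins[i, j]: how often voice i was preferred over voice j.
        # Standard MM update (Hunter, 2004) for Bradley-Terry strengths.
        n = wins.shape[0]
        p = np.ones(n)
        for _ in range(n_iter):
            p_new = np.empty(n)
            for i in range(n):
                total_wins = wins[i].sum()
                denom = sum((wins[i, j] + wins[j, i]) / (p[i] + p[j])
                            for j in range(n) if j != i)
                p_new[i] = total_wins / denom
            p_new /= p_new.sum()
            if np.max(np.abs(p_new - p)) < tol:
                return np.log(p_new)
            p = p_new
        return np.log(p)  # log-strengths serve as interval-like scale values

    # Hypothetical data for 5 voices: pairwise preference counts and
    # mean direct-scaling ratings (e.g., on a 5-point likability scale).
    rng = np.random.default_rng(42)
    wins = rng.integers(1, 20, size=(5, 5)).astype(float)
    np.fill_diagonal(wins, 0.0)
    direct_means = np.array([3.1, 4.2, 2.5, 3.8, 3.3])

    pc_scale = bradley_terry(wins)
    rho, pval = spearmanr(direct_means, pc_scale)
    print(f"Spearman rho = {rho:.3f} (p = {pval:.3f})")

In an analysis of the study's data, the 30 voices would replace the random counts, and a high rank correlation between the two score sets would indicate agreement between the direct-scaling and paired-comparison results.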
