AI voice between anthropocentrism and posthumanism: Alexa and voice cloning

Domenico Napolitano

doi:10.1386/jivs_00053_1

Abstract

This article deals with the groundbreaking phenomenon of AI voice, highlighting two possible meanings that are often left unproblematized: the voice embedded in AI-based devices and the voice created using AI algorithms. To clarify the distinctions and intersections between these two meanings, the article uses an approach inspired by media archaeology and social constructionism. It argues that AI voice as a social phenomenon is constructed through the interaction of a discursive level of representations and a non-discursive level of material practices and operations. The interaction of these two levels results in a tension between anthropocentrism and posthumanism that is characteristic of AI voice. This tension is investigated through two case studies: the commercial for the smart speaker Amazon Alexa and the phenomenon of ‘voice cloning’. The first is an example of how, at a discursive level, the ‘voice in the machine’ is represented as a way to ‘personify’ AI technology. The second, which consists in the possibility of reproducing the features of an embodied and personal voice, is an example of how the materialization of that cultural idea depends on the technical possibilities and material practices required by data-driven algorithms.

Full Text