Abstract
Rapid technological advances in artificial intelligence are creating opportunities for real-time algorithmic modulations of a person’s facial and vocal expressions, or ‘deep-fakes’. These developments raise unprecedented societal and ethical questions which, despite much recent public awareness, are still poorly understood from the point of view of moral psychology. We report here on an experimental ethics study conducted on a sample of N = 303 participants (predominantly young, western and educated), who evaluated the acceptability of vignettes describing potential applications of expressive voice transformation technology. We found that vocal deep-fakes were generally well accepted in the population, notably in a therapeutic context and for emotions judged otherwise difficult to control, and surprisingly, even if the user lies to their interlocutors about using them. Unlike other emerging technologies such as autonomous vehicles, there was no evidence of a social dilemma in which one would, for example, accept for others what one resents for oneself. The only real obstacle to the massive deployment of vocal deep-fakes appears to be situations where they are applied to a speaker without their knowledge, but even the acceptability of such situations was modulated by individual differences in moral values and attitudes towards science fiction. This article is part of the theme issue ‘Voice modulation: from origin and mechanism to social impact (Part II)’.
Highlights
Human facial and vocal expressions have evolved as signals to inform and manipulate others [1,2]
We report here on an experimental ethics study in which N = 303 online participants evaluated the acceptability of vignettes describing potential applications of expressive voice transformation technology
We found that vocal deep-fakes were generally well accepted, notably in a therapeutic context; when they corrected negative emotions rather than enhanced positive emotions; and when they manipulated a speaker’s production rather than perception
Summary
Human facial and vocal expressions have evolved as signals to inform and manipulate others [1,2]. Because expressive behaviours are often thought to provide genuine cues about the sender’s emotional states [17], the ability to arbitrarily manipulate these displays opens avenues for deception: one may use, for example, a facial filter to fake a smile despite having no intent to affiliate, or a voice transformation to appear more certain than one really is. Technologies able to trigger such unconscious reactions are intrinsically manipulative, as people may not be able to identify the transformation as the cause of their subsequent behaviour. They also raise concerns about transparency, as their deployment in virtual conversations lends itself to situations where a speaker does not know how their interlocutor is hearing or seeing them, i.e. whether a transformation of their own voice or face is being applied without their knowledge.