Abstract

Thousands of species use vocal signals to communicate with one another. Vocalizations carry rich information, yet characterizing and analyzing these complex, high-dimensional signals is difficult and prone to human bias. Moreover, animal vocalizations are ethologically relevant stimuli whose representation by auditory neurons is an important subject of research in sensory neuroscience. A method that can efficiently generate naturalistic vocalization waveforms would offer an unlimited supply of stimuli with which to probe neuronal computations. Although unsupervised learning methods allow for the projection of vocalizations into low-dimensional latent spaces learned from the waveforms themselves, and generative modeling allows for the synthesis of novel vocalizations for use in downstream tasks, we are not aware of any model that combines these tasks to synthesize naturalistic vocalizations in the waveform domain for stimulus playback. In this paper, we demonstrate BiWaveGAN: a bidirectional generative adversarial network (GAN) capable of learning a latent representation of ultrasonic vocalizations (USVs) from mice. We show that BiWaveGAN can be used to generate, and interpolate between, realistic vocalization waveforms. We then use these synthesized stimuli along with natural USVs to probe the sensory input space of mouse auditory cortical neurons. We show that stimuli generated from our method evoke neuronal responses as effectively as real vocalizations, and produce receptive fields with the same predictive power. 
BiWaveGAN is not restricted to mouse USVs but can be used to synthesize naturalistic vocalizations of any animal species and interpolate between vocalizations of the same or different species, which could be useful for probing categorical boundaries in representations of ethologically relevant auditory signals.

NEW & NOTEWORTHY

A new type of artificial neural network is presented that can be used to generate animal vocalization waveforms and interpolate between them to create new vocalizations. We find that our synthetic naturalistic stimuli drive auditory cortical neurons in the mouse equally well and produce receptive field features with the same predictive power as those obtained with natural mouse vocalizations, confirming the quality of the stimuli produced by the neural network.
