Abstract

In this paper, the problem of audio semantic communication over wireless networks is investigated. In the considered model, wireless edge devices transmit large audio data to a server using semantic communication techniques. These techniques allow devices to transmit only the audio semantic information that captures the contextual features of audio signals. To extract the semantic information from audio signals, a wave-to-vector (wav2vec) architecture-based autoencoder is proposed, which consists of convolutional neural networks (CNNs). The proposed autoencoder enables high-accuracy audio transmission with small amounts of data. To further improve the accuracy of semantic information extraction, federated learning (FL) is implemented over multiple devices and a server. Simulation results show that the proposed algorithm converges effectively and reduces the mean squared error (MSE) of audio transmission by a factor of nearly 100 compared to a traditional coding scheme.

Highlights

  • Future wireless networks require high data rates and massive connectivity for emerging applications such as the Internet of Things (IoT) (Saad et al., 2020; Lee et al., 2017; Hu et al., 2021; Al-Garadi et al., 2020; Huang et al., 2021)

  • Both the encoder and the decoder update their parameters with stochastic gradient descent (SGD) once after each batch of data passes through the autoencoder. The training process of each local model is shown in Algorithm 1, where η in (8) is the learning rate

  • We have developed a federated learning (FL) trained model over an audio semantic communication (ASC) architecture in a wireless network
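The per-batch SGD update and the FL training named in the highlights above can be illustrated with a minimal sketch. It is not the paper's Algorithm 1: the CNN-based wav2vec autoencoder is replaced by a toy scalar autoencoder (one encoder weight, one decoder weight), the server is assumed to aggregate local models by a sample-weighted average, and all function names, the learning rate, and the synthetic data are hypothetical choices made for illustration.

```python
import random


def mse(x, y):
    """Mean squared error between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def local_sgd_epoch(params, batches, lr):
    """One pass over a device's data with one SGD update per batch.

    Toy stand-in for the autoencoder (assumption, not the paper's CNN):
    encoder z = w_enc * x, decoder x_hat = w_dec * z.
    """
    w_enc, w_dec = params
    for batch in batches:
        g_enc = g_dec = 0.0
        for x in batch:
            z = w_enc * x
            err = w_dec * z - x  # reconstruction error for this sample
            g_dec += 2 * err * z / len(batch)
            g_enc += 2 * err * w_dec * x / len(batch)
        # Encoder and decoder are both updated once after the batch.
        w_enc -= lr * g_enc
        w_dec -= lr * g_dec
    return (w_enc, w_dec)


def federated_average(local_params, sizes):
    """Sample-weighted average of local parameters (assumed aggregation rule)."""
    total = sum(sizes)
    n_params = len(local_params[0])
    return tuple(
        sum(p[i] * s for p, s in zip(local_params, sizes)) / total
        for i in range(n_params)
    )


def reconstruction_mse(params, data):
    """MSE between the inputs and their encode-then-decode reconstruction."""
    w_enc, w_dec = params
    return mse(data, [w_dec * w_enc * x for x in data])


def run_demo(rounds=30, lr=0.05, seed=0):
    """Train the toy model with FL on synthetic data; return MSE before/after."""
    rng = random.Random(seed)
    # Three hypothetical devices, each holding 32 synthetic "audio" samples.
    devices = [[rng.uniform(-1, 1) for _ in range(32)] for _ in range(3)]
    all_data = [x for d in devices for x in d]
    global_params = (0.5, 0.5)
    before = reconstruction_mse(global_params, all_data)
    for _ in range(rounds):
        local = []
        for data in devices:
            batches = [data[i:i + 8] for i in range(0, len(data), 8)]
            local.append(local_sgd_epoch(global_params, batches, lr))
        global_params = federated_average(local, [len(d) for d in devices])
    after = reconstruction_mse(global_params, all_data)
    return before, after


if __name__ == "__main__":
    before, after = run_demo()
    print(f"MSE before training: {before:.6f}, after: {after:.6f}")
```

Each round, every device starts from the current global parameters, runs its local per-batch updates, and sends the result back for averaging, so only model parameters (not raw audio samples) leave the device in this sketch.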

Summary

INTRODUCTION

Future wireless networks require high data rates and massive connectivity for emerging applications such as the Internet of Things (IoT) (Saad et al., 2020; Lee et al., 2017; Hu et al., 2021; Al-Garadi et al., 2020; Huang et al., 2021). In Xie and Qin (2021), the authors developed a new distributed text semantic communication system for IoT devices and showed that a compression ratio of nearly 20 times can be achieved without any performance degradation. Most of these existing works (Shannon, 1948; Bao et al., 2011; Shi et al., 2020; Guler et al., 2018; Uysal et al., 2021; Xie et al., 2020; Xie and Qin, 2021), which focused on the use of semantic communication for text data processing, did not consider how to extract meaning from audio data.

SYSTEM MODEL AND PROBLEM FORMULATION
ASC Encoder
Wireless Channel
ASC Decoder
ASC Objective
AUDIO SEMANTIC ENCODER AND DECODER
Wav2vec Architecture Based Autoencoder
FL Training Method
Complexity Analysis
SIMULATION AND PERFORMANCE ANALYSIS
CONCLUSION
Findings
DATA AVAILABILITY STATEMENT
