Abstract
In this paper, an adaptive speech streaming method is proposed to improve the perceived speech quality (PSQ) of voice over wireless multimedia sensor networks (WMSNs). The proposed method first estimates the PSQ of the received speech data under different network conditions, which are represented by packet loss rates (PLRs). Simultaneously, it classifies each frame of the speech signal as either an onset or a nononset frame. Based on the estimated PSQ and the speech class, it determines an appropriate bit rate for the redundant speech data (RSD), which are transmitted with the primary speech data (PSD) to help reconstruct the speech signals of any lost frames. In particular, when the estimated PLR is high, the bit rate of the RSD should be increased by decreasing that of the PSD. As a result, the bandwidth of the PSD changes from wideband to narrowband, and an artificial bandwidth extension technique is applied to the decoded narrowband speech. Simulation results show that the proposed method significantly improves the decoded speech quality under packet loss conditions in a WMSN, compared with a decoder-based packet loss concealment method and a conventional redundant speech transmission method.
Highlights
Because of the rapid development of low-power and highly integrated digital electronic technologies, wireless multimedia sensor networks (WMSNs) are capable of retrieving audio and/or video streams as they interconnect sensor nodes equipped with multimedia devices such as cameras and microphones.
A lost speech signal was recovered using the redundant speech data (RSD) under high packet loss rates (PLRs). This method provided improved overall perceived speech quality (PSQ) under various PLRs within an equivalent transmission bandwidth. However, it suffered from degraded PSQ when the speech signal bandwidth changed from wideband to narrowband, because assigning more bit rate to the RSD decreased the bit rate of the primary speech data (PSD).
The proposed method first classified each frame of the input speech signal as either an onset frame or a nononset frame.
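The onset/nononset labelling mentioned in the last highlight can be illustrated with a minimal sketch. The source does not describe the classifier it uses, so the rule below (a jump in short-time frame energy relative to the previous frame) is only an assumed stand-in, and the names `frame_energy`, `classify_frames`, `ratio_threshold`, and `energy_floor` are hypothetical.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame (a list of samples)."""
    return sum(x * x for x in frame) / len(frame)


def classify_frames(frames, ratio_threshold=4.0, energy_floor=1e-6):
    """Label each frame "onset" or "nononset".

    Assumption (not from the paper): a frame is an onset when its energy
    exceeds the previous frame's energy by at least `ratio_threshold`.
    `energy_floor` avoids division by zero on silent frames.
    """
    labels = []
    prev_energy = energy_floor
    for frame in frames:
        energy = max(frame_energy(frame), energy_floor)
        is_onset = energy / prev_energy >= ratio_threshold
        labels.append("onset" if is_onset else "nononset")
        prev_energy = energy
    return labels


# Two silent frames followed by two steady, louder frames.
frames = [[0.0] * 4, [0.0] * 4, [1.0] * 4, [1.0] * 4]
labels = classify_frames(frames)
print(labels)
```

Only the first loud frame is flagged here, because the rule compares each frame with its immediate predecessor; a real classifier would likely use more robust features.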
Summary
Because of the rapid development of low-power and highly integrated digital electronic technologies, wireless multimedia sensor networks (WMSNs) are capable of retrieving audio and/or video streams as they interconnect sensor nodes equipped with multimedia devices such as cameras and microphones. This method provided improved overall PSQ under various PLRs within an equivalent transmission bandwidth. Despite these advantages, it suffered from degraded PSQ when the speech signal bandwidth changed from wideband to narrowband, because assigning more bit rate to the RSD decreased the bit rate of the PSD. To overcome this problem, the method proposed in this paper applies an artificial bandwidth extension (ABE) technique [18] to the decoded narrowband speech, so that the quality of the decoded speech is not degraded by the reduced speech bandwidth.
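The rate trade-off described above, where a fixed transmission budget is split between PSD and RSD and the PSD falls back to narrowband with ABE when losses are high, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's allocation rule: the total budget, the PLR thresholds, the split ratios, and the choice to protect onset frames at a lower PLR are all hypothetical, as are the names `RatePlan` and `plan_rates`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RatePlan:
    psd_kbps: float      # bit rate given to the primary speech data
    rsd_kbps: float      # bit rate given to the redundant speech data
    psd_bandwidth: str   # "wideband" or "narrowband"
    apply_abe: bool      # run bandwidth extension on the decoded speech


def plan_rates(plr, is_onset, total_kbps=24.0, high_plr=0.10, onset_plr=0.05):
    """Split a fixed budget between PSD and RSD for one frame.

    All numeric defaults are illustrative assumptions. The source only
    states that a high estimated PLR shifts bit rate from PSD to RSD,
    which makes the PSD narrowband and triggers ABE at the decoder.
    """
    if not 0.0 <= plr <= 1.0:
        raise ValueError("plr must be in [0, 1]")
    # Assumption: onset frames switch to heavier redundancy earlier.
    threshold = onset_plr if is_onset else high_plr
    if plr >= threshold:
        rsd = total_kbps / 2
        return RatePlan(total_kbps - rsd, rsd, "narrowband", True)
    rsd = total_kbps / 4
    return RatePlan(total_kbps - rsd, rsd, "wideband", False)


low_loss = plan_rates(0.02, is_onset=False)
onset_mid_loss = plan_rates(0.07, is_onset=True)
print(low_loss)
print(onset_mid_loss)
```

In both branches the two rates sum to the same budget, which mirrors the "equivalent transmission bandwidth" constraint mentioned in the summary.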
More From: International Journal of Distributed Sensor Networks