Abstract

To improve the retrieval efficiency and accuracy of existing encrypted speech retrieval methods, as well as the semantic representation and classification performance of speech features, a classification retrieval method for encrypted speech based on deep neural networks (DNN) and deep hashing is proposed. First, the speech files are classified according to their category tags, encrypted with a Rossler chaotic map, and uploaded to the cloud encrypted speech library. Second, the Log-Mel spectrogram features of the original speech are extracted, and the trained convolutional neural network (CNN) and convolutional recurrent neural network (CRNN) are used to extract deep semantic features and generate classification results. Finally, the semantic feature hash code is obtained through the constructed hash function and combined with the category hash code produced by One Hot coding to form the final deep hashing binary code, which is uploaded to the deep hashing index table. At retrieval time, the deep hashing binary code of the query speech is obtained, and the “two-stage” classification retrieval strategy and the normalized Hamming distance algorithm are used to match the semantic feature hash. Experimental results show that the two proposed DNN coding models have excellent feature learning performance and achieve better recall, precision, and retrieval efficiency.
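The construction of the deep hashing binary code described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the learned hash function, so the sketch binarizes real-valued features with a simple threshold, and the function names (`semantic_hash`, `category_hash`, `deep_hash_code`) and the ordering of the two parts of the code are hypothetical.

```python
import numpy as np

def semantic_hash(features, threshold=0.0):
    """Binarize real-valued deep features: 1 where a feature exceeds the
    threshold, 0 otherwise. The thresholding rule is an assumption; the
    paper learns its own hash function."""
    return (np.asarray(features, dtype=float) > threshold).astype(np.uint8)

def category_hash(label, num_classes):
    """One-hot encode an integer category label."""
    code = np.zeros(num_classes, dtype=np.uint8)
    code[label] = 1
    return code

def deep_hash_code(features, label, num_classes):
    """Concatenate the category hash and the semantic feature hash.
    Placing the category part first is an assumption of this sketch."""
    return np.concatenate([category_hash(label, num_classes),
                           semantic_hash(features)])

# Hypothetical 4-dimensional feature vector from the network, class 1 of 3.
code = deep_hash_code([0.3, -1.2, 0.0, 2.5], label=1, num_classes=3)
```

Keeping the category bits separate from the semantic bits is what lets a retrieval step filter by category before comparing semantic hashes.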

Highlights

  • With the continuous advancement of Internet and cloud computing technology, more and more companies and individuals choose to store multimedia data in the cloud

  • To evaluate the performance of the proposed method, this paper conducts experiments on speech data from THCHS-30 [31], a Chinese speech database released by the Center for Speech and Language Technology (CSLT) of Tsinghua University. The sampling frequency is 16 kHz and the sample size is 16 bits; the content consists of 1,000 news fragments with different contents, each speech clip is about 10 seconds long, and the total length of all speech in the database is about 30 hours

  • The experiment is tested with the deep hashing binary codes generated from speech data processed by content preserving operations (CPOs). First, the proposed convolutional neural network (CNN)/convolutional recurrent neural network (CRNN) coding model encodes the speech processed by the various CPOs and its average precision (AP) value is calculated; the mean Average Precision (mAP) is then calculated from the AP values
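The AP and mAP computation mentioned in the last highlight can be illustrated with a short sketch. It assumes the standard information-retrieval definition (AP is the mean of the precision values at each rank where a relevant item appears); this section does not state the exact formula the paper uses, and the function names are hypothetical.

```python
def average_precision(relevance):
    """AP for one ranked result list.

    `relevance` is a sequence of 0/1 flags in ranked order, where 1 marks
    a relevant result. Returns 0.0 if no relevant result is present.
    """
    hits = 0
    precisions = []
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precisions.append(hits / rank)  # precision at this rank
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(relevance_lists):
    """mAP: the mean of the AP values over all queries."""
    if not relevance_lists:
        return 0.0
    return sum(average_precision(r) for r in relevance_lists) / len(relevance_lists)

# Two hypothetical queries, e.g. speech clips processed by two different CPOs.
ap_first = average_precision([1, 0, 1])
ap_second = average_precision([0, 1])
m_ap = mean_average_precision([[1, 0, 1], [0, 1]])
```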


Summary

INTRODUCTION

With the continuous advancement of Internet and cloud computing technology, more and more companies and individuals choose to store multimedia data (text, images, speech, etc.) in the cloud. The main innovations of this paper are as follows: (1) Two end-to-end deep hashing coding models, a CNN model and a CRNN model, are designed to improve the semantic representation of speech features and to generate high-quality deep hashing binary codes, which improves the retrieval accuracy and efficiency of the method; (2) Semantic feature learning and hash coding are integrated into a single learning framework that directly generates the deep feature hash code of the input speech data, while One Hot coding is used to obtain the category hash code, which is added to the deep hashing binary code; this improves the efficiency of deep speech feature extraction and yields high-quality deep hashing binary codes; (3) A "two-stage" classification retrieval strategy is proposed: the category hash is retrieved first to find the candidate set of the same category, and the semantic feature hash is then retrieved within that candidate set, which further improves retrieval efficiency and accuracy.
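The "two-stage" strategy in innovation (3) can be sketched as below. This is a minimal illustration, not the paper's implementation: the index table layout (a list of dictionaries with `id`, `category`, and `semantic` keys), the function names, and the `max_distance` threshold value are all assumptions made for the example.

```python
import numpy as np

def normalized_hamming(a, b):
    """Fraction of bit positions at which two equal-length binary codes differ."""
    a = np.asarray(a, dtype=np.uint8)
    b = np.asarray(b, dtype=np.uint8)
    return np.count_nonzero(a != b) / a.size

def two_stage_retrieve(query_category, query_semantic, index_table, max_distance=0.25):
    """Stage 1: keep entries whose category hash equals the query's.
    Stage 2: score the survivors by normalized Hamming distance on the
    semantic hash and return (distance, id) pairs within the threshold,
    closest first. The threshold value is an assumed example setting."""
    candidates = [entry for entry in index_table
                  if np.array_equal(entry["category"], query_category)]
    scored = [(normalized_hamming(query_semantic, entry["semantic"]), entry["id"])
              for entry in candidates]
    return sorted(pair for pair in scored if pair[0] <= max_distance)

# Hypothetical index table with two categories and 4-bit semantic hashes.
index_table = [
    {"id": "a", "category": [1, 0], "semantic": [1, 0, 1, 1]},
    {"id": "b", "category": [1, 0], "semantic": [0, 1, 0, 0]},
    {"id": "c", "category": [0, 1], "semantic": [1, 0, 1, 1]},
]
matches = two_stage_retrieve([1, 0], [1, 0, 1, 0], index_table)
```

Because stage 1 discards every entry from other categories, the Hamming comparisons in stage 2 run only over the candidate set, which is where the efficiency gain would come from.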

RELATED WORKS
LOG-MEL SPECTROGRAM FEATURE
CHAOTIC ENCRYPTION BASED ON ROSSLER MAP
SYSTEM MODEL
ESTABLISHMENT OF ENCRYPTED SPEECH LIBRARY
DEEP HASHING CODING MODEL CONSTRUCTION
HASH FUNCTION LEARNING
CONSTRUCTION OF DEEP HASHING BINARY CODES
SPEECH RETRIEVAL AND DECRYPTION
EXPERIMENTAL ENVIRONMENT
PERFORMANCE ANALYSIS OF DEEP HASHING CODING MODEL
SYSTEM RETRIEVAL PERFORMANCE ANALYSIS
Proposed Method
SYSTEM RETRIEVAL EFFICIENCY ANALYSIS
Method
CONCLUSIONS AND FUTURE WORK