The authors first employed the fast Fourier transform (FFT) to convert voice inputs into digital signals before integrating a speech recognition solution built on two models: the hidden Markov model (HMM) and the artificial neural network (ANN). To achieve standard-tone identification of voice signals and store speech digitally, the authors then incorporated 2048-bit Rivest-Shamir-Adleman (RSA) encryption to encrypt and decrypt the digital speech. The authors’ development team built the program using 256-bit Advanced Encryption Standard in Galois/Counter Mode (AES-GCM) encryption to ensure the application’s effectiveness. The authors successfully created a voice recognition application based on the HMM and ANN. The results suggest that the authors’ secure speech recognition program (named soft voice - RSA) improves on safety, confidentiality of speech material, and speed. It takes roughly 0.2 s to generate a 2048-bit RSA key pair that exceeds the National Institute of Standards and Technology (NIST) standard, 700-1070 ms to process speech, 1-4 ms for 2048-bit RSA encryption, and 6-8 ms for 2048-bit RSA decryption.
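
The abstract does not describe how the FFT stage is implemented. As a rough illustration only, the sketch below shows the general idea of moving a sampled audio frame into the frequency domain with a textbook radix-2 FFT and reading off its strongest frequency. The function names, the 8 kHz sample rate, the 1024-sample frame, and the synthetic 500 Hz test tone are all assumptions made for this example and are not taken from the authors' program.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])  # transform of even-indexed samples
    odd = fft(samples[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the strongest bin below the Nyquist limit."""
    spectrum = fft(samples)
    half = len(samples) // 2
    peak_bin = max(range(1, half), key=lambda k: abs(spectrum[k]))
    return peak_bin * sample_rate / len(samples)


# Hypothetical input: one frame of a pure 500 Hz tone (assumed values).
SAMPLE_RATE = 8000
FRAME_SIZE = 1024
frame = [math.sin(2 * math.pi * 500 * t / SAMPLE_RATE) for t in range(FRAME_SIZE)]
print(dominant_frequency(frame, SAMPLE_RATE))  # prints 500.0
```

A real recognition front end would typically window overlapping frames and pass derived spectral features to the HMM/ANN models; that step is outside what the abstract specifies and is not sketched here.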