Context. The paper addresses the topical problem of extracting biometric characteristics of a user of a voice authentication system, which can significantly increase the system's reliability. Formant information is estimated from the voice signal; it forms part of the user template in a voice authentication system and is widely used in other speech processing applications, including in the presence of interfering noise components. A distinguishing feature of the work is that it investigates a polyharmonic signal.

Objective. The purpose of the work is to develop procedures for generating formant information by calculating the autocorrelation function of the analyzed fragment of the voice signal and then performing spectral analysis of the result.

Method. Procedures for generating formant information during digital processing of a voice signal are proposed. First, the autocorrelation function of the analyzed fragment of the voice signal is calculated. The amplitude-frequency spectrum is then calculated from the autocorrelation estimate, and the formant information is extracted from this spectrum, for example, by threshold processing. When the signal-to-noise ratio of the analyzed fragment is low, it is advisable to calculate the autocorrelation function iteratively, which increases the signal-to-noise ratio and the efficiency of formant extraction. However, each additional iteration requires more computational resources, because the amount of processed data doubles at each iteration.

Results. The developed procedures for generating formant information were investigated on both model and experimental voice signals. The model signals had a low signal-to-noise ratio. The proposed procedures make it possible to determine the spectral width of the extracted formant frequencies more precisely and to significantly increase the number of extracted formants, including at low signal-to-noise ratios.

Conclusions. The model experiments confirmed the performance and reliability of the proposed procedures for extracting formant information from both model and experimental voice signals. Based on the results, the procedures can be recommended for practical use in voice authentication, speaker differentiation, speech and gender recognition, intelligence, counterintelligence, forensics and forensic examination, and medicine (diseases of the vocal tract and hearing). Further research may include developing procedures for evaluating formant information based on phase data of the processed voice signal.
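The processing chain outlined in the Method part (autocorrelation, optional repeated autocorrelation, amplitude spectrum, threshold processing) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the sampling rate, the test frequencies, the Hann window, and the relative threshold are all assumptions chosen for illustration, and the model signal is a hypothetical polyharmonic frame with added white noise.

```python
import numpy as np


def make_model_signal(fs=8000.0, n=1024, freqs=(500.0, 1500.0, 2500.0),
                      noise_std=1.0, seed=0):
    """Hypothetical polyharmonic frame: unit-amplitude sinusoids plus white noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(n) / fs
    clean = sum(np.sin(2.0 * np.pi * f * t) for f in freqs)
    return clean + noise_std * rng.standard_normal(n)


def iterated_autocorrelation(frame, iterations=1):
    """Apply the full linear autocorrelation `iterations` times.

    Each pass maps M samples to 2*M - 1 samples, so the amount of data
    roughly doubles per iteration.
    """
    r = np.asarray(frame, dtype=float)
    r = r - r.mean()                      # remove the DC offset of the input frame
    for _ in range(iterations):
        r = np.correlate(r, r, mode="full")
        peak = np.max(np.abs(r))
        if peak > 0:
            r = r / peak                  # normalise to keep values bounded
    return r


def amplitude_spectrum(r, fs):
    """Amplitude spectrum of the (symmetric) autocorrelation sequence."""
    windowed = r * np.hanning(len(r))     # assumed window, centred on zero lag
    spec = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(r), d=1.0 / fs)
    return freqs, spec


def threshold_peaks(freqs, spec, rel_threshold=0.2):
    """Frequencies of local maxima that reach rel_threshold * max(spec)."""
    limit = rel_threshold * spec.max()
    mid = spec[1:-1]
    is_peak = (mid > spec[:-2]) & (mid >= spec[2:]) & (mid >= limit)
    return freqs[1:-1][is_peak]


def extract_formant_candidates(frame, fs, iterations=1, rel_threshold=0.2):
    """Autocorrelation -> amplitude spectrum -> threshold processing."""
    r = iterated_autocorrelation(frame, iterations)
    freqs, spec = amplitude_spectrum(r, fs)
    return threshold_peaks(freqs, spec, rel_threshold)


if __name__ == "__main__":
    x = make_model_signal()
    for k in (1, 2):
        print(k, extract_formant_candidates(x, fs=8000.0, iterations=k))
```

One design consequence worth keeping in mind: the spectrum of a linear autocorrelation is the squared magnitude spectrum of its input, so every extra iteration suppresses the noise floor further but also shrinks weaker harmonics relative to the strongest one. In this sketch the relative threshold would likely need to be lowered when the components differ much in amplitude.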