Abstract

This paper describes numerical experiments in which selected sounds and words of human speech are decomposed into separate waves with slowly drifting amplitudes, frequencies, and phases and then summed back together, in order to identify which factors are important and which are unimportant for automatic speech recognition. The objective of the study is to investigate the mathematical features of various sounds and words of human speech without using Fourier transforms. Instead, an approximation method developed earlier by the author is used. This method expands periodic or almost periodic functions into a sum of modes with slowly varying (drifting) parameters: amplitudes, frequencies, and phases. Such decompositions were carried out for samples of vowel sounds, simple syllables, and words. The drifting modes were then summed back together. Before summation, the mode parameters were deliberately distorted in order to identify which factors are significant and which are insignificant for the essence of the sounds. The functions obtained in this way are artificial sound functions. It turned out that, for vowel sounds, the mode amplitudes can be averaged over long time intervals without losing the essence of the sound. The phases can likewise be shifted by adding an arbitrary random constant without losing the essence of the sound. In many cases it is convenient to determine the parameters not from the sound function itself but from its time derivative. It was also shown that the amplitudes of the summed modes of a sound function can be represented as a sum of several Gaussian functions, both for simple sounds and for syllables. The corresponding mathematical formulas and tables of parameters of the artificial sound functions are presented.
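The reverse summation described in the abstract can be illustrated with a minimal sketch. It is not the author's approximation method, which the abstract does not specify. It only assumes that each mode has the form a_k(t) cos(2π f_k t + φ_k), that each amplitude a_k(t) is a sum of Gaussian functions, and, as a simplification, that the frequencies are constant rather than slowly drifting. The function names and all numerical parameters below are hypothetical and chosen for illustration only.

```python
import math
import random


def gaussian_envelope(t, bumps):
    """Amplitude at time t as a sum of Gaussians given as (height, center, width)."""
    return sum(h * math.exp(-((t - c) / w) ** 2) for h, c, w in bumps)


def synthesize(times, modes, phase_shifts=None):
    """Sum the modes a_k(t) * cos(2*pi*f_k*t + phi_k + shift_k) at each time.

    Each mode is a tuple (bumps, frequency_hz, phase_rad). The optional
    phase_shifts list adds one constant offset per mode, mimicking the
    deliberate phase distortion mentioned in the abstract.
    """
    if phase_shifts is None:
        phase_shifts = [0.0] * len(modes)
    signal = []
    for t in times:
        value = 0.0
        for (bumps, freq, phase), shift in zip(modes, phase_shifts):
            angle = 2 * math.pi * freq * t + phase + shift
            value += gaussian_envelope(t, bumps) * math.cos(angle)
        signal.append(value)
    return signal


# Hypothetical parameters, not taken from the paper's tables.
sample_rate = 8000
times = [n / sample_rate for n in range(1600)]  # 0.2 s of signal
modes = [
    ([(1.0, 0.08, 0.04), (0.5, 0.15, 0.03)], 220.0, 0.0),
    ([(0.6, 0.10, 0.05)], 440.0, 0.3),
    ([(0.3, 0.10, 0.05)], 660.0, 1.1),
]

original = synthesize(times, modes)

# Distort the phases by a random constant per mode before summation.
rng = random.Random(0)
shifts = [rng.uniform(0.0, 2 * math.pi) for _ in modes]
distorted = synthesize(times, modes, shifts)
```

The two waveforms differ sample by sample, yet they share the same mode amplitudes and frequencies. Whether they sound alike is the perceptual question the study addresses, and this sketch does not settle it.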
