Abstract
The signal characteristics of speech collected over a microphone depend on the distance between the speaker and the sensor, and also on the presence of other background acoustic sources. In the present work, the scenario is termed foreground if the microphone is kept close to the speaker, and distant otherwise. Even though the presence of other background acoustic sources affects the signal characteristics in both cases, the foreground scenario offers some advantages, because the proximity of the desired speaker to the microphone results in a better manifestation of speech characteristics. The fundamental question is: when can the collected speech be termed foreground speech? In this work, a definition of foreground speech is established by considering epochs as the basis. Epochs are the instants of significant excitation, and the regions around them are less affected by interfering sources than other regions of the speech signal. Hence, an attempt is made to substantiate the notion of foreground speech based on the characteristics of epochs. The interfering background sources can still have an adverse effect on foreground speech, especially when their amplitudes are at comparable levels. Consequently, a temporal processing method is suggested that uses epochs as anchor points to perceptually enhance the foreground speech and to minimize the effect of background sources.
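As a rough illustration of what temporal processing anchored at epochs could look like, the sketch below builds a weight function that equals 1 at each epoch and decays to a floor value away from it, then multiplies the signal by that weight. This is only a minimal sketch under stated assumptions: the epoch locations are assumed to be already available as a non-empty list of sample indices, and the Gaussian shape, the `sigma` and `floor` parameters, and the function names `epoch_weight` and `emphasize_epochs` are hypothetical choices made for illustration, not details taken from the work itself.

```python
import numpy as np


def epoch_weight(num_samples, epochs, sigma, floor=0.2):
    """Weight that is 1.0 at each epoch and decays towards `floor` elsewhere.

    `epochs` is assumed to be a non-empty sequence of sample indices.
    The Gaussian shape is an illustrative assumption.
    """
    n = np.arange(num_samples)[:, None]   # sample indices as a column
    e = np.asarray(epochs)[None, :]       # epoch indices as a row
    # One Gaussian bump per epoch; keep the largest bump at every sample.
    bumps = np.exp(-0.5 * ((n - e) / sigma) ** 2).max(axis=1)
    return floor + (1.0 - floor) * bumps


def emphasize_epochs(signal, epochs, sigma=8.0, floor=0.2):
    """Scale the signal so that regions around epochs are kept and
    regions far from any epoch are attenuated down to `floor`."""
    signal = np.asarray(signal, dtype=float)
    return signal * epoch_weight(len(signal), epochs, sigma, floor)
```

With a weight of this kind, samples near the epochs of the foreground speaker pass through almost unchanged, while samples between epochs, where background sources are relatively more prominent, are attenuated. The width `sigma` would need to be chosen relative to the pitch period and the sampling rate.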