Abstract

In the preceding paper, we proposed a method for auditory scene analysis in which the instantaneous frequency, frequency change rate, and amplitude change rate in time‐frequency space are accumulated into a multipeak probability density distribution by a voting method, and the grouping of mixed sounds into streams is realized. In this paper, as the core of the second half of this method, we introduce the assumption that the stream parameters vary slowly according to known dynamics and propose an integration method along the time axis, in which the probability density distribution of the stream parameters is optimally estimated as a time series by a nonparametric Kalman filter. This realizes mechanisms of higher-level auditory scene analysis, such as improving the accuracy of the stream parameters, interpolating and connecting breaks in the streams, and introducing a priori knowledge into stream selection. Moreover, a separation and reconstruction system for the sounds corresponding to the streams is constructed, and the proposed technique is verified by fundamental experiments on synthesized sounds as well as musical sounds and voices. © 2002 Wiley Periodicals, Inc. Syst Comp Jpn, 33(10): 83–94, 2002; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/scj.1160
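The time-axis integration described in the abstract can be illustrated with a minimal sketch: a grid-based Bayesian filter, one common nonparametric analogue of the Kalman filter. This is not the paper's implementation; the bin count, drift kernel width, and vote likelihoods below are illustrative assumptions. The predict step diffuses the density under the assumed slow-dynamics model, and the update step multiplies it by the frame's vote histogram and renormalizes.

```python
import numpy as np

def gaussian_kernel(width, sigma):
    """Transition kernel modeling slowly drifting stream parameters."""
    x = np.arange(-width, width + 1, dtype=float)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def grid_filter_step(prior, votes, kernel, eps=1e-12):
    """One predict/update cycle on a discretized parameter axis.

    prior  -- current density over parameter bins (sums to 1)
    votes  -- nonnegative vote histogram from the current frame
    """
    # Predict: convolve with the drift kernel (slow-dynamics assumption).
    predicted = np.convolve(prior, kernel, mode="same")
    predicted /= predicted.sum()
    # Update: weight by the frame's votes (used as a likelihood),
    # then renormalize; eps keeps zero-vote bins from vanishing.
    posterior = predicted * (votes + eps)
    return posterior / posterior.sum()

# Toy demo: a vote peak drifting from bin 30 to bin 34 across frames,
# with a small amount of background clutter added to each frame.
bins = 64
density = np.full(bins, 1.0 / bins)            # flat prior
kernel = gaussian_kernel(width=5, sigma=1.5)
rng = np.random.default_rng(0)
for center in [30, 31, 32, 33, 34]:
    votes = np.exp(-0.5 * ((np.arange(bins) - center) / 2.0) ** 2)
    votes += 0.05 * rng.random(bins)           # clutter votes
    density = grid_filter_step(density, votes, kernel)
print(int(np.argmax(density)))  # peak tracks the drifting parameter
```

Because the posterior persists between frames, a frame with missing or weak votes leaves the predicted density largely intact, which is one way the abstract's "interpolation and connection of breaks in the streams" can be understood.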
