Abstract
We propose a framework for arranging audio objects in recorded music using artificial intelligence (AI) to anticipate the preferences of individual listeners. The signals of audio objects, such as the guitar and drum tracks in a piece of music, are re-synthesized to provide each listener's preferred spatial arrangement. Deep learning-based noise suppression ratio estimation is used to enhance audio objects from the mixed signals. Neural networks are tuned for each audio object in advance, and noise suppression ratios are estimated for each frequency band and time frame. After each audio object is enhanced, the objects are re-synthesized as stereo sound, with the positions of the audio objects and the listener serving as synthesis parameters. Each listener supplies simple feedback on his or her preferred audio object arrangement through a graphical user interface (GUI). Using this feedback, the synthesis parameters are stochastically optimized in accordance with each listener's preferences. The system recommends dozens of customized synthesis parameter sets, which the listener can then adjust using a one-dimensional slider bar. In several tests, the proposed framework scored high marks in subjective evaluations of whether the recommended arrangements were indeed preferable.
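The two core signal-processing steps described above, applying per-band, per-frame suppression ratios to enhance an object from the mixture and re-synthesizing it in stereo, can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the suppression ratios are given as input rather than estimated by the per-object neural networks, and constant-power panning stands in for the position-based stereo synthesis.

```python
import numpy as np

def enhance_object(mixture_spec, suppression_ratio):
    """Apply suppression ratios (values in [0, 1], one per frequency
    band and time frame) to a mixture spectrogram. In the paper's
    framework these ratios come from a neural network tuned for the
    target object; here they are simply supplied (assumption)."""
    return mixture_spec * suppression_ratio

def pan_stereo(mono, azimuth):
    """Constant-power panning as a stand-in for position-based
    re-synthesis (assumption): azimuth in [-1, 1], where -1 is hard
    left and +1 is hard right."""
    theta = (azimuth + 1.0) * np.pi / 4.0  # map [-1, 1] -> [0, pi/2]
    return np.cos(theta) * mono, np.sin(theta) * mono

# Toy 4-band, 3-frame mixture magnitude spectrogram.
mixture = np.ones((4, 3))
ratios = np.tile([1.0, 0.5, 0.0], (4, 1))  # keep, halve, suppress
enhanced = enhance_object(mixture, ratios)

# Place the enhanced object at the centre of the stereo image.
mono = np.array([1.0, 1.0])
left, right = pan_stereo(mono, 0.0)
```

Constant-power panning keeps the sum of the squared channel gains equal to one, so the perceived loudness of the object stays roughly constant as its azimuth parameter is varied.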