Abstract

Neural networks are becoming increasingly prevalent because of their ability to represent complex relationships and solve difficult problems. However, deploying these models in systems that require low-latency output can be challenging, especially for practitioners accustomed to developing models in controlled environments such as Python notebooks. A further obstacle is the high computational cost of complex models, which places a floor on the achievable latency. This paper presents approaches for deploying models in audio applications, discusses the advantages and disadvantages of each approach, and describes strategies for reducing the inference cost of models without significantly sacrificing accuracy, using techniques such as model quantization. To illustrate these methods, example implementations of real-time beamforming deconvolution and real-time music DSP processing are shown.
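
As a concrete illustration of the kind of inference-cost reduction the abstract mentions, the sketch below applies post-training dynamic quantization to a small PyTorch model. The model architecture and feature sizes are hypothetical and not taken from the paper; this is a minimal example of the general technique, not the authors' implementation.

import torch
import torch.nn as nn

# Hypothetical small network standing in for an audio-processing model;
# the layer sizes are illustrative only.
model = nn.Sequential(
    nn.Linear(256, 512),
    nn.ReLU(),
    nn.Linear(512, 256),
)
model.eval()

# Post-training dynamic quantization: weights of the listed module types
# are stored as int8 and dequantized on the fly at inference time,
# reducing memory traffic and per-inference compute.
quantized_model = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# Inference on a single (randomly generated) frame of features.
with torch.no_grad():
    frame = torch.randn(1, 256)
    output = quantized_model(frame)

In a real deployment the accuracy of the quantized model would be checked against the floating-point original on representative audio data before it replaces the full-precision model in the low-latency path.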
