Deep audio representations, also known as embeddings, have recently become a popular alternative to conventional features such as spectrograms for a wide range of audio classification tasks because of their domain-agnostic character and reduced training costs. Still, their use is often limited to computationally powerful systems because the embeddings must be extracted from large networks. This paper aims to minimize the computational cost of embedding extraction by distilling the knowledge of the OpenL3 audio network into a smaller student network. Results show that the student network maintains performance comparable to that of the teacher network on various music and ambient noise classification tasks, while reducing the network size by over 90\% and the computational load by a factor of five.