Abstract

Despite the remarkable progress of deep learning in speech recognition and music processing, classifying general audio signals remains challenging because samples are costly to collect and annotate. Deep metric learning can learn discriminative features from a small dataset, which makes it a promising method for general audio classification. However, because informative sample pairs are difficult to mine, it often converges slowly or settles in poor local minima. In this letter, to improve classification performance by exploiting the advantages of both weight-based and metric-based losses, we propose a multi-positive metric loss and a framework that combines it with the common softmax loss. The proposed method eliminates the need for sub-loss weighting by measuring the similarity between samples in a consistent probabilistic form. It also enhances classification performance by using multiple positive samples to improve the estimation of intra-class and inter-class relationships. Finally, we evaluate the proposed method on the ShipsEar dataset and the Ocean Networks Canada dataset, and the results verify its effectiveness.
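
The abstract does not give the loss in closed form, so the following is only a minimal sketch of one plausible reading, not the authors' formulation. It assumes the metric term is a softmax over anchor-to-sample dot-product similarities, averaged over all positives in the batch, so that both terms are negative log-probabilities and can be summed without a weighting coefficient. All function names and the `temperature` parameter are hypothetical.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of floats."""
    m = max(scores)
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - lse for s in scores]


def softmax_loss(logits, label):
    """Weight-based term: cross-entropy of class logits against the true label."""
    return -log_softmax(logits)[label]


def multi_positive_loss(anchor, others, other_labels, anchor_label, temperature=1.0):
    """Metric-based term (assumed form): softmax over anchor-to-sample
    similarities, with the negative log-probability averaged over every
    sample that shares the anchor's label."""
    sims = [sum(a * o for a, o in zip(anchor, other)) / temperature
            for other in others]
    log_probs = log_softmax(sims)
    positives = [lp for lp, y in zip(log_probs, other_labels) if y == anchor_label]
    if not positives:
        return 0.0  # no positive in the batch: the term contributes nothing
    return -sum(positives) / len(positives)


def joint_loss(logits, label, anchor, others, other_labels):
    """Unweighted sum: both terms are negative log-probabilities (assumption)."""
    return softmax_loss(logits, label) + multi_positive_loss(
        anchor, others, other_labels, label)
```

Because each term is expressed on the same probabilistic scale, the sketch adds them directly. A conventional combination of a softmax loss with a distance-based metric loss would instead need a tuned weight between the two.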
