Abstract

Deep learning models, which require vast amounts of training data, struggle to achieve good animal sound classification (ASC) performance when data are scarce. Among recent few-shot ASC methods for addressing the data shortage problem for animals that are difficult to observe, model-agnostic meta-learning (MAML) has shown new possibilities by encoding common prior knowledge derived from different tasks into the parameter initialization used for target tasks. However, when knowledge about animal sounds is difficult to generalize because of its diversity, MAML exhibits poor ASC performance owing to its static initialization. In this paper, we propose a novel task-adaptive parameter transformation (TAPT) scheme for few-shot ASC. TAPT generates transformation variables while learning common knowledge and uses these variables to specialize the model parameters to the target task. Owing to this transformation, TAPT reduces overfitting and improves adaptability, training speed, and performance on heterogeneous tasks compared to MAML. In experiments on two public datasets with the same backbone network, we show that TAPT outperforms existing few-shot ASC schemes in terms of classification accuracy, achieving in particular a 20.32% improvement over the state-of-the-art scheme. In addition, we show that TAPT is robust to hyperparameter choices and efficient to train.
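To illustrate the general idea of a task-adaptive parameter transformation on top of a MAML-style inner loop, the sketch below is a minimal, hypothetical example and not the authors' exact TAPT implementation: the task encoder, layer sizes, and the scale-and-shift form of the transformation are assumptions made for illustration only.

```python
# Minimal sketch (assumed, not the paper's TAPT code): a shared initialization is
# transformed by task-specific variables before MAML-style gradient adaptation.
import torch
import torch.nn as nn
import torch.nn.functional as F

FEAT_DIM, N_WAY = 64, 5  # hypothetical feature size and classes per task

# Shared (meta-learned) initialization for the task-specific classifier head.
init_w = nn.Parameter(torch.randn(N_WAY, FEAT_DIM) * 0.01)
init_b = nn.Parameter(torch.zeros(N_WAY))

# Task encoder: maps the mean support embedding to per-dimension scale/shift
# "transformation variables" that specialize the shared initialization.
task_encoder = nn.Sequential(
    nn.Linear(FEAT_DIM, 128), nn.ReLU(),
    nn.Linear(128, 2 * FEAT_DIM),  # -> [scale | shift]
)

def adapt_to_task(support_x, support_y, inner_lr=0.1, inner_steps=3):
    """Return task-specific weights after transformation + gradient adaptation."""
    # 1) Generate transformation variables from the support set.
    ctx = support_x.mean(dim=0)                 # (FEAT_DIM,)
    scale, shift = task_encoder(ctx).chunk(2)   # each (FEAT_DIM,)
    # 2) Transform the shared initialization into a task-specific one
    #    (unlike MAML, which starts every task from the same static init).
    w = init_w * (1.0 + scale) + shift          # broadcast over classes
    b = init_b.clone()
    # 3) MAML-style inner-loop adaptation starting from the transformed init.
    for _ in range(inner_steps):
        logits = F.linear(support_x, w, b)
        loss = F.cross_entropy(logits, support_y)
        gw, gb = torch.autograd.grad(loss, (w, b), create_graph=True)
        w, b = w - inner_lr * gw, b - inner_lr * gb
    return w, b

# Usage example: one 5-way, 1-shot task with random embeddings as placeholders.
support_x = torch.randn(N_WAY, FEAT_DIM)
support_y = torch.arange(N_WAY)
w_task, b_task = adapt_to_task(support_x, support_y)
query_logits = F.linear(torch.randn(10, FEAT_DIM), w_task, b_task)
```

Because the transformation variables depend on the support set, each task adapts from its own specialized starting point rather than a single static initialization, which is the behavior contrasted with MAML in the abstract.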
