Abstract

The domestic cat (Felis catus) is one of the most attractive pets in the world, and it generates mysterious kinds of sound according to its mood and situation. In this paper, we deal with the automatic classification of cat sounds using machine learning. A machine learning approach to classification requires class-labeled data, so our work starts with building a small dataset named CatSound across 10 categories. Along with the original dataset, we increase the amount of data with various audio data augmentation methods to help our classification task. In this study, we use two types of learned features from deep neural networks: one from a convolutional neural net (CNN) pre-trained on music data and applied by transfer learning, and the other from an unsupervised convolutional deep belief network (CDBN) that is trained solely on a collected set of cat sounds. In addition to conventional global average pooling (GAP), we propose an effective pooling method called frequency division average pooling (FDAP) to explore a number of meaningful features. In FDAP, the frequency dimension is roughly divided and then average pooling is applied in each division. For the classification, we exploit five different machine learning algorithms and an ensemble of them. We compare the classification performances with respect to the following factors: the amount of data increased by augmentation, the learned features from the pre-trained CNN or the unsupervised CDBN, conventional GAP or FDAP, and the machine learning algorithms used for the classification. As expected, the proposed FDAP features, with the larger amount of data increased by augmentation and combined with the ensemble approach, produce the best accuracy. Moreover, both the learned features from the pre-trained CNN and those from the unsupervised CDBN produce good results in the experiment. Therefore, with the combination of all those positive factors, we obtained the best result of 91.13% in accuracy, 0.91 in f1-score, and 0.995 in area under the curve (AUC) score.
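
The difference between the two pooling schemes named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the (channels, frequency bins, time frames) layout, and the choice of four frequency divisions are assumptions made only for illustration, since the abstract does not state them.

```python
import numpy as np


def global_average_pool(feature_map):
    """GAP: average over frequency and time, giving one value per channel.

    Assumed layout (hypothetical): (channels, freq_bins, time_frames).
    """
    return feature_map.mean(axis=(1, 2))


def frequency_division_average_pool(feature_map, num_divisions=4):
    """FDAP sketch: split the frequency axis into bands and average each band.

    num_divisions is an illustrative value, not one taken from the paper.
    The output concatenates one per-channel average for each band, so its
    length is channels * num_divisions.
    """
    # array_split tolerates a frequency axis that does not divide evenly.
    bands = np.array_split(feature_map, num_divisions, axis=1)
    pooled = [band.mean(axis=(1, 2)) for band in bands]
    return np.concatenate(pooled)


# Random stand-in for a CNN or CDBN feature map: 32 channels, 12 frequency
# bins, 20 time frames.
rng = np.random.default_rng(0)
fmap = rng.standard_normal((32, 12, 20))

gap_features = global_average_pool(fmap)
fdap_features = frequency_division_average_pool(fmap, num_divisions=4)

print(gap_features.shape)   # (32,)
print(fdap_features.shape)  # (128,)
```

Under these assumptions GAP collapses each channel to a single number, while the FDAP sketch keeps one average per frequency band, so a classifier receives a longer feature vector that retains coarse frequency information.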

Introduction

The sound generation and perception systems of animals have evolved to help them survive in their environment. The intentional sounds that animals generate should be distinct from the random sounds of the environment. Compared to human beings, some animals have special sensory capabilities, such as vision, feeling, and awareness of natural changes. Animal sounds can be helpful to human beings in terms of security, prediction of natural disasters, and intimate interactions if we are able to recognize them properly. The data-driven machine learning approach for acoustic signals has been of great interest to researchers in recent years, and some studies have been conducted on animal sound classification [1,2]. Several studies [6,7,8,9,10,11,12]
