Developing a Deep Learning Sound Classification System for a Smart Farming

Oleksii Kudin, Anastasiia Kryvokhata, Vitaliy Ivanovich Gorbenko

doi:10.1149/ma2020-01261853mtgabs




Abstract

The Internet of Things (IoT) and Machine Learning (ML) are promising technologies for automation in different domains, e.g. Smart Farming. Such systems could be used for measuring noise emission, detecting and classifying animals, monitoring the state of bees in a beehive, etc. As a rule, the requirements for such systems are somewhat conflicting: compactness (deployment on single-board computers) versus real-time classification and prediction. Therefore, software for IoT and ML applications should be efficient and undemanding of computing resources. This work focuses on a machine hearing system that classifies natural sounds in a Smart Farming software application.
The original acoustic signal is split into frames of a given length. The aim of the feature extraction stage is to obtain a compact representation of the acoustic characteristics of the signal. This stage uses features such as the zero-crossing rate, spectral shape, the Short-Time Fourier Transform and Mel-frequency cepstral coefficients. Audio classification traditionally involves machine learning methods such as K-means, support vector machines (SVM), decision trees, etc. Deep neural networks can be applied both to the raw acoustic signal and to features extracted from it. The accuracy estimation stage applies quality assessment methods.
The data preprocessing stage includes calculating Mel-frequency cepstral coefficients (MFCC) for the given sound files. This approach unifies and simplifies the representation of the sound files in memory. The MFCC arrays are then fed to a convolutional neural network. At this stage it is important to configure the network for the most compact storage possible, because the network has to be deployed on platforms such as the Raspberry Pi.
The neural network is implemented with the Python library Keras, and data preprocessing is done with the Python library Librosa.
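
The framing step and one of the listed features, the zero-crossing rate, can be illustrated with a minimal sketch. It uses only the Python standard library and is not the authors' Librosa-based pipeline; the function names, the 8 kHz sample rate, the frame and hop lengths, and the synthetic 440 Hz tone are all assumptions chosen for illustration.

```python
import math


def split_into_frames(signal, frame_length, hop_length):
    """Split a sequence of samples into (possibly overlapping) frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    frames = []
    for start in range(0, len(signal) - frame_length + 1, hop_length):
        frames.append(signal[start:start + frame_length])
    return frames


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


# Hypothetical input: one second of a 440 Hz sine tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]

# Assumed framing: 50 ms frames (400 samples) with 50% overlap.
frames = split_into_frames(tone, frame_length=400, hop_length=200)
zcr_per_frame = [zero_crossing_rate(frame) for frame in frames]
```

For a pure tone the zero-crossing rate stays close to twice the tone frequency divided by the sample rate, so every frame here yields roughly the same value. Richer features such as MFCC reduce each frame to a short vector in a similar way, which is what keeps the network input compact.
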
The hyperparameters of the neural network were determined in computational experiments; the optimal combination is two Conv2D layers and three Dense layers.
The accuracy of the model's predictions for each class was examined using a confusion matrix, and the classification accuracy was found to be satisfactory.
It has been established that the convolutional neural network ensemble built to solve the problem of acoustic data classification is quite effective. The accuracy of the model on the test data is 95%. It should be noted that a network structure with only two blocks of convolution, activation and sub-sampling layers was sufficient to solve this problem. The system can be used in Smart Farming applications to filter out unnecessary sounds.
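
The per-class evaluation with a confusion matrix mentioned above can be sketched as follows. This is a standard-library illustration, not the authors' code; the class names and the toy labels are made up for the example and are not the paper's data or results.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count predictions: rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[index[true]][index[predicted]] += 1
    return matrix


def per_class_recall(matrix):
    """Share of each true class that was predicted correctly (diagonal / row sum)."""
    return [
        row[i] / sum(row) if sum(row) else 0.0
        for i, row in enumerate(matrix)
    ]


def overall_accuracy(matrix):
    """Correct predictions (diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Hypothetical sound classes and toy labels, for illustration only.
classes = ["bee", "bird", "tractor"]
y_true = ["bee", "bee", "bee", "bird", "bird", "tractor", "tractor", "tractor"]
y_pred = ["bee", "bee", "bird", "bird", "bird", "tractor", "bee", "tractor"]

matrix = confusion_matrix(y_true, y_pred, classes)
recall = per_class_recall(matrix)
accuracy = overall_accuracy(matrix)
```

Reading the matrix row by row shows which classes are confused with each other, which a single overall accuracy figure does not reveal.
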
