Abstract

Hearing a species in a tropical rainforest is much easier than seeing them. If someone is in the forest, he might not be able to look around and see every type of bird and frog that are there but they can be heard. A forest ranger might know what to do in these situations and he/she might be an expert in recognizing the different type of insects and dangerous species that are out there in the forest but if a common person travels to a rain forest for an adventure, he might not even know how to recognize these species, let alone taking suitable action against them. In this work, a model is built that can take audio signal as input, perform intelligent signal processing for extracting features and patterns, and output which type of species is present in the audio signal. The model works end to end and can work on raw input and a pipeline is also created to perform all the preprocessing steps on the raw input. In this work, different types of neural network architectures based on Long Short Term Memory (LSTM) and Convolution Neural Network (CNN) are tested. Both are showing reliable performance, CNN shows an accuracy of 95.62% and Log Loss of 0.21 while LSTM shows an accuracy of 93.12% and Log Loss of 0.17. Based on these results, it is shown that CNN performs better than LSTM in terms of accuracy while LSTM performs better than CNN in terms of Log Loss. Further, both of these models are combined to achieve high accuracy and low Log Loss. A combination of both LSTM and CNN shows an accuracy of 97.12% and a Log Loss of 0.16.

Highlights

  • A long time ago, Rainforest formed after tropical temperatures dropped down drastically when the Atlantic Ocean had widened enough to provide a warm, moist climate to the Amazon basin [1]

  • A total of 4727 audio signals were recorded and later this data was manually checked by experts and for ~1100 of these audio files, they found that their device detected correct species and information for these ~1100 files were stored in a separate file which contains the name of the audio file, species which is present in the audio, maximum and minimum frequency of the audio and the time at which a species was heard in that audio file

  • 3.2.1 Long Short Term Memory This section uses LSTM analysis for the considered dataset. This model is created in such a way that it captures sequential info of frequency of the sound of a species (LSTM layer with 50 units) along and calculates average frequency over the sample (Global Average Pooling)

Read more

Summary

Introduction

A long time ago (i.e., more than 500 lakh years ago), Rainforest formed after tropical temperatures dropped down drastically when the Atlantic Ocean had widened enough to provide a warm, moist climate to the Amazon basin [1]. Rainforest includes a large variety of species which accounts for 50% of. People have tried to become experts in all kinds of species by going through the training program for a forest ranger. There are many species which are only found in tropical rainforest and few people know about them and so, even forest rangers are being deprived of this type of training. It takes years of real-world experience to become an expert but machine learning can help in automating this process and not let us worry about knowing all bits and pieces of every species. A model can be built that can act as a companion and recognize every species that is around us while roaming in a forest

Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call