Abstract

Convolutional Neural Networks (CNN) have been applied to diverse machine learning tasks for different modalities of raw data in an end-to-end fashion. In the audio domain, raw waveform-based approaches have been explored to directly learn hierarchical characteristics of audio. However, the majority of previous studies have limited their model capacity by taking a frame-level structure similar to short-time Fourier transforms. We previously proposed a CNN architecture that learns representations using sample-level filters beyond typical frame-level input representations. The architecture showed performance comparable to that of spectrogram-based CNN models in music auto-tagging. In this paper, we extend the previous work in three ways. First, considering that the sample-level model requires a much longer training time, we progressively downsample the input signals and examine how this affects performance. Second, we extend the model using a multi-level and multi-scale feature aggregation technique and subsequently conduct transfer learning for several music classification tasks. Finally, we visualize the filters learned by the sample-level CNN in each layer to identify hierarchically learned features and show that they are sensitive to log-scaled frequency.
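The sample-level design described above can be understood through simple convolution arithmetic: stacking small strided 1-D filters grows the receptive field multiplicatively per layer. The sketch below (a minimal illustration, assuming a filter length and stride of 3 and a 59,049-sample input, one configuration used in the SampleCNN line of work) shows how the input collapses to a single unit while the receptive field grows to cover the whole excerpt:

```python
def conv1d_out_len(n, filter_len, stride):
    """Output length of a 'valid' (unpadded) strided 1-D convolution."""
    return (n - filter_len) // stride + 1

def receptive_field(num_layers, filter_len=3, stride=3):
    """Receptive field (in samples) of one output unit after stacking
    num_layers convolution layers with the given filter length and stride."""
    rf = 1
    for _ in range(num_layers):
        rf = rf * stride + (filter_len - stride)
    return rf

# A 59,049-sample waveform shrinks by a factor of 3 per layer and
# collapses to a single unit after 10 layers of filter length 3, stride 3.
n = 59049
for _ in range(10):
    n = conv1d_out_len(n, 3, 3)
print(n)                    # 1
print(receptive_field(10))  # 59049: one top-level unit sees the whole input
```

This is the sense in which sample-level filters go "beyond" frame-level ones: instead of a single long first-layer filter acting like an STFT window, many tiny filters build up the same temporal coverage hierarchically.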

Highlights

  • Convolutional Neural Networks (CNN) have been applied to diverse machine learning tasks

  • We evaluate the extended model in transfer learning settings where the features extracted from SampleCNN can be used for three different datasets in music genre classification and music auto-tagging

  • An interesting finding from the results of the frame-level raw waveform model is that when the filter length is larger than the stride, the accuracy is slightly lower than that of models whose filter length equals the stride
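The frame-level comparison in the last highlight rests on simple framing arithmetic: when the filter is longer than the stride, adjacent frames overlap, much like an STFT with a hop size smaller than the window. A short sketch (with illustrative values, not the paper's exact settings):

```python
def num_frames(n_samples, filter_len, stride):
    """Number of frames a strided frame-level layer produces
    under 'valid' framing (no padding)."""
    return (n_samples - filter_len) // stride + 1

# Non-overlapping frames: filter length equals the stride.
print(num_frames(16000, 512, 512))   # 31
# Overlapping frames: filter longer than the stride.
print(num_frames(16000, 1024, 512))  # 30
```

Both settings yield a similar number of frames; the difference the highlight points to lies in what each frame sees, since the overlapping filter spans samples shared with its neighbors.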


Introduction

Convolutional Neural Networks (CNN) have been applied to diverse machine learning tasks. The benefit of using CNN is that the model can learn hierarchical levels of features from high-dimensional raw data. While word-level embeddings play a vital role in language processing [2], they have the limitation that the embedding space is learned separately from the word-level model. To handle this problem, character-level language models that learn from the bottom-level raw data (e.g., alphabet characters) were proposed and shown to yield results comparable to those of word-level learning models [3,4].

