Spectrogram Features Research Articles

This paper addresses acoustic event classification, which is widely applied in safe-city, smart-home, and IoT systems and in the detection of industrial accidents. It proposes a way to improve the accuracy of classifiers without changing their structure or collecting additional data. The main data source for the experiments was the TUT Urban Acoustic Scenes 2018, Development Dataset. Accuracy is increased by using an FN-corrector, a linear two-stage classifier that transforms the feature space into a linearly separable space and then linearly separates one class from another. When a corrector is applied, the responses of the original classifier fall into four classes: positive (P), negative (N), false positive (FP), and false negative (FN). This makes it possible to train two types of correctors: the FP-corrector, which separates positive from false positive classifier responses, and the FN-corrector, which separates negative from false negative classifier responses. In the experiments, the VGGish convolutional neural network was used as the initial classifier. The audio signal is converted into a spectrogram and fed to the neural network, which forms a feature description of the spectrogram and performs the classification. As an example, two "confused" classes were selected to demonstrate the increase in classification accuracy. Using the feature descriptions of audio recordings of these classes, an FN-corrector was built, trained, and connected to the original classifier. The classifier's response and the feature description are passed to the corrector input. The corrector then maps the feature space into a new basis (a linearly separable space) and decides whether the original classifier makes a mistake on that feature vector. If the original classifier made a mistake, the corrector changes its answer to the opposite; otherwise the answer remains the same. The experiments showed a decrease in class confusion and, accordingly, an increase in the accuracy of the original classifier without changing its structure and without collecting an additional data set. The results can be used on IoT devices, which place significant limits on model size, and in domain adaptation problems, which are relevant in audio analytics.
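The correction step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say which transformation or which linear rule the FN-corrector uses, so the code below assumes per-feature standardization for the first stage and a mean-difference linear rule for the second. The names `fit_fn_corrector` and `corrected_answer`, and the toy feature vectors, are hypothetical.

```python
from statistics import mean, pstdev


def fit_fn_corrector(negatives, false_negatives):
    """Fit a toy two-stage linear corrector on feature vectors.

    negatives: features where the base classifier correctly answered "negative".
    false_negatives: features where it answered "negative" but was wrong.
    Returns a function that flags a feature vector as a likely false negative.
    """
    all_rows = negatives + false_negatives
    dims = range(len(all_rows[0]))

    # Stage 1 (assumption): standardize each feature over the training rows.
    mu = [mean(row[d] for row in all_rows) for d in dims]
    sd = [pstdev([row[d] for row in all_rows]) or 1.0 for d in dims]

    def transform(row):
        return [(row[d] - mu[d]) / sd[d] for d in dims]

    # Stage 2 (assumption): linear rule along the difference of class means,
    # with the threshold at the midpoint between the two projected means.
    mean_n = [mean(col) for col in zip(*map(transform, negatives))]
    mean_fn = [mean(col) for col in zip(*map(transform, false_negatives))]
    w = [a - b for a, b in zip(mean_fn, mean_n)]
    threshold = sum(wi * (a + b) / 2 for wi, a, b in zip(w, mean_fn, mean_n))

    def is_false_negative(features):
        score = sum(wi * xi for wi, xi in zip(w, transform(features)))
        return score > threshold

    return is_false_negative


def corrected_answer(base_answer, features, is_false_negative):
    """Flip a negative base answer when the corrector flags it as an error."""
    if base_answer is False and is_false_negative(features):
        return True
    return base_answer


if __name__ == "__main__":
    # Toy 2-D feature vectors; real inputs would be classifier embeddings.
    negatives = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2]]
    false_negatives = [[1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
    corrector = fit_fn_corrector(negatives, false_negatives)
    print(corrected_answer(False, [1.0, 1.0], corrector))
    print(corrected_answer(False, [0.1, 0.1], corrector))
```

The base classifier is left untouched; only its negative answers pass through the corrector, which matches the abstract's claim that accuracy improves without changing the classifier's structure.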


• A CNN-assisted, CWT-based spectrogram strategy is proposed for chlorophyll content detection.
• The first-order derivative was applied to capture detailed spectral information.
• CWT was applied to obtain effective features from spectral data.
• A CNN model was applied to explore deep features hidden in spectral data.

Visible and near-infrared spectroscopy is a nondestructive method for detecting the chlorophyll content of potato crops, and effective feature extraction is crucial to detection accuracy in the field. This study aimed to explore comprehensive features to improve the accuracy of chlorophyll content detection in potato leaves. A feature extraction method was proposed that combines a continuous wavelet transform (CWT)-based spectrogram with a convolutional neural network (CNN). In the experiments, spectral data in the range of 325–1075 nm were measured in the field. A total of 314 potato leaf samples were collected, and the chlorophyll content was determined in four growth periods. Spectral features in the spatial domain and the time–frequency domain were considered in data processing. First, features from the first-order derivative (FOD) and from Savitzky–Golay smoothing (SG) were compared in the spatial domain to select the preprocessing method. Second, in the time–frequency domain, CWT decomposes spectral data into a series of wavelet coefficient curves at various scales and wavelengths. The authors hypothesized that combining both the scales and the wavelengths of the wavelet coefficients could benefit the detection of chlorophyll content in potato crops. The spectrograms were therefore built by transforming 1D wavelet coefficient curves into 2D wavelet coefficient spectrograms. The CNN model was applied to explore potential and comprehensive features in the spectrograms. Finally, partial least squares (PLS) models were established to compare the detection capability of three feature types: spatial-domain features, wavelet coefficients from CWT, and spectrogram features from the CNN. The FOD features obtained a coefficient of determination in the prediction set ($R_P^2$) of 0.7856 and a root mean square error in the prediction set (RMSEP) of 3.9357, outperforming the SG features ($R_P^2$ = 0.7734, RMSEP = 4.0460). The wavelet coefficients from CWT obtained $R_P^2$ = 0.8293 and RMSEP = 3.5117. The PLS model with spectrogram features from the CNN performed best, achieving $R_P^2$ = 0.8749 and RMSEP = 3.0067. It captured deeper features of the spectral data than the spatial-domain features did. This research provides an effective method for leaf chlorophyll content detection in the precision management of potato crops.
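The step of turning a 1D spectrum into a 2D array of wavelet coefficients (one row per scale, one column per wavelength position) can be sketched as follows. This is only an illustration under assumptions: the abstract does not name the mother wavelet or the scales, so the code uses a Ricker (Mexican-hat) wavelet, truncates it at five scale widths, and treats samples beyond the signal edges as zero. The function names `ricker` and `cwt_scalogram` are hypothetical.

```python
import math


def ricker(offset, scale):
    """Ricker (Mexican-hat) wavelet value at a sample offset for one scale."""
    x = offset / scale
    return (1.0 - x * x) * math.exp(-x * x / 2.0) / math.sqrt(scale)


def cwt_scalogram(signal, scales):
    """Return a 2D list indexed [scale][position] of wavelet coefficients.

    Each row is the 1D coefficient curve for one scale; stacking the rows
    gives the 2D array that an image model could take as input.
    """
    n = len(signal)
    rows = []
    for scale in scales:
        half = int(5 * scale)  # assumption: truncate the wavelet support here
        row = []
        for i in range(n):
            acc = 0.0
            for k in range(-half, half + 1):
                j = i + k
                if 0 <= j < n:  # samples outside the signal count as zero
                    acc += signal[j] * ricker(k, scale)
            row.append(acc)
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Toy "spectrum": a single narrow peak in the middle of 21 samples.
    spectrum = [0.0] * 21
    spectrum[10] = 1.0
    scalogram = cwt_scalogram(spectrum, [1.0, 2.0, 4.0])
    print(len(scalogram), len(scalogram[0]))
```

A direct double loop is used here for readability; a library routine such as a convolution-based CWT would be the usual choice for real spectra of several hundred wavelengths.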


Articles published on Spectrogram Features

Acoustic scene classification with multi-temporal complex modulation spectrogram features and a convolutional LSTM network

Identifying the Acoustic Source via MFF-ResNet with Low Sample Complexity

Attention Based Convolutional Neural Network with Multi-frequency Resolution Feature for Environment Sound Classification.

Self-Supervised Learning of Audio Representations From Audio-Visual Data Using Spatial Alignment

An LSTM-Autoencoder Architecture for Anomaly Detection Applied on Compressors Audio Data

Time-frequency analysis-based deep interference classification for frequency hopping system

Identification of Shortwave Radio Communication Behavior Based on Autocorrelation Spectrogram Features

Automatic Classification of Bird Sounds: Using MFCC and Mel Spectrogram Features with Deep Learning

Applying the FN-corrector to improve the quality of audio event classification

Classification of asphyxia infant cry using hybrid speech features and deep learning models

Hand gesture recognition based improved multi-channels CNN architecture using EMG sensors

When sub-band features meet attention mechanism while knowledge distillation for sound classification

SSLNet: A network for cross-modal sound source localization in visual scenes

Predictive evaluation of spectrogram-based vehicle sound quality via data augmentation and explainable artificial Intelligence: Image color adjustment with brightness and contrast

A study on data augmentation in voice anti-spoofing

Automatic Grading of Student’s Presentation Skills based on PowerPoint Presentation and Audio

Deep Learning-Based Artistic Inheritance and Cultural Emotion Color Dissemination of Qin Opera.

Open set recognition of underwater acoustic targets based on GRU-CAE collaborative deep learning network

Deep learning assisted continuous wavelet transform-based spectrogram for the detection of chlorophyll content in potato leaves

ASERS-LSTM: Arabic Speech Emotion Recognition System Based on LSTM Model
