Stacked Convolutional Neural Networks Research Articles

In this study, a novel multi-objective speech enhancement algorithm is proposed. First, we construct a deep learning architecture based on a stacked and temporal convolutional neural network (STCNN). Second, the main log-power spectra (LPS) features are input into a stacked convolutional neural network (SCNN) to extract advanced abstract features. Third, an improved power function compression Mel-frequency cepstral coefficient (PC-MFCC) feature, which is more consistent with human hearing characteristics than a Mel-frequency cepstral coefficient (MFCC), is proposed. Then, a temporal convolutional neural network (TCNN) takes PC-MFCC and the features learned by the SCNN as input, and separately predicts a clean LPS, PC-MFCC and Ideal Ratio Mask (IRM). In the training phase, PC-MFCC constrains the LPS and IRM through a loss function to obtain the optimal network structure. Finally, IRM-based post-processing is applied to the estimated clean LPS and IRM; it adjusts the weight between the two, based on voice presence information, to synthesise the enhanced speech. A series of experiments shows that PC-MFCC is effective and complementary to LPS in speech enhancement tasks. The proposed STCNN architecture, which has good feature extraction and sequence modelling capabilities, achieves higher speech enhancement performance than the neural network models it is compared with. Additionally, IRM-based post-processing further enhances the listening quality of the reconstructed speech. Compared with the comparison algorithm, the proposed multi-objective speech enhancement algorithm further improves the quality and intelligibility of the enhanced speech.
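As a rough illustration of two of the prediction targets named in this abstract, the sketch below computes log-power spectra and an ideal ratio mask from clean-speech and noise magnitude spectrograms. The square-root form of the IRM is the commonly used definition and is an assumption here, since the abstract does not say which variant the authors use. The function names, the `eps` constant and the toy arrays are hypothetical and chosen only for illustration.

```python
import numpy as np

def log_power_spectrum(mag, eps=1e-10):
    """Log-power spectrum of a magnitude spectrogram (eps avoids log(0))."""
    return np.log(mag ** 2 + eps)

def ideal_ratio_mask(clean_mag, noise_mag, eps=1e-10):
    """Common square-root IRM: sqrt(S^2 / (S^2 + N^2)) per time-frequency bin.

    Assumed variant; the abstract does not specify the exact definition.
    """
    clean_pow = clean_mag ** 2
    noise_pow = noise_mag ** 2
    return np.sqrt(clean_pow / (clean_pow + noise_pow + eps))

# Toy magnitude spectrograms (frames x frequency bins), made up for the example.
clean = np.array([[3.0, 0.0], [1.0, 2.0]])
noise = np.array([[4.0, 1.0], [1.0, 0.0]])

lps = log_power_spectrum(clean)
irm = ideal_ratio_mask(clean, noise)
```

Each mask value lies between 0 (bin dominated by noise) and 1 (bin dominated by speech), which is what makes the mask usable as a soft indicator of voice presence in a post-processing step.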

Finger-vein biometrics has been extensively investigated for personal verification. A challenge is that finger-vein acquisition is affected by many factors, which results in many ambiguous regions in the finger-vein image. Generally, the separability between vein and background is poor in such regions. Despite recent advances in finger-vein pattern segmentation, current solutions still lack the robustness to extract finger-vein features from raw images because they do not take into account the complex spatial dependencies of vein patterns. This paper proposes a deep learning model that extracts vein features by combining a Convolutional Neural Network (CNN) model and a Long Short-Term Memory (LSTM) model. Firstly, we automatically assign labels based on a combination of known state-of-the-art handcrafted finger-vein image segmentation techniques, and generate various sequences for each labeled pixel along different directions. Secondly, several Stacked Convolutional Neural Network and Long Short-Term Memory (SCNN-LSTM) models are independently trained on the resulting sequences. The outputs of the various SCNN-LSTMs form a complementary and over-complete representation and are jointly fed into a Probabilistic Support Vector Machine (P-SVM) to predict the probability of each pixel being foreground (i.e., a vein pixel) given several sequences centered on it. Thirdly, we propose a supervised encoding scheme to extract the binary vein texture. A threshold is automatically computed by taking into account the maximal separation between the inter-class distance and the intra-class distance. In our approach, the CNN learns robust features for vein texture pattern representation and the LSTM stores the complex spatial dependencies of vein patterns, so the pixels in any region of a test image can be classified effectively. In addition, the supervised information is employed to encode the vein patterns, so the resulting encoding images contain more discriminating features. The experimental results on one public finger-vein database show that the proposed approach significantly improves the finger-vein verification accuracy.
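The step of generating sequences for a pixel along different directions can be pictured with the minimal sketch below. It samples single pixel intensities along a straight line through the pixel at a given angle. The abstract does not specify the sequence length, the set of directions, or whether each sequence element is a pixel or an image patch, so the function name `directional_sequence`, the `half_len` parameter, the four angles and the border clipping are all assumptions made for illustration.

```python
import numpy as np

def directional_sequence(image, row, col, angle_deg, half_len=3):
    """Sample pixel values along a line through (row, col) at angle_deg.

    Hypothetical helper: nearest-neighbour sampling, with coordinates
    clipped to the image border so the sequence always has the same length.
    """
    theta = np.deg2rad(angle_deg)
    offsets = np.arange(-half_len, half_len + 1)
    # Image rows grow downwards, so a positive angle moves up the image.
    rows = np.rint(row - offsets * np.sin(theta)).astype(int)
    cols = np.rint(col + offsets * np.cos(theta)).astype(int)
    rows = np.clip(rows, 0, image.shape[0] - 1)
    cols = np.clip(cols, 0, image.shape[1] - 1)
    return image[rows, cols]

# Toy 7x7 "image" whose values encode their own position (made up for the example).
image = np.arange(49, dtype=float).reshape(7, 7)

# One sequence per direction, all centered on pixel (3, 3).
sequences = {a: directional_sequence(image, 3, 3, a) for a in (0, 45, 90, 135)}
```

In a setup like the one described, each direction's sequences would feed a separate model, and the per-direction outputs for a pixel would then be combined by the final classifier.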

Articles published on Stacked Convolutional Neural Networks

Reconstructing Reflection Maps Using a Stacked-CNN for Mixed Reality Rendering.

A multi-objective learning speech enhancement algorithm based on IRM post-processing with joint estimation of SCNN and TCNN

Learning Attention Representation with a Multi-Scale CNN for Gear Fault Diagnosis under Different Working Conditions.

Visual Loop Closure Detection Based on Stacked Convolutional and Autoencoder Neural Networks

Finger-Vein Verification Based on LSTM Recurrent Neural Networks

An Ensemble Stacked Convolutional Neural Network Model for Environmental Event Sound Recognition

Context-aware stacked convolutional neural networks for classification of breast carcinomas in whole-slide histopathology images.
