The use of voice recognition systems has grown considerably with advances in technology, which has allowed adversaries to gain unauthorized access to these systems by spoofing the identity of a target speaker. Existing supervised learning (SL)-based countermeasures do not yet provide a complete defense against newly evolving spoofing attacks. To tackle this problem, we explore self-supervised learning (SSL)-based frameworks. First, we implement widely used SSL frameworks for the task of identifying spoofed speech and report a considerable overall performance improvement over the SL state-of-the-art baseline. We then perform an attack-wise comparative analysis between the SL and SSL frameworks. While SSL performs better in most cases, there are certain attacks for which SL outperforms it. We therefore hypothesize that the complementary information learned by the two models can be exploited jointly for better performance. To do so, we first perform conventional weighted score fusion between the SL and best-performing SSL models, which reduces the equal error rate (EER) below that of both the state-of-the-art SL model and the best-performing SSL framework. We then propose an embedding fusion scheme that minimizes the distance between the distributions of the selected SL and SSL representations; the appropriate layers are selected through a comprehensive statistical analysis. The proposed fusion scheme outperforms score fusion, showing that SSL performance can be improved by effectively incorporating knowledge learned by the SL framework. The final EER on the ASVspoof 2019 logical access (LA) database is 0.177%, a significant improvement over our baseline. On the ASVspoof 2021 LA database, used as a blind evaluation set, the proposed embedding fusion scheme reduces the EER to 2.666%.
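The weighted score fusion step mentioned above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the grid search, and the convex-combination form of the fusion weight are assumptions, and the EER computation is a standard approximation.

```python
import numpy as np

def compute_eer(scores, labels):
    """Approximate equal error rate: operating point where the false-accept
    rate (FAR) and false-reject rate (FRR) cross.
    labels: 1 = bona fide, 0 = spoofed; higher score = more bona fide."""
    order = np.argsort(scores)
    labels = np.asarray(labels, dtype=float)[order]
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    # Sweeping the threshold upward: bona fide trials at or below it are
    # false rejects; spoofed trials above it are false accepts.
    frr = np.cumsum(labels) / n_pos
    far = 1.0 - np.cumsum(1.0 - labels) / n_neg
    idx = int(np.argmin(np.abs(far - frr)))
    return (far[idx] + frr[idx]) / 2.0

def fuse_scores(sl_scores, ssl_scores, w):
    """Convex combination of the SL and SSL system scores (w is a free weight)."""
    return w * np.asarray(sl_scores) + (1.0 - w) * np.asarray(ssl_scores)

def best_weight(sl_dev, ssl_dev, labels, grid=np.linspace(0.0, 1.0, 101)):
    """Hypothetical weight selection: grid search minimizing dev-set EER.
    The grid contains 0 and 1, so the fused system can never do worse on the
    dev set than the better of the two individual systems."""
    return min(grid, key=lambda w: compute_eer(fuse_scores(sl_dev, ssl_dev, w), labels))
```

In practice the weight would be tuned on a development set and then applied unchanged to the evaluation set, since tuning on evaluation scores would leak label information.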