Multi-stream Network Research Articles

In recent years, detection techniques using computer vision and deep learning have shown promise in assessing driver distraction. This paper proposes a fusion network that combines a Spatial-Temporal Graph Convolutional Network (ST-GCN) and a hybrid convolutional network to integrate multimodal input data for recognizing distracted driver behavior. Specifically, to address the limitations of the ST-GCN method in modeling long-distance joint interaction features and its inadequate temporal feature extraction, we design the Spatially Symmetric Configuration Partitioning Graph Convolutional Network (SSCP-GCN) to model the relative motion of symmetric limb pairs. In addition, we use densely connected blocks to process multi-scale temporal information between consecutive frames, thereby enhancing the reuse of low-level features. Furthermore, a channel attention mechanism strengthens the expression of important temporal information. Mixed Convolution (MC), which combines 3D convolution with 2D convolution, cannot extract higher-order temporal information and is limited in modeling global dependencies. We compensate for the former with the Time Shift Module (TSM), which captures higher-order temporal semantic information without consuming additional computational resources. Additionally, the 3D Multi-Head Self-Attention mechanism (3D MHSA) is employed to integrate global spatial–temporal information of high-level features, avoiding the growth in model complexity caused by deeply stacked Convolutional Neural Networks (CNNs). Lastly, we introduce a multi-stream network framework that integrates driver posture and appearance features, combining multimodal inputs so that their complementary strengths improve model performance. Experimental results indicate that the accuracy of the network designed in this paper reaches 95.6% and 94.3% on the ASU and NTU-RGB+D datasets, respectively. The small size of the model makes practical application of the algorithm feasible.

Deep learning (DL) has shown promising results in molecular-based classification of glioma subtypes from MR images. DL requires a large amount of training data to achieve good generalization performance. Since brain tumor datasets are usually small, datasets from different hospitals need to be combined, but data privacy concerns often prevent hospitals from doing so. Federated learning (FL) has gained much attention lately because it trains a central DL model without requiring hospitals to share their data. We propose a novel 3D FL scheme for glioma and its molecular subtype classification. The scheme uses a slice-based DL classifier, EtFedDyn, an extension of FedDyn whose key differences are a focal loss cost function to tackle severe class imbalance in the datasets and a multi-stream network to exploit MRIs in different modalities. By combining EtFedDyn with domain mapping as pre-processing and 3D scan-based post-processing, the proposed scheme performs 3D brain scan-based classification on datasets from different dataset owners. To examine whether the FL scheme could replace the central learning (CL) one, we compare the classification performance of the proposed FL scheme with that of the corresponding CL scheme. Furthermore, detailed empirical analyses were conducted to examine the effect of using domain mapping, 3D scan-based post-processing, different cost functions and different FL schemes. Experiments were done on two case studies: classification of glioma subtypes (IDH mutation and wild-type on TCGA and US datasets in case A) and glioma grades (high/low grade glioma, HGG and LGG, on the MICCAI dataset in case B). The proposed FL scheme obtained good performance on the test sets, (85.46%, 75.56%) for IDH subtypes and (89.28%, 90.72%) for glioma LGG/HGG, all averaged over five runs. Compared with the corresponding CL scheme, the drop in test accuracy from the proposed FL scheme is small (-1.17%, -0.83%), indicating its good potential to replace the CL scheme. Furthermore, the empirical tests showed increased classification test accuracy from applying: domain mapping (0.4%, 1.85%) in case A; the focal loss function (1.66%, 3.25%) in case A and (1.19%, 1.85%) in case B; 3D post-processing (2.11%, 2.23%) in case A and (1.81%, 2.39%) in case B; and EtFedDyn over the FedAvg classifier (1.05%, 1.55%) in case A and (1.23%, 1.81%) in case B, with fast convergence. All of these contributed to the overall performance of the proposed FL scheme. The proposed FL scheme is shown to be effective in predicting glioma and its subtypes from MR images in the test sets, with great potential to replace conventional CL approaches for training deep networks. This could help hospitals maintain their data privacy while using a federated trained classifier whose performance is close to that of a centrally trained one. Further detailed experiments have shown that different parts of the proposed 3D FL scheme, such as domain mapping (which makes datasets more uniform) and post-processing (scan-based classification), are essential.
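
To make the role of the focal loss in handling class imbalance concrete, here is a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). It is an illustration under assumptions, not the EtFedDyn code: the abstract does not give the values used, so the defaults `gamma=2.0` and `alpha=0.25` and the function name `focal_loss` are hypothetical.

```python
import math


def focal_loss(prob_positive: float, label: int,
               gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Binary focal loss for one sample.

    prob_positive: predicted probability of the positive class.
    label: ground-truth class, 1 (positive) or 0 (negative).
    gamma, alpha: hypothetical defaults; the abstract does not state them.
    """
    # p_t is the probability the model assigns to the true class.
    p_t = prob_positive if label == 1 else 1.0 - prob_positive
    alpha_t = alpha if label == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)  # guard against log(0)
    # (1 - p_t) ** gamma shrinks the loss of well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9, 1)  # confident, correct prediction
hard = focal_loss(0.1, 1)  # confident, wrong prediction
```

With `gamma=0` the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy; larger `gamma` values shift the training signal toward hard, often minority-class, samples.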

Articles published on Multi-stream Network

Appearance-posture fusion network for distracted driving behavior recognition

Multi-stream network with key frame sampling for human action recognition

A multi-stream network for retrosynthesis prediction

Graph-Based Progressive Fusion Network for Multi-Modality Vehicle Re-Identification

FPRnet: A lightweight multi-domain multi-stream network for complex horizontal oil-water two-phase flow pattern recognition

Progressive Moire Removal and Texture Complementation for Image Demoireing

A novel federated deep learning scheme for glioma and its subtype classification.

Skeleton-Based Multifeatures and Multistream Network for Real-Time Action Recognition

Pose-Driven Realistic 2-D Motion Synthesis.

Automatic translation of sign language with multi-stream 3D CNN and generation of artificial depth maps

DISNet: A sequential learning framework to handle occlusion in human action recognition with video acquisition sensors

A Multi-Stream Sequence Learning Framework for Human Interaction Recognition

End-to-end driving model based on deep learning and attention mechanism

Pose-Guided Graph Convolutional Networks for Skeleton-Based Action Recognition

Multi-Stream Deep Neural Network for Diabetic Retinopathy Severity Classification Under a Boosting Framework

Hidden Markov Model-Based Video Recognition for Sports

Multi‐stream densely connected network for semantic segmentation

A Lightweight Hierarchical Model with Frame-Level Joints Adaptive Graph Convolution for Skeleton-Based Action Recognition

Monocular Depth Estimation using Integrated Model with Multi-task Learning and Multi-stream

Omnidirectional Image Quality Assessment by Distortion Discrimination Assisted Multi-Stream Network
