DRIVE Dataset Research Articles

Driver monitoring systems (DMS) are crucial to driver and vehicle safety in autonomous driving systems (ADS). Within a DMS, a major factor in that safety is the classification of driver distractions or activities, which conveys meaningful information to the ADS in real time. The task is challenging because human driving behavior is unpredictable. This paper proposes CBAM VGG16, a deep learning architecture that embeds a convolutional block attention module (CBAM) in the Visual Geometry Group network VGG16 to improve driver distraction classification. The architecture is a hybrid of CBAM layers and conventional VGG16 layers; adding CBAM to VGG16 enhances the model’s feature extraction capacity and improves the classification results. To validate the architecture, we tested it on camera 1 and camera 2 images from version 2 of the American University in Cairo (AUC) distracted driver dataset (AUCD2). The proposed CBAM VGG16 achieved 98.65% classification accuracy on the camera 1 images and 97.85% on the camera 2 images. We also compared CBAM VGG16 with the DenseNet121, Xception, MobileNetV2, InceptionV3, and VGG16 architectures in terms of accuracy, loss, precision, recall, F1 score, and confusion matrix. Compared with the conventional VGG16 classification model, the proposed CBAM VGG16 improves classification by 3.7% on AUCD2 camera 1 images and by 5% on camera 2 images.
We also tested the architecture with different hyperparameter values and estimated the values that give the best driver distraction classification. We further validated how data augmentation, through increased data diversity, affects overfitting in the CBAM VGG16 model. Grad-CAM visualizations show that the VGG16 architecture without CBAM layers is less attentive to the essential parts of the driver distraction images. Furthermore, we evaluated the architecture with respect to the number of model parameters, model size, input image resolution, cross-validation, Bayesian search optimization, and different CBAM layers. The results indicate that the CBAM layers enhance the classification performance of the conventional VGG16 architecture and that the proposed architecture outperforms state-of-the-art deep learning architectures.
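To make the attention step concrete, the following is a minimal sketch of a CBAM block as it is commonly defined: channel attention from globally average-pooled and max-pooled features passed through a shared two-layer MLP, followed by spatial attention from a convolution over the channel-wise average and max maps. It is not the authors' implementation. The abstract does not say where CBAM sits inside VGG16, which reduction ratio or kernel size is used, or how the layers are trained, so the ratio `r = 2`, the 7x7 kernel, the omitted bias terms, and the random weights below are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap, w1, w2):
    """Return one weight in (0, 1) per channel of fmap, shaped [C][H][W]."""
    n = len(fmap[0]) * len(fmap[0][0])
    avg = [sum(map(sum, ch)) / n for ch in fmap]  # global average pooling
    mx = [max(map(max, ch)) for ch in fmap]       # global max pooling

    def mlp(v):
        # Shared two-layer MLP (biases omitted): C -> C/r with ReLU -> C.
        hidden = [max(0.0, sum(w * x for w, x in zip(row, v))) for row in w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    return [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(mx))]


def spatial_attention(fmap, kernel):
    """Return an [H][W] map of weights in (0, 1).

    kernel is [2][k][k]: one k x k filter for the channel-wise average map
    and one for the channel-wise max map. Borders are zero-padded.
    """
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    avg = [[sum(fmap[c][i][j] for c in range(C)) / C for j in range(W)]
           for i in range(H)]
    mx = [[max(fmap[c][i][j] for c in range(C)) for j in range(W)]
          for i in range(H)]
    k = len(kernel[0])
    pad = k // 2
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            s = 0.0
            for p, plane in enumerate((avg, mx)):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < H and 0 <= x < W:
                            s += kernel[p][di][dj] * plane[y][x]
            out[i][j] = sigmoid(s)
    return out


def cbam(fmap, w1, w2, kernel):
    """Apply channel attention, then spatial attention, to fmap [C][H][W]."""
    ca = channel_attention(fmap, w1, w2)
    refined = [[[ca[c] * v for v in row] for row in ch]
               for c, ch in enumerate(fmap)]
    sa = spatial_attention(refined, kernel)
    return [[[sa[i][j] * v for j, v in enumerate(row)]
             for i, row in enumerate(ch)]
            for ch in refined]


# Illustrative sizes and random (untrained) weights; all values are assumptions.
random.seed(0)
C, H, W, r, k = 4, 5, 5, 2, 7
fmap = [[[random.uniform(-1, 1) for _ in range(W)] for _ in range(H)]
        for _ in range(C)]
w1 = [[random.uniform(-0.5, 0.5) for _ in range(C)] for _ in range(C // r)]
w2 = [[random.uniform(-0.5, 0.5) for _ in range(C // r)] for _ in range(C)]
kernel = [[[random.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(k)]
          for _ in range(2)]
out = cbam(fmap, w1, w2, kernel)
```

The output has the same shape as the input, and because both attention maps lie in (0, 1), each activation is only scaled down, never amplified. In a real network this block would be written with a deep learning framework and its weights learned jointly with the VGG16 layers.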

Background and objective: Computer-based biomedical image segmentation plays a crucial role in planning assisted diagnostics and therapy. However, because segmentation targets vary in size and have irregular shapes, constructing an effective medical image segmentation structure remains a challenge. Recently, hybrid architectures based on convolutional neural networks (CNNs) and transformers have been proposed. However, most current backbones directly replace one or all convolutional layers with transformer blocks, regardless of the semantic gap between features. A critical challenge is therefore how to eliminate the semantic gap effectively while combining global and local information.
Methods: To address this challenge, we propose a novel structure, called BiU-Net, which integrates CNNs and transformers with a two-stage fusion strategy. In the first stage, Single-Scale Fusion (SSF), the encoding layers of the CNNs and transformers are coupled, with both having the same feature map size. The SSF stage aims to reconstruct local features based on CNNs and long-range information based on transformers in each encoding block. In the second stage, Multi-Scale Fusion (MSF), BiU-Net interacts with multi-scale features from various encoding layers to eliminate the semantic gap between deep and shallow layers. Furthermore, a Context-Aware Block (CAB) is embedded in the bottleneck to reinforce multi-scale features in the decoder.
Results: Experiments were conducted on four public datasets. On the BUSI dataset, BiU-Net achieved a Dice coefficient (Dice) of 85.50%, an intersection over union (IoU) of 76.73%, and an accuracy (ACC) of 97.23%. Compared with the state-of-the-art method, BiU-Net improves Dice by 1.17%. On the MoNuSeg dataset, the proposed method attained the highest scores, reaching 80.27% Dice and 67.22% IoU. BiU-Net achieves 95.33% and 81.22% Dice on the PH2 and DRIVE datasets, respectively.
Conclusions: The experimental results show that BiU-Net outperforms existing state-of-the-art methods on four publicly available biomedical datasets. Owing to its powerful multi-scale feature extraction ability, BiU-Net is a versatile segmentation framework for various types of medical images. The source code is released at https://github.com/ZYLandy/BiU-Net.
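The Dice, IoU, and accuracy figures reported above follow standard definitions for binary segmentation masks. The sketch below shows those definitions on a toy pair of flattened masks; the function name `segmentation_scores` and the sample masks are hypothetical, and the abstract does not state whether the authors average scores per image or over the whole dataset.

```python
def segmentation_scores(pred, target):
    """Compute Dice, IoU, and pixel accuracy for two flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, t in zip(pred, target) if p and t)
    fp = sum(1 for p, t in zip(pred, target) if p and not t)
    fn = sum(1 for p, t in zip(pred, target) if not p and t)
    tn = sum(1 for p, t in zip(pred, target) if not p and not t)
    # Convention assumed here: two empty masks count as a perfect match.
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 1.0
    acc = (tp + tn) / len(pred) if pred else 1.0
    return dice, iou, acc


# Toy masks (illustrative only): 2 true positives, 1 false positive,
# 1 false negative, 2 true negatives.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
dice, iou, acc = segmentation_scores(pred, target)
print(f"Dice={dice:.4f} IoU={iou:.4f} ACC={acc:.4f}")
# prints "Dice=0.6667 IoU=0.5000 ACC=0.6667"
```

Dice and IoU are related by Dice = 2·IoU / (1 + IoU) for a single mask pair, so they rank individual predictions the same way. Accuracy also counts true negatives, so it can stay high on images where the target occupies few pixels, such as thin retinal vessels.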

Articles published on DRIVE Dataset

Retinal artery/vein vessel segmentation and measurement for hypertension based on multi-level edge-guided network

TCDDU-Net: combining transformer and convolutional dual-path decoding U-Net for retinal vessel segmentation

Detection of optic disc in human retinal images utilizing the Bitterling Fish Optimization (BFO) algorithm

A multi-scale feature extraction and fusion-based model for retinal vessel segmentation in fundus images.

Diabetic retinopathy data augmentation and vessel segmentation through deep learning based three fully convolution neural networks

Thin vessel segmentation in fundus images using attention UNet and modified Frangi filtering

Spatial attention U-Net model with Harris hawks optimization for retinal blood vessel and optic disc segmentation in fundus images.

Assessment of retinal blood vessel segmentation using U-Net model: A deep learning approach

GKE-TUNet: Geometry-Knowledge Embedded TransUNet Model for Retinal Vessel Segmentation Considering Anatomical Topology.

Dual-branch Transformer for semi-supervised medical image segmentation.

Physics-informed deep generative learning for quantitative assessment of the retina

Retina Blood Vessels Segmentation and Classification with the Multi-featured Approach.

DEAF-Net: Detail-Enhanced Attention Feature Fusion Network for Retinal Vessel Segmentation.

CBAM VGG16: An efficient driver distraction classification using CBAM embedded VGG16 architecture

A Novel Single-Sample Retinal Vessel Segmentation Method Based on Grey Relational Analysis.

TLTNet: A novel transscale cascade layered transformer network for enhanced retinal blood vessel segmentation

RAGE-Net: Enhanced retinal vessel segmentation U-shaped network using Gabor convolution

BiU-net: A dual-branch structure based on two-stage fusion strategy for biomedical image segmentation

TD Swin-UNet: Texture-Driven Swin-UNet with Enhanced Boundary-Wise Perception for Retinal Vessel Segmentation.

PSO-HRVSO: Segmentation of Retinal Vessels Through Homomorphic Filtering Enhanced by PSO Optimization
