CamVid Dataset Research Articles

In computer vision, the task of semantic segmentation is crucial for applications such as autonomous driving and intelligent surveillance. However, achieving a balance between real-time performance and segmentation accuracy remains a significant challenge. Although Fast-SCNN is favored for its efficiency and low computational complexity, it still faces difficulties when handling complex street scene images. To address this issue, this paper presents an improved Fast-SCNN, aiming to enhance the accuracy and efficiency of semantic segmentation by incorporating a novel attention mechanism and an enhanced feature extraction module. Firstly, the integrated SimAM (Simple, Parameter-Free Attention Module) increases the network’s sensitivity to critical regions of the image and effectively adjusts the feature space weights across channels. Additionally, the refined pyramid pooling module in the global feature extraction module captures a broader range of contextual information through refined pooling levels. During the feature fusion stage, the introduction of an enhanced DAB (Depthwise Asymmetric Bottleneck) block and SE (Squeeze-and-Excitation) attention optimizes the network’s ability to process multi-scale information. Furthermore, the classifier module is extended by incorporating deeper convolutions and more complex convolutional structures, leading to a further improvement in model performance. These enhancements significantly improve the model’s ability to capture details and overall segmentation performance. Experimental results demonstrate that the proposed method excels in processing complex street scene images, achieving a mean Intersection over Union (mIoU) of 71.7% and 69.4% on the Cityscapes and CamVid datasets, respectively, while maintaining inference speeds of 81.4 fps and 113.6 fps. These results indicate that the proposed model effectively improves segmentation quality in complex street scenes while ensuring real-time processing capabilities.
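The mIoU figures quoted above are averages of per-class Intersection over Union. The following is a minimal sketch of how that metric is commonly computed from a confusion matrix. It is not the evaluation code of the paper; the function names, the flat label lists, and the `ignore_index` convention are assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def confusion_matrix(
    pred: Sequence[int],
    target: Sequence[int],
    num_classes: int,
    ignore_index: Optional[int] = None,
) -> List[List[int]]:
    """Count pixels per (ground-truth class, predicted class) pair.

    `pred` and `target` are flattened label maps of equal length.
    Pixels whose ground-truth label equals `ignore_index` are skipped
    (an assumed convention for unlabeled pixels).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average IoU = TP / (TP + FP + FN) over classes that occur at all."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # row c: true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # column c: predicted c, true class differs
        denom = tp + fp + fn
        if denom > 0:                               # skip classes absent from both maps
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(mean_iou(confusion_matrix(pred, target, num_classes=3)))
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. Benchmarks usually accumulate the confusion matrix over the whole test set before averaging, rather than averaging per-image scores.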

Recent advances in deep learning have produced striking results in computer vision, which can now perform tasks that formerly required human vision and cognition. Classification, object identification, and semantic segmentation have all seen substantial architectural advances in the last few years, and semantic segmentation of both still images and video has improved markedly. In practical uses such as autonomous vehicles, however, semantic video segmentation remains difficult because of high performance requirements, the high computational cost of convolutional neural networks (CNNs), and the need for low latency. An effective machine-learning framework is developed to meet these performance and latency challenges. It applies deep learning architectures such as SegNet and FlowNet2.0 to the CamVid dataset to perform pixel-wise semantic segmentation of video while maintaining low latency. Because it takes advantage of both the SegNet and FlowNet architectures, it is well suited to real-world applications. A decision network determines whether an image frame should be processed by the segmentation network or by the optical flow network, based on the expected confidence score. Combined with adaptive key-frame scheduling, this decision process helps speed up inference. Using the ResNet50 SegNet model, a mean Intersection over Union (IoU) of 54.27% and an average of 19.57 frames per second (fps) were observed. Together with the decision network and adaptive key-frame sequencing, FlowNet2.0 increased the processing rate to 30.19 fps on GPU with a mean IoU of 47.65%. This result was obtained with the GPU utilized 47.65% of the time. These results indicate an increase in the speed of the video semantic segmentation network without sacrificing quality.
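The frame-routing idea described above can be sketched as follows. This is a hedged illustration, not the authors' implementation: the names `schedule_frames`, `segment`, `propagate`, and `confidence`, the threshold value, and the rule that a confidence at or above the threshold sends a frame down the optical flow path are all assumptions. The three callables stand in for the segmentation network, the flow-based mask propagation, and the decision network.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

Frame = TypeVar("Frame")
Mask = TypeVar("Mask")


def schedule_frames(
    frames: Iterable[Frame],
    segment: Callable[[Frame], Mask],
    propagate: Callable[[Frame, Mask, Frame], Mask],
    confidence: Callable[[Frame, Frame], float],
    threshold: float = 0.75,
) -> List[Tuple[str, Mask]]:
    """Route each frame to full segmentation or to flow-based propagation.

    The first frame is always segmented and becomes the key frame. For each
    later frame, `confidence(key_frame, frame)` estimates how reliable it
    would be to warp the key frame's mask; if it falls below `threshold`,
    the frame is segmented from scratch and becomes the new key frame.
    """
    results: List[Tuple[str, Mask]] = []
    key_frame = None
    key_mask = None
    for frame in frames:
        if key_frame is not None and confidence(key_frame, frame) >= threshold:
            # Cheap path: reuse the key frame's mask via optical flow.
            results.append(("flow", propagate(key_frame, key_mask, frame)))
        else:
            # Expensive path: run the segmentation network, refresh the key frame.
            key_mask = segment(frame)
            key_frame = frame
            results.append(("segment", key_mask))
    return results


# Toy stand-ins: frames are integers, and confidence decays with their distance.
paths = schedule_frames(
    frames=[0, 1, 2, 9, 10],
    segment=lambda f: f"seg({f})",
    propagate=lambda kf, km, f: f"warp({km}->{f})",
    confidence=lambda kf, f: 1.0 - abs(f - kf) / 10.0,
)
print([path for path, _ in paths])
```

With the toy inputs, frames 0 and 9 take the segmentation path and the rest are propagated. The key-frame interval adapts to the content because a new key frame is chosen only when confidence drops, which is where the speed-up in such a scheme comes from.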

Articles published on CamVid Dataset

Real-Time Semantic Segmentation Algorithm for Street Scenes Based on Attention Mechanism and Feature Fusion

Containment Control-Guided Boundary Information for Semantic Segmentation

BMSeNet: Multiscale Context Pyramid Pooling and Spatial Detail Enhancement Network for Real-Time Semantic Segmentation.

P2AT: Pyramid pooling axial transformer for real-time semantic segmentation

LACTNet: A Lightweight Real-Time Semantic Segmentation Network Based on an Aggregated Convolutional Neural Network and Transformer

Multi-scale full spike pattern for semantic segmentation

Fast-DSAGCN: Enhancing semantic segmentation with multifaceted attention mechanisms

SasWOT: Real-Time Semantic Segmentation Architecture Search WithOut Training

MCFNet: Multi-Attentional Class Feature Augmentation Network for Real-Time Scene Parsing

Improving Semantic Segmentation via Efficient Self-Training.

A new CNN-based semantic object segmentation for autonomous vehicles in urban traffic scenes

Attention based lightweight asymmetric network for real-time semantic segmentation

Video Semantic Segmentation Network with Low Latency Based on Deep Learning

Vehicle Detection and Speed Estimation Using Semantic Segmentation with Low Latency

Lightweight semantic segmentation network with configurable context and small object attention.

Lightweight multi-scale attention-guided network for real-time semantic segmentation

Block attention network: A lightweight deep network for real-time semantic segmentation of road scenes in resource-constrained devices

Based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes.

LBARNet: Lightweight bilateral asymmetric residual network for real-time semantic segmentation

MFAFNet: A Lightweight and Efficient Network with Multi-Level Feature Adaptive Fusion for Real-Time Semantic Segmentation.
