Cityscapes Dataset Research Articles

Semantic segmentation often suffers from class imbalance, where the label ratio for each class in the dataset is not uniform. Recent studies have addressed class imbalance in semantic segmentation by leveraging the neural collapse phenomenon in conjunction with an Equiangular Tight Frame (ETF). While the use of an ETF enhances the discriminability of minority classes, class correlation is another crucial factor that must be taken into account. However, balancing class correlation and discrimination through neural collapse remains challenging, as these properties inherently conflict with one another. Moreover, this balance is established during the training stage, resulting in a fixed classifier, and there is no guarantee that this classifier will perform consistently well on different input images. To address this problem, we propose an Equiangular Tight Frame Transformer (ETFT), a transformer-based model that jointly processes the features and classifier using the ETF structure, and dynamically generates the classifier as a function of the input for imbalanced semantic segmentation. Specifically, the classifier initialized with the ETF structure is jointly processed with the input patch tokens during the attention process. As a result, the transformed patch tokens, aided by the ETF structure, achieve discriminability between classes while preserving contextual correlation. The classifier, initially structured as an ETF, is adjusted through the attention mechanism to incorporate the correlation information. Furthermore, the learned classifier is combined with the fixed ETF classifier, leveraging the advantages of both. Extensive experiments demonstrate that the proposed method outperforms state-of-the-art methods for imbalanced semantic segmentation on both the ADE20K and Cityscapes datasets.
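
For readers unfamiliar with the ETF structure mentioned in this abstract, the following is a minimal sketch of the standard simplex ETF construction used in the neural collapse literature. It is not the ETFT authors' code: the function name `simplex_etf`, the feature dimension of 64, and the random orthonormal basis are assumptions made for illustration. The class count of 19 matches the number of classes commonly evaluated on Cityscapes.

```python
import numpy as np


def simplex_etf(num_classes: int, feat_dim: int, seed: int = 0) -> np.ndarray:
    """Return a (feat_dim, num_classes) matrix whose columns form a simplex ETF.

    Hypothetical helper for illustration; requires feat_dim >= num_classes.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    if feat_dim < num_classes:
        raise ValueError("feat_dim must be at least num_classes")

    rng = np.random.default_rng(seed)
    # Reduced QR of a random Gaussian matrix gives orthonormal columns (U^T U = I).
    u, _ = np.linalg.qr(rng.standard_normal((feat_dim, num_classes)))

    k = num_classes
    # Centering matrix I - (1/K) 1 1^T removes the mean direction.
    center = np.eye(k) - np.ones((k, k)) / k
    # Scaling by sqrt(K / (K - 1)) makes every column unit length.
    return np.sqrt(k / (k - 1)) * (u @ center)


if __name__ == "__main__":
    w = simplex_etf(num_classes=19, feat_dim=64)
    gram = w.T @ w
    # Diagonal entries are the squared norms; off-diagonal entries are the
    # pairwise cosines, which all equal -1 / (K - 1) for a simplex ETF.
    print(np.round(gram[:3, :3], 4))
```

Every pair of class vectors has the same cosine similarity, -1/(K-1), which is the most widely separated arrangement of K unit vectors. That equal spacing is what makes a fixed ETF classifier treat rare and frequent classes alike, and it is also why the abstract notes that such a classifier cannot by itself encode correlations between classes.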

Semantic segmentation of urban street scenes has attracted much attention in the field of autonomous driving, as it not only helps vehicles perceive the environment in real time but also significantly improves the decision-making ability of autonomous driving systems. However, most current methods based on Convolutional Neural Networks (CNNs) encode the input image to a low resolution and then try to recover the high resolution, which leads to problems such as loss of spatial information, accumulation of errors, and difficulty in dealing with large scale variations. To address these problems, in this paper, we propose a new semantic segmentation network (HRDLNet) for urban street scene images, which improves segmentation accuracy by maintaining a high-resolution representation of the image throughout the network. Specifically, we propose a feature extraction module (FHR) with high-resolution representation, which handles multi-scale targets and high-resolution image information by efficiently fusing high-resolution information and multi-scale features. Second, we design a multi-scale feature extraction enhancement (MFE) module, which significantly expands the receptive field of the network, thus enhancing its ability to capture correlations between image details and global contextual information. In addition, we introduce a dual-attention mechanism module (CSD), which dynamically adjusts the network to more accurately capture subtle features and rich semantic information in images. We trained and evaluated HRDLNet on the Cityscapes Dataset and the PASCAL VOC 2012 Augmented Dataset, and verified the model’s performance in urban street scene image segmentation. The advantages of the proposed HRDLNet for semantic segmentation of urban street scenes are also verified by comparing it with state-of-the-art methods.

Articles published on Cityscapes Dataset

ETFT: Equiangular Tight Frame Transformer for Imbalanced Semantic Segmentation

Cross-domain autonomous driving visual segmentation based on enhanced target data learning

Road surface semantic segmentation for autonomous driving

MEDANet: More Efficient Dual Attention Network for Scene Segmentation

Enhancing the utilization of uncertain pixels in semi-supervised semantic segmentation

Lightweight multi-scale feature dense cascade neural network for scene understanding of intelligent autonomous platform

Style Optimization Networks for real-time semantic segmentation of rainy and foggy weather

An Active Learning Semantic Segmentation Model Based on an Improved Double Deep Q-Network

Generative Denoise Distillation: Simple stochastic noises induce efficient knowledge transfer for dense prediction

HRDLNet: a semantic segmentation network with high resolution representation for urban street view images

Attention-based fusion network for RGB-D semantic segmentation

Improving real-time object detection in Internet-of-Things smart city traffic with YOLOv8-DSAF method

DECNet: Dense embedding contrast for unsupervised semantic segmentation

Feature boosting with efficient attention for scene parsing

GC-YOLOv9: Innovative smart city traffic monitoring solution

Road Anomaly Detection with Unknown Scenes Using DifferNet-Based Automatic Labeling Segmentation

Evaluating the Effectiveness of Panoptic Segmentation Through Comparative Analysis

LACTNet: A Lightweight Real-Time Semantic Segmentation Network Based on an Aggregated Convolutional Neural Network and Transformer

Depth-Aware Panoptic Segmentation

IIMT-net: Poly-1 weights balanced multi-task network for semantic segmentation and depth estimation using interactive information
