Abstract

The detection of rail surface defects (RSDs) plays a critical role in railway track maintenance. Traditional image processing methods are limited by their intricate design and insufficient robustness, which restricts their broader application. Recently, deep learning-based RSD detection methods have drawn considerable attention. However, these methods rely predominantly on Convolutional Neural Networks (CNNs) and neglect the hierarchical relationships among different features, which hinders a fine-grained representation of RSDs. To address these issues, we propose RailFormer, a novel Transformer-based network for the precise and efficient detection of RSDs. The encoder in RailFormer incorporates overlapped patch merging, efficient self-attention, and a Mix Feed-Forward Network (Mix-FFN), all designed to strengthen feature fusion from both global and local perspectives. In addition, a Criss-Cross attention module in the decoder facilitates RSD detection while keeping computational complexity manageable. In this study, the proposed RailFormer and four other models (SegFormer, Swin Transformer, ViT, and UNet) are trained and compared. Performance is evaluated on the widely used public RSDD dataset, which comprises Type-I and Type-II images, and on a customized RSD dataset. The training outcomes and visualization results show that RailFormer achieves the highest mean Intersection over Union (mIoU) and the best visual results on both the RSDD and the customized RSD datasets. These results demonstrate the superiority of RailFormer and underline its potential for future deployment in railway track inspection applications.
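For readers unfamiliar with the reported metric, the following is a minimal sketch of how mIoU can be computed for a single pair of predicted and ground-truth label masks. It assumes integer class labels (here 0 for background and 1 for defect, a hypothetical encoding) and skips classes absent from both masks; the function name `mean_iou` and the toy masks are illustrative, and the evaluation protocol actually used for RailFormer (for example, accumulation over a whole dataset) is not specified in this abstract.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in pred or target."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent from both masks; skip it
        inter = np.logical_and(pred_c, target_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious))

# Toy 2x3 masks (assumed encoding: 0 = background, 1 = defect)
target = np.array([[0, 0, 1], [0, 1, 1]])
pred = np.array([[0, 1, 1], [0, 1, 0]])
miou = mean_iou(pred, target, num_classes=2)
print(miou)  # 0.5: each class has intersection 2 and union 4
```

Averaging per-class IoU gives the small defect class the same weight as the much larger background class, which is why mIoU is commonly preferred over pixel accuracy for defect segmentation.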
