Abstract

When performing visual inspections of bridges, experts collect photographs of defects to assess the overall condition of the structure and schedule maintenance. Such inspections are labor‐intensive, and computer vision‐based systems are being investigated as automated tools to assist the experts. An important challenge, however, remains: ensuring that the data are representative, given the sheer size, complexity, and variety of the bridge components and defects being reported. Here, we perform multi‐label classification on a dataset (the SOFIA dataset) that consists of 139,455 images of types of bridge components and defects, of which 53,805 are labeled (13 classes for each type). The dataset, which exhibits class imbalance and noisy labels, is processed using visual embeddings computed with unsupervised deep learning methods. A combination of class‐balancing techniques is investigated on the state‐of‐the‐art Vision Transformer model. Interclass relations, which determine whether a class of defect should be part of a class of bridge component, are implemented with an additional filtering step. The whole method is also applied to the CODEBRIM benchmark dataset, where it yields an improved accuracy score.
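
The interclass filtering step mentioned in the abstract could look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: the class names, the compatibility matrix ALLOWED, the function filter_defects, and the 0.5 threshold are all hypothetical, since the abstract does not list the SOFIA classes or specify how the filter is applied. The idea shown is simply that a defect score is kept only if at least one predicted bridge component is compatible with that defect.

```python
import numpy as np

# Hypothetical label names; the actual SOFIA classes are not given in the abstract.
COMPONENTS = ["deck", "pier", "bearing"]
DEFECTS = ["crack", "corrosion", "spalling"]

# Hypothetical compatibility matrix: ALLOWED[i, j] is True when defect j
# may occur on component i. The values are made up for illustration only.
ALLOWED = np.array([
    [True,  False, True],   # deck
    [True,  False, True],   # pier
    [False, True,  False],  # bearing
])


def filter_defects(component_probs, defect_probs, threshold=0.5):
    """Zero out defect scores that no predicted component allows."""
    component_probs = np.asarray(component_probs, dtype=float)
    defect_probs = np.asarray(defect_probs, dtype=float)

    # Components whose score reaches the (assumed) decision threshold.
    predicted = component_probs >= threshold
    if not predicted.any():
        # No component predicted: leave the defect scores untouched.
        return defect_probs.copy()

    # A defect is permitted if at least one predicted component allows it.
    permitted = ALLOWED[predicted].any(axis=0)
    return np.where(permitted, defect_probs, 0.0)


filtered = filter_defects([0.9, 0.1, 0.2], [0.8, 0.7, 0.3])
print(dict(zip(DEFECTS, filtered.tolist())))
```

In this toy case only "deck" is predicted, so the "corrosion" score is suppressed while the "crack" and "spalling" scores pass through unchanged. A softer variant could down-weight incompatible defects instead of zeroing them.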
