Abstract

Glass is common in living environments, but detecting it is difficult because light of different colors is reflected and refracted differently depending on the surroundings; even humans sometimes fail to notice glass. Many existing glass detection methods rely on additional sensors, which are costly and make data collection difficult. This study addresses the problem of detecting glass regions in a single RGB image by concatenating contextual features from multiple receptive fields and proposing a new enhanced feature fusion algorithm. We first construct a contextual attention module that extracts backbone features through a self-attention approach. We then propose a ViT-based deep semantic segmentation architecture, MFT, which associates multi-level receptive-field features while retaining the feature information captured at each level. Experiments show that the resulting network, TGSNet, outperforms several state-of-the-art glass detection and transparent object detection methods on existing glass detection datasets.
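To make the two ideas in the abstract concrete, the sketch below shows (1) a contextual attention block that refines a backbone feature map with self-attention and (2) a fusion step that concatenates features from multiple receptive fields at a common resolution. The module names, dimensions, and fusion strategy are illustrative assumptions for exposition, not the authors' published TGSNet/MFT implementation.

```python
# Minimal PyTorch sketch: self-attention over backbone features plus
# multi-scale (multi-receptive-field) feature concatenation.
# All names and hyperparameters here are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContextualAttention(nn.Module):
    """Self-attention over the spatial positions of a feature map."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)            # (B, H*W, C)
        attended, _ = self.attn(tokens, tokens, tokens)  # global spatial context
        tokens = self.norm(tokens + attended)            # residual + norm
        return tokens.transpose(1, 2).reshape(b, c, h, w)


class MultiScaleFusion(nn.Module):
    """Project multi-level features, upsample to one size, and concatenate."""

    def __init__(self, in_channels: list[int], out_channels: int):
        super().__init__()
        self.proj = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels
        )
        self.fuse = nn.Conv2d(out_channels * len(in_channels),
                              out_channels, kernel_size=3, padding=1)

    def forward(self, feats: list[torch.Tensor]) -> torch.Tensor:
        target = feats[0].shape[-2:]  # spatial size of the finest level
        resized = [
            F.interpolate(p(f), size=target, mode="bilinear", align_corners=False)
            for p, f in zip(self.proj, feats)
        ]
        return self.fuse(torch.cat(resized, dim=1))


if __name__ == "__main__":
    # Toy multi-level features, e.g. strides 8/16/32 of a backbone.
    feats = [torch.randn(1, 256, 64, 64),
             torch.randn(1, 512, 32, 32),
             torch.randn(1, 1024, 16, 16)]
    feats = [ContextualAttention(f.shape[1])(f) for f in feats]
    fused = MultiScaleFusion([256, 512, 1024], 256)(feats)
    print(fused.shape)  # torch.Size([1, 256, 64, 64])
```

Concatenating the projected levels (rather than summing them) follows the abstract's stated goal of retaining the information captured at each receptive-field level before a final fusion convolution.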
