Abstract
Mineral recognition plays a pivotal role in advancing geological survey and exploration methods and is a cornerstone of contemporary geoscience research. Recently, Transformer-based neural networks have outperformed ConvNets on many vision tasks and have become increasingly prominent in vision models. However, adapting Transformer models to mineral photograph recognition presents two significant challenges. First, mineral photograph recognition relies heavily on low-level features such as color, texture, and edges, which Transformers are not intrinsically optimized to capture. Second, small-scale objects within mineral images are often difficult to recognize accurately. To tackle these challenges, we introduce SwinMin, a model designed specifically for mineral photograph recognition. SwinMin incorporates convolutional information into Transformer sequences, thereby enriching the global representation with finer details. Furthermore, we propose a dynamic feature fusion module that exploits multi-scale contexts to produce a more comprehensive representation. Extensive experiments on mineral photograph datasets demonstrate that SwinMin achieves state-of-the-art performance compared to existing mineral image recognition methods, underlining its potential for reliable and precise mineral image identification.
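The abstract does not specify how the dynamic feature fusion module is built, so the following is only an illustrative sketch of one common way to fuse multi-scale features with input-dependent weights. It assumes each scale has already been reduced to a feature vector of the same length; the function names `softmax` and `dynamic_fusion` and the gating vector `gate_weights` are hypothetical and are not taken from SwinMin.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_fusion(scale_features, gate_weights):
    """Fuse same-length feature vectors from several scales.

    scale_features: list of feature vectors, one per scale.
    gate_weights:   hypothetical gating vector (an assumption, not from
                    the paper); its dot product with each scale's
                    features gives that scale's score.
    Returns the fused vector and the per-scale weights.
    """
    dim = len(gate_weights)
    if any(len(f) != dim for f in scale_features):
        raise ValueError("all feature vectors must match gate_weights in length")

    # One score per scale, computed from the features themselves, so the
    # weighting changes with the input ("dynamic").
    logits = [sum(w * x for w, x in zip(gate_weights, f)) for f in scale_features]
    alphas = softmax(logits)

    # Weighted sum of the per-scale features, element by element.
    fused = [
        sum(a * f[i] for a, f in zip(alphas, scale_features))
        for i in range(dim)
    ]
    return fused, alphas


if __name__ == "__main__":
    coarse = [1.0, 2.0]  # e.g. pooled features from a low-resolution stage
    fine = [3.0, 4.0]    # e.g. pooled features from a high-resolution stage
    fused, alphas = dynamic_fusion([coarse, fine], gate_weights=[0.5, -0.25])
    print(fused, alphas)
```

In a trained network the gating vector would be learned, and the fusion would typically operate on feature maps rather than pooled vectors; the sketch only shows the weighting idea.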