Abstract
Vision transformers (ViTs) are evolving rapidly and are widely used in computer vision. However, high-performance ViTs are computationally expensive, which limits their further development in the vision field. In this article, a novel evolutionary dual-stream transformer (E-DST) model is proposed to reduce this computational demand. A hybrid attention mechanism is proposed for the DST model, which uses a dual-branch structure to fuse convolutional and transformer features. Combining the features learned by the transformer and the convolution reduces the computational resources the model requires. In addition, an evolutionary optimizer is proposed that uses the search ability of evolutionary algorithms to optimize the parameters of the transformer model. The convergence of the evolutionary optimizer is proved in this article. The proposed E-DST model is compared experimentally with a variety of classic models and their variants on three datasets, and the evolutionary optimizer is shown to generalize to convolutional and recurrent neural networks. The experimental results show that the E-DST model effectively reduces computational resource usage and that the evolutionary optimizer can solve large-scale optimization problems. In conclusion, the proposed method is feasible and effective.
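To make the idea of evolutionary parameter optimization concrete, the following is a minimal sketch of a generic (mu + lambda) evolution strategy on a toy loss. It is not the optimizer proposed in the article, whose details the abstract does not give. The function names (`evolve`, `toy_loss`), the hyperparameters, and the quadratic loss are all assumptions chosen for illustration.

```python
import random


def evolve(loss, dim, pop_size=20, n_parents=5, sigma=0.3, generations=200, seed=0):
    """Minimise `loss` over a real-valued parameter vector.

    Generic (mu + lambda) evolution strategy; an illustrative assumption,
    not the evolutionary optimizer described in the article.
    """
    rng = random.Random(seed)
    # Random initial population of candidate parameter vectors.
    population = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the n_parents candidates with the lowest loss.
        parents = sorted(population, key=loss)[:n_parents]
        # Mutation: each child is a randomly chosen parent plus Gaussian noise.
        children = [
            [w + rng.gauss(0.0, sigma) for w in rng.choice(parents)]
            for _ in range(pop_size - n_parents)
        ]
        # Elitism: parents survive, so the best loss never increases.
        population = parents + children
    return min(population, key=loss)


def toy_loss(w):
    # Hypothetical stand-in for a model's training loss.
    target = [0.5, -1.0, 2.0]
    return sum((wi - ti) ** 2 for wi, ti in zip(w, target))


best = evolve(toy_loss, dim=3)
print(toy_loss(best))
```

Because the parents are carried over unchanged into each new generation, the best loss in the population cannot increase from one generation to the next. Monotonicity of this kind is a common ingredient in convergence arguments for evolutionary methods, although the article's own proof may proceed differently.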