Tree species classification is an important and challenging task in image recognition and the management of forest resources. Moreover, the task of tree species classification based on remote sensing images can significantly improve the efficiency of the tree species survey and save costs. In recent years, many large models have achieved high accuracy in the task of tree species classification in an airborne remote-sensing manner, but due to their fixed geometric structure, traditional convolutional neural networks are inherently limited to the local receptive field and can only provide segmental context information. The limitation of insufficient context information greatly affects the segmentation accuracy. In this paper, a dual-attention residual network (AMDNet) and a re-parameterized model approach are proposed to capture the global context information, fuse the weight, reduce the model volume, and maintain the computational efficiency. Firstly, we propose MobileNetV2 as the backbone network for feature extraction to further improve the feature identification by modeling semantic dependencies in the spatial dimension and channel dimension and adding the output of the two attention modules. Then, the attention perception features are generated by stacking the attention modules, and the in-depth residual attention network is trained using attention residual learning, through which more accurate segmentation results can be obtained. Secondly, we adopt the approach of structure re-parameterization, use a multi-branch topology for training, carry out weighted averaging on multiple trained models, and fuse multiple branch modules into a completely equivalent module in inference. The proposed approach results in a reduction in the number of parameters and an accelerated inference speed while also achieving improved classification accuracy. In addition, the model training strategy is optimized based on Transformer to enhance the accuracy of segmentation. The model was used to conduct classification experiments on aerial orthophotos of Hongya Forest Farm in Sichuan, China, and the MIOU of tree species recognition using the test equipment reached 93.8%. Compared with current models such as UNet, our model exhibits a better performance in terms of both speed and accuracy, in addition to its enhanced deployment capacity, and its speed advantage is more conducive to real-time segmentation, thereby representing a novel approach for the classification of tree species in remote sensing imagery with significant potential for practical applications.