Abstract Accurate and efficient extraction of crop distribution information is pivotal to global food security. For long time-series optical satellite data, most existing methods extract spatial features with Convolutional Neural Networks (CNNs) and therefore do not adequately mine or model spatio-temporal information. Attention mechanisms make it possible to extract global features from remote sensing image sequences that span long temporal ranges. To combine global attention features with complementary local features in crop remote sensing images, we propose a Global and Local Complementary Multi-path Feature Fusion Network (GLMP). GLMP extracts global features from long temporal sequences of remote sensing images and uses them to enhance the local characteristics of crop images derived from CNNs, yielding more effective multi-scale complementary features. These features improve the representation of crop images and thereby the performance of associated tasks. GLMP comprises two pivotal modules: the Hybrid Attention and Convolutional Paths Module (HACM) and the Multi-path Feature Fusion Module (MPFM). Together, these modules fuse multi-path features into more discriminative feature information. Experiments on the ZueriCrop dataset show that GLMP is effective, achieving an overall accuracy of 90.2% and an F1 value of 62.5%. Furthermore, an ablation study verifies that the HACM and MPFM modules account for a substantial improvement in classification accuracy on long time-series remote sensing crop images.
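The reported accuracy and F1 value differ considerably, and a small worked example may help readers interpret that gap. The abstract does not state how the F1 value is averaged; the sketch below assumes a macro average over classes, which weights rare classes as heavily as common ones. The function names and the toy labels (`common_crop`, `rare_crop`) are hypothetical and are not taken from the ZueriCrop dataset or from the GLMP implementation.

```python
def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical, heavily imbalanced toy labels: 90 common vs. 10 rare samples.
y_true = ["common_crop"] * 90 + ["rare_crop"] * 10
# The classifier gets every common sample right but only 2 of 10 rare ones.
y_pred = ["common_crop"] * 98 + ["rare_crop"] * 2

print(f"overall accuracy: {overall_accuracy(y_true, y_pred):.3f}")
print(f"macro F1:         {macro_f1(y_true, y_pred):.3f}")
```

In this toy case the accuracy is dominated by the common class, while the macro F1 is pulled down by the poorly recognized rare class. Under the stated assumption, a similar effect would be consistent with a high accuracy alongside a much lower F1 value on a dataset with imbalanced crop classes.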