A micro-expression is a subtle, fleeting facial movement that manifests in localized areas of the face, making it difficult for the human eye to detect and recognize. Although algorithms that extract facial features from specific regions or from the entire face have shown potential, classifying micro-expressions using features from symmetrical left and right facial regions can be problematic when movements are unilateral, which ultimately degrades recognition performance. To address this issue, we propose a network called Joint Global and Unilateral Local Features (JGULF) for micro-expression recognition. First, we employ a Convolutional Neural Network (CNN) and an adjusted Vision Transformer (ViT) model to extract global features from micro-expressions. A local feature extraction module, built on top of these global features, divides the facial features into multiple local regions of varying scales. Local feature learning and selection are then performed to efficiently select the unilateral local features related to micro-expression movements. Finally, the global and local features are combined to classify micro-expressions. Through comprehensive experimental validation, our algorithm achieves state-of-the-art classification performance on the SMIC, CASME II, and SAMM micro-expression datasets, demonstrating the effectiveness of combining global features with selected local features.
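The pipeline summarized above (global feature extraction, multi-scale division into local regions, selection of the most informative regions, and global-local fusion) can be sketched in simplified NumPy form. This is an illustrative sketch only: the grid scales, the energy-based selection criterion, and all function names are assumptions for exposition, not the paper's actual learned modules.

```python
import numpy as np

def extract_local_regions(feat, scales=(2, 4)):
    """Split an H x W x C feature map into non-overlapping local
    regions at several grid scales (scales chosen for illustration),
    pooling each region to a C-dim descriptor."""
    H, W, C = feat.shape
    regions = []
    for s in scales:
        h, w = H // s, W // s
        for i in range(s):
            for j in range(s):
                patch = feat[i * h:(i + 1) * h, j * w:(j + 1) * w, :]
                regions.append(patch.mean(axis=(0, 1)))
    return np.stack(regions)  # (num_regions, C)

def select_active_regions(regions, k=3):
    """Keep the k regions with the highest activation energy --
    a simple stand-in for the learned selection of unilateral
    local features described in the abstract."""
    energy = np.linalg.norm(regions, axis=1)
    idx = np.argsort(energy)[-k:]
    return regions[idx]

def jgulf_fuse(feat, k=3):
    """Concatenate a globally pooled descriptor with the selected
    local region descriptors (hypothetical fusion)."""
    global_vec = feat.mean(axis=(0, 1))  # global descriptor
    local = select_active_regions(extract_local_regions(feat), k=k)
    return np.concatenate([global_vec, local.ravel()])

# Toy input standing in for a CNN/ViT feature map of a face.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 8, 16))
fused = jgulf_fuse(feat, k=3)
print(fused.shape)  # global (16,) + 3 local regions of 16 -> (64,)
```

In practice the region descriptors and the selection weights would be learned end-to-end with the classifier; the energy heuristic here merely mirrors the idea that only regions exhibiting actual (possibly unilateral) movement should contribute to the final prediction.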