Deep learning has significantly advanced medical image segmentation, but the complexity of network structures often leads to high computational demands, limiting practical efficiency. To improve segmentation efficiency, this paper introduces an innovative, concise, and lightweight deep learning network. First, to reduce model complexity, we replaced the attention mechanism in the traditional vision transformer (ViT) structure with a shift operation, creating the ShiftViT architecture; this substitution significantly reduced computation and parameter count while preserving model performance. Second, to retain and enhance fine-grained features and to transfer information more precisely across layers, we employed a full-scale progressive skip connection strategy, which effectively integrates multi-scale feature information and further improves performance. Third, to reduce network complexity still further, and drawing on an analogy with the independence of probabilities, we adopted depth-wise separable convolution in place of standard convolution, enhancing the relative independence between layers. Together, these modifications achieved superior segmentation results on both the Synapse and Automated Cardiac Diagnosis Challenge (ACDC) datasets compared with mainstream models, with substantial advantages in computational efficiency and parameter count. The proposed approach offers an effective solution for medical image applications with limited computational resources and holds great promise for clinical practice.
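
To make the two efficiency ideas named above concrete, the following is a minimal PyTorch-style sketch of (i) a zero-padded partial channel shift used as a parameter-free substitute for attention and (ii) a depth-wise separable convolution that factorizes spatial and channel mixing. The class names, the shifted channel fraction, and the block layout are illustrative assumptions for exposition, not the paper's exact architecture.

```python
import torch
import torch.nn as nn


def partial_shift(x: torch.Tensor, shift_frac: float = 1 / 12) -> torch.Tensor:
    """Zero-padded spatial shift applied to a small fraction of channels.

    Four small channel groups are shifted by one pixel (left, right, up,
    down); the remaining channels pass through unchanged. The operation has
    no learnable parameters and negligible FLOPs.
    """
    b, c, h, w = x.shape
    g = max(int(c * shift_frac), 1)                     # channels per shifted group
    out = torch.zeros_like(x)
    out[:, 0 * g:1 * g, :, :-1] = x[:, 0 * g:1 * g, :, 1:]    # shift left
    out[:, 1 * g:2 * g, :, 1:] = x[:, 1 * g:2 * g, :, :-1]    # shift right
    out[:, 2 * g:3 * g, :-1, :] = x[:, 2 * g:3 * g, 1:, :]    # shift up
    out[:, 3 * g:4 * g, 1:, :] = x[:, 3 * g:4 * g, :-1, :]    # shift down
    out[:, 4 * g:, :, :] = x[:, 4 * g:, :, :]                 # untouched channels
    return out


class DepthwiseSeparableConv(nn.Module):
    """Depth-wise (per-channel) conv followed by a point-wise 1x1 conv."""

    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pointwise(self.depthwise(x))


class ShiftBlock(nn.Module):
    """Illustrative block: partial shift -> norm -> depth-wise separable conv,
    wrapped in a residual connection."""

    def __init__(self, channels: int):
        super().__init__()
        self.norm = nn.BatchNorm2d(channels)
        self.mix = DepthwiseSeparableConv(channels, channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.mix(self.norm(partial_shift(x)))


if __name__ == "__main__":
    block = ShiftBlock(channels=48)
    feat = torch.randn(2, 48, 56, 56)   # (batch, channels, H, W)
    print(block(feat).shape)            # torch.Size([2, 48, 56, 56])
```

Compared with a standard k x k convolution, the depth-wise separable variant reduces the parameter count from roughly C_in * C_out * k^2 to C_in * k^2 + C_in * C_out, which is the source of the efficiency gain the abstract refers to.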