Objective. Cardiovascular disease (CVD) is a group of diseases affecting the heart and blood vessels, and short-axis cardiac magnetic resonance (CMR) images are considered the gold standard for the diagnosis and assessment of CVD. In CMR images, accurate segmentation of cardiac structures (e.g. the left ventricle) assists in the parametric quantification of cardiac function. However, the dynamic beating of the heart makes the position of the heart relative to other tissues difficult to resolve, and the myocardium and its surrounding tissues have similar gray-level intensities. These factors make accurate segmentation of cardiac images challenging. Our goal is to develop a more accurate CMR image segmentation approach. Approach. In this study, we propose a regional perception and multi-scale feature fusion network (RMFNet) for CMR image segmentation. We design two regional perception modules: a window selection transformer (WST) module and a grid extraction transformer (GET) module. The WST module introduces a window selection block that adaptively selects the window of interest in which to perceive information, and a windowed transformer block that enhances global information extraction within each feature window; together, these improve network performance by refining the window of interest. The GET module partitions the feature maps into grids to reduce redundant information and to enhance the network's extraction of latent feature information. The RMFNet further introduces a novel multi-scale feature extraction module to improve its ability to retain detailed information. Main results. The RMFNet is validated with experiments on three cardiac data sets, where it outperforms other advanced methods in overall performance. Its generalizability is further validated on a multi-organ data set, where it also surpasses the comparison methods. Significance. 
Accurate medical image segmentation can reduce the workload of radiologists and plays an important role in image-guided clinical procedures.
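
To make the two regional perception ideas in the Approach concrete, the following is a minimal NumPy sketch of window partitioning, a window-of-interest selection step, and grid partitioning of a feature map. It is an illustration under stated assumptions, not the RMFNet implementation: the function names (`window_partition`, `select_windows`, `grid_partition`), the `(H, W, C)` layout, and in particular the scoring rule (mean absolute activation, top-k) are hypothetical stand-ins, since the abstract does not specify how windows are selected or how grids are formed.

```python
import numpy as np


def window_partition(x, w):
    """Split an (H, W, C) feature map into non-overlapping w x w local windows.

    Returns an array of shape (num_windows, w * w, C).
    Assumes H and W are divisible by w.
    """
    H, W, C = x.shape
    x = x.reshape(H // w, w, W // w, w, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, w * w, C)


def select_windows(windows, k):
    """Pick k windows of interest (hypothetical rule, not from the paper).

    Each window is scored by its mean absolute activation, and the indices
    of the k highest-scoring windows are returned in ascending order.
    """
    scores = np.abs(windows).mean(axis=(1, 2))
    top = np.argsort(scores)[::-1][:k]
    return np.sort(top)


def grid_partition(x, g):
    """Split an (H, W, C) feature map into groups of g x g strided samples.

    Each group gathers tokens spaced H // g rows and W // g columns apart,
    so one group spans the whole map sparsely instead of a local patch.
    Returns an array of shape (num_groups, g * g, C).
    Assumes H and W are divisible by g.
    """
    H, W, C = x.shape
    x = x.reshape(g, H // g, g, W // g, C)
    return x.transpose(1, 3, 0, 2, 4).reshape(-1, g * g, C)


# Toy 4 x 4 single-channel feature map with values 0..15.
feat = np.arange(16, dtype=float).reshape(4, 4, 1)

windows = window_partition(feat, 2)   # four local 2 x 2 windows
chosen = select_windows(windows, 2)   # indices of the two highest-scoring windows
grids = grid_partition(feat, 2)       # four sparse groups of 2 x 2 strided samples
```

In this sketch, a local window groups neighbouring positions, whereas a grid group samples positions spread across the whole map; attention restricted to each kind of group would therefore mix local and long-range context, respectively. How the WST and GET modules actually combine these steps with transformer blocks is not described in the abstract and is left out here.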