Occluded keypoints have long been a challenge for human pose estimation, especially under mutual occlusion between human bodies. One possible solution is to exploit multi-scale features, where small-scale features are capable of identifying keypoints, while large-scale features can capture the relationships between keypoints. Fusing multi-scale features allows information to be exchanged between keypoints, facilitating the inference of occluded keypoints from the identified ones. However, we find that feature fusion also introduces invalid features that interfere with valid ones. In this paper, we propose a multi-scale feature refined network (MSFRNet) based on HRNet, together with a new attention module, the multi-resolution attention module (MRAM). The proposed MRAM is designed to strengthen effective information while suppressing redundant information. It has multiple inputs and outputs, and can learn the relationships between keypoints while retaining detailed information. The proposed MSFRNet outperforms HRNet, achieving a 1.4% AP improvement on the COCO dataset with only a marginal computational increase of 0.35 GFLOPs. It further achieves improvements of 0.9% AP, 0.7% AP, and 1.8% AP on the MPII, CrowdPose, and OCHuman datasets, respectively. Furthermore, compared with the recent PSA attention mechanism, MSFRNet attains the same pose-estimation accuracy at a lower computational cost.
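The abstract above describes attention-gated fusion of features at multiple resolutions. The details of MRAM are not given here, so the following NumPy sketch illustrates only the general idea of such a module, not the authors' implementation: each resolution branch is gated by its own channel attention (global average pooling followed by a sigmoid) before the branches are summed at the fine resolution. All function names and shapes are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_gate(feat):
    # Global average pooling over spatial dims -> per-channel gate in (0, 1).
    # Hypothetical stand-in for a learned attention branch.
    pooled = feat.mean(axis=(1, 2))          # (C,)
    return sigmoid(pooled)[:, None, None]    # (C, 1, 1), broadcastable

def fuse_multiscale(fine, coarse):
    # Nearest-neighbour upsample the coarse branch to the fine resolution,
    # then gate each branch by its own channel attention before summing,
    # so weakly informative (redundant) channels are suppressed.
    scale = fine.shape[1] // coarse.shape[1]
    up = coarse.repeat(scale, axis=1).repeat(scale, axis=2)
    return channel_gate(fine) * fine + channel_gate(up) * up

rng = np.random.default_rng(0)
fine = rng.standard_normal((32, 16, 16))    # high-resolution branch (C, H, W)
coarse = rng.standard_normal((32, 8, 8))    # low-resolution branch
fused = fuse_multiscale(fine, coarse)
print(fused.shape)  # (32, 16, 16)
```

In a real multi-resolution attention module the gates would be learned and the module would emit one refined output per resolution; this sketch collapses that to a single fused output purely for illustration.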