Person re-identification (Re-ID) aims to identify and match the same pedestrian captured across different surveillance cameras. However, variations in camera quality and in the distance between pedestrians and cameras can cause images of the same person to be captured at different resolutions. This problem, known as cross-resolution person re-identification (CRReID), makes accurate person Re-ID considerably more difficult. To address it, we propose a Multi-level and scale Deep Invariant Feature learning Framework (MDIFF) for cross-resolution person matching. MDIFF reconstructs person features at the shallow layers to bridge information gaps and extracts resolution-invariant features at the deep layers for cross-resolution matching. First, to mitigate information loss in resolution-invariant features, we propose a Dual Input Feature Reconstruction (DIFR) structure with dual-stream input and a lightweight decoder, constrained by a degradation loss and an image reconstruction loss. Second, we propose a Multi-level Global–Local feature Interaction and Fusion (MGLIF) module that enhances person-invariant features and yields deep invariant representations, making the final representation more robust to resolution changes and more discriminative. Finally, to make the feature distribution of the same identity more compact across resolutions, we propose a cross-resolution joint loss optimization strategy combining a cross-resolution triplet loss, a cross-resolution center loss, and an identity loss. Comprehensive experiments demonstrate the effectiveness of MDIFF, which outperforms state-of-the-art methods on multiple CRReID benchmark datasets. Our code is available at https://github.com/MiSanl/MDIFF-for-CRReID.
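The joint loss combines the three terms named above. The following is a minimal numpy sketch of one plausible form of that combination; the function names, loss weights, and toy batch layout are illustrative assumptions, not the paper's actual implementation (see the linked repository for that).

```python
import numpy as np

def cr_triplet_loss(anchor, positive, negative, margin=0.3):
    # Anchor embeddings from one resolution, positives are the same
    # identities at the other resolution, negatives are different identities.
    d_ap = np.linalg.norm(anchor - positive, axis=1)
    d_an = np.linalg.norm(anchor - negative, axis=1)
    return np.maximum(d_ap - d_an + margin, 0.0).mean()

def cr_center_loss(features, labels, centers):
    # Pull features of each identity, at any resolution, toward a
    # resolution-shared class center (standard center-loss form).
    return 0.5 * np.mean(np.sum((features - centers[labels]) ** 2, axis=1))

def identity_loss(logits, labels):
    # Softmax cross-entropy over identity classes.
    z = logits - logits.max(axis=1, keepdims=True)
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_p[np.arange(len(labels)), labels].mean()

def cross_resolution_joint_loss(hr_feats, lr_feats, neg_feats, labels,
                                logits, centers, margin=0.3,
                                w_tri=1.0, w_cen=5e-4):
    # Weighted sum of the three terms; the weights here are placeholders.
    feats = np.concatenate([hr_feats, lr_feats])
    labs = np.concatenate([labels, labels])
    return (identity_loss(logits, labels)
            + w_tri * cr_triplet_loss(hr_feats, lr_feats, neg_feats, margin)
            + w_cen * cr_center_loss(feats, labs, centers))
```

In a training loop, the triplet term encourages cross-resolution pairs of the same identity to lie closer than cross-identity pairs, while the center term compacts each identity's distribution across resolutions.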