Although the uterus, bladder, and rectum are distinct organs, their muscular fasciae are often interconnected. Clinical experience suggests that they may share common risk factors and associations. When one organ experiences prolapse, it can potentially affect the neighboring organs. However, the current assessment of disease severity still relies on manual measurements, which can yield varying results depending on the physician, thereby leading to diagnostic inaccuracies. This study aims to develop a multilabel grading model based on deep learning to classify the degree of prolapse of three organs in the female pelvis using stress magnetic resonance imaging (MRI) and provide interpretable result analysis. We utilized sagittal MRI sequences taken at rest and during maximum Valsalva maneuver from 662 subjects. The training set included 464 subjects, the validation set included 98 subjects, and the test set included 100 subjects (training set n=464, validation set n=98, test set n=100). We designed a feature extraction module specifically for pelvic floor MRI using the vision transformer architecture and employed label masking training strategy and pre-training methods to enhance model convergence. The grading results were evaluated using Precision, Kappa, Recall, and Area Under the Curve (AUC). To validate the effectiveness of the model, the designed model was compared with classic grading methods. Finally, we provided interpretability charts illustrating the model's operational principles on the grading task. In terms of POP grading detection, the model achieved an average Precision, Kappa coefficient, Recall, and AUC of 0.86, 0.77, 0.76, and 0.86, respectively. Compared to existing studies, our model achieved the highest performance metrics. The average time taken to diagnose a patient was 0.38s. The proposed model achieved detection accuracy that is comparable to or even exceeds that of physicians, demonstrating the effectiveness of the vision transformer architecture and label masking training strategy for assisting in the grading of POP under static and maximum Valsalva conditions. This offers a promising option for computer-aided diagnosis and treatment planning of POP.