Health perception and remaining useful life (RUL) prediction for bearing-rotor systems have long been critical and challenging topics in the field of Prognostics and Health Management (PHM). Deep learning has become a prominent direction in PHM research. However, current models struggle to extract the deep degradation characteristics of bearings and to capture time-series information during the failure process. In addition, most RUL prediction methods focus on point estimation, which limits their ability to quantify prediction uncertainty. To address these shortcomings, this study proposes a novel health perception and prediction framework, the Spatiotemporal Self-Attention Mechanism Probabilistic model (STAP-Net). The framework embodies the principles of lightweight design, focusing, and probabilistic prediction, and is tailored to bearing-rotor systems operating under special conditions. The key innovation of STAP-Net is the integration of a modified gated recurrent unit, termed the Weight Diminish Recurrent Unit (WDRU). The WDRU substantially reduces the number of trainable parameters in STAP-Net and accelerates convergence while maintaining prediction accuracy. The efficacy of STAP-Net is validated on bearing-rotor system degradation data collected under special operating conditions such as misalignment and abrasive wear. The performance of the proposed framework is evaluated against three key criteria: high-precision point prediction, suitable prediction intervals, and reliable probabilistic prediction results.
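
The three evaluation criteria named above (point accuracy, prediction intervals, and probabilistic reliability) can be illustrated with a minimal sketch. It is not the STAP-Net evaluation procedure: it assumes, purely for illustration, that the model outputs a Gaussian predictive mean and standard deviation for each RUL value. The function names, the 95% interval factor `z=1.96`, and all numbers are hypothetical.

```python
import math


def gaussian_interval(mean, std, z=1.96):
    """Symmetric prediction interval under an assumed Gaussian predictive distribution."""
    return mean - z * std, mean + z * std


def evaluate(y_true, means, stds, z=1.96):
    """Compute point error, interval coverage, and mean interval width."""
    n = len(y_true)
    # Point prediction accuracy: root mean squared error of the predictive means.
    rmse = math.sqrt(sum((m - y) ** 2 for m, y in zip(means, y_true)) / n)
    bounds = [gaussian_interval(m, s, z) for m, s in zip(means, stds)]
    # Interval reliability: fraction of true values that fall inside their interval.
    coverage = sum(lo <= y <= hi for y, (lo, hi) in zip(y_true, bounds)) / n
    # Interval sharpness: narrower intervals are more informative at equal coverage.
    mean_width = sum(hi - lo for lo, hi in bounds) / n
    return {"rmse": rmse, "coverage": coverage, "mean_width": mean_width}


# Made-up RUL values (arbitrary time units), not results from the study.
y_true = [100.0, 80.0, 60.0, 40.0]
means = [96.0, 83.0, 60.0, 30.0]
stds = [5.0, 5.0, 4.0, 4.0]

result = evaluate(y_true, means, stds)
print(result)
```

A point-estimation method would report only the first quantity. The coverage and width terms are what a probabilistic output adds: an interval is useful when it contains the true RUL at roughly the nominal rate without being needlessly wide.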