Solar photovoltaic (PV) power has become a major renewable energy source in modern power systems, which makes solar forecasting indispensable for managing the uncertainty of PV power output. Because PV generation depends mainly on solar radiation conditions, satellite-derived methods have proven effective for large-scale, short-term solar forecasting. These methods track cloud motion in satellite optical images and require no supplemental ground cameras. However, cloud motion between two consecutive satellite images is highly non-stationary, because it includes both displacement and shape variation. Conventional motion estimation methods may miss this detailed variation, which increases solar forecasting errors. This study proposes a multi-head cloud motion vector prediction method to address this challenge. The method models cloud variation as the probabilistic sum of pixels moving under multiple optical flow vectors and also accounts for time-varying solar intensity. To estimate these optical flow vectors, a learning-based model with a self-attention structure is established. Case studies indicate that the proposed multi-step-ahead, satellite-derived forecasting method improves the precision of intra-hourly PV power prediction compared with conventional methods.
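
The idea of a probabilistic sum over multiple optical flow vectors can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `multi_head_warp`, the integer per-head displacements, the wrap-around border handling, and the hand-specified probabilities are all assumptions made for illustration. In the proposed method the flow vectors are estimated by a learned self-attention model, and time-varying solar intensity is also modeled; both are omitted here.

```python
import numpy as np

def multi_head_warp(image, flows, probs):
    """Predict the next frame as a probability-weighted sum of shifted copies.

    image: (H, W) array of pixel intensities.
    flows: K integer (dy, dx) displacements, one per head (hypothetical input).
    probs: (K, H, W) array; probs[k, y, x] is the assumed probability that the
           pixel at (y, x) moves under flows[k], summing to 1 over k.
    """
    image = np.asarray(image, dtype=float)
    probs = np.asarray(probs, dtype=float)
    predicted = np.zeros_like(image)
    for (dy, dx), p in zip(flows, probs):
        # Weight each source pixel by its probability for this head, then
        # move that share to its target location. np.roll wraps around the
        # image border, which is a simplification chosen for brevity.
        predicted += np.roll(image * p, shift=(dy, dx), axis=(0, 1))
    return predicted

# Usage: one bright pixel splits evenly between a rightward and a downward
# motion, so the predicted cloud changes shape as well as position.
frame = np.zeros((4, 4))
frame[1, 1] = 1.0
flows = [(0, 1), (1, 0)]
probs = np.full((2, 4, 4), 0.5)
next_frame = multi_head_warp(frame, flows, probs)
```

With a single head this reduces to an ordinary rigid shift. With several heads, one source pixel can contribute to several target locations, which is how the sketch represents shape variation in addition to displacement.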