Solar flares originate from the sudden release of energy stored in the magnetic field of an active region on the Sun, but the trigger for flares is still uncertain. Deep-learning-based solar flare prediction models have achieved good results and are widely recognized. However, these models focus more on correlations in the data than on causality. An ideal flare prediction model should probe the causes and triggers of solar flares and diagnose their precursors, rather than rely on correlation analysis alone. To extract more informative flare precursors from magnetograms while suppressing the interference of confounding factors, a causal attention module is introduced to disentangle causal and confounder features from the input features. To address the imbalance between positive and negative samples in the data set, an adaptive data set split mechanism is proposed. It divides the data set into several subsets, each balanced between positive and negative samples, and dynamically adjusts the subsets according to the model’s prediction results during training. The experimental results demonstrate that our proposed model achieves 4.08%, 8.38%, and 2.19% higher accuracy, true skill score, and area under the receiver operating characteristic curve, respectively, than the baseline model. Additionally, class-specific heatmaps produced with the gradient-weighted class activation mapping method reveal that our proposed model generally focuses on the polarity inversion line of active regions, consistent with theoretical studies.
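As an illustration of the balanced-subset idea, the following is a minimal sketch of an initial split only. It assumes that flaring (positive) samples are the minority class and are reused in every subset, while non-flaring (negative) samples are shuffled and dealt out across subsets. The function name `split_balanced_subsets` and its parameters are hypothetical, and the dynamic adjustment driven by the model's predictions is not modeled here because the abstract does not specify it.

```python
import random


def split_balanced_subsets(labels, seed=0):
    """Return lists of sample indices, each roughly balanced between classes.

    Assumption (not from the paper): every subset reuses all positive
    samples (label 1) and receives a disjoint share of the negatives (label 0).
    """
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")

    rng = random.Random(seed)
    rng.shuffle(neg)

    # Number of subsets: how many positive-sized groups of negatives fit.
    n_subsets = max(1, len(neg) // len(pos))

    # Deal the shuffled negatives out round-robin so the groups are disjoint.
    return [pos + neg[j::n_subsets] for j in range(n_subsets)]


# Toy data: 3 positive and 9 negative samples.
labels = [1] * 3 + [0] * 9
subsets = split_balanced_subsets(labels)
print([len(s) for s in subsets])  # [6, 6, 6]
```

When the number of negatives is not a multiple of the number of positives, some subsets in this sketch receive one or more extra negatives, so the balance is only approximate.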