When the Transformer model is used for wind power prediction, two factors reduce prediction accuracy: noise in the wind power data, and a final layer that relies solely on a simple linear output, which limits the model’s ability to capture nonlinear relationships. To address these issues, this paper proposes an ultrashort-term wind power prediction model that combines exponentially weighted moving average (EWMA) data processing with a Kolmogorov–Arnold Network (KAN)-Transformer. First, the multivariate input features are smoothed using EWMA, which suppresses noise while preserving the trends of the original data. Next, the EWMA-processed data are fed into the Encoder and Decoder modules of the Transformer to extract features. The Decoder output is then passed through a KAN layer built from cubic B-spline functions, which replaces the purely linear output mapping and strengthens the model’s ability to capture nonlinear relationships, thereby improving the Transformer’s wind power prediction accuracy. Finally, experimental analysis shows that the proposed model achieves the highest prediction accuracy among the models compared, with a mean absolute error of 4.38 MW, a root mean squared error of 7.37 MW, and a coefficient of determination of 98.73%.
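The EWMA smoothing step can be illustrated with a minimal sketch. It assumes the standard recursive form s_t = alpha * x_t + (1 - alpha) * s_(t-1), initialized with s_0 = x_0. The function name `ewma` and the smoothing factor `alpha=0.3` are illustrative assumptions and are not taken from this paper; in a multivariate setting the function would be applied to each feature series separately.

```python
def ewma(values, alpha=0.3):
    """Smooth a 1-D sequence with an exponentially weighted moving average.

    alpha is a hypothetical smoothing factor in (0, 1]; larger values
    track the raw data more closely, smaller values smooth more strongly.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")

    smoothed = []
    for x in values:
        if not smoothed:
            # Initialize with the first observation (s_0 = x_0).
            smoothed.append(float(x))
        else:
            # Blend the new observation with the previous smoothed value.
            smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed


# Example: a short made-up series with a single spike at index 2.
raw = [10.0, 12.0, 30.0, 11.0, 13.0]
print(ewma(raw, alpha=0.3))
```

With a small `alpha`, the spike at index 2 is damped while the overall level of the series is retained, which is the behavior the abstract attributes to EWMA preprocessing.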