Increasing the utilization of wind energy is important for improving the energy structure, and it depends on the support of wind power forecasting (WPF) technology. However, no single WPF model suits all conditions, such as different regions or seasons. Therefore, instead of focusing on combining machine learning models for one specific scenario, this article proposes a two-stage modeling strategy of “first classify and model separately, then perform pattern recognition” from the perspective of sample similarity analysis. In offline mode, the historical database is divided into multiple categories with different characteristics, and a prediction model is established for each category; in online mode, pattern recognition is performed on the prediction sample to select the corresponding prediction model. In this way, the WPF problem is decomposed into two strongly related tasks: wind power mode classification and wind power numerical prediction. Furthermore, the coupling between the mode classification task and the numerical prediction task is strengthened through transfer learning of sample features. Building on these ideas, specific methods for classifying, identifying, and predicting are proposed: two-level clustering, a Convolutional Neural Network (CNN) classification model, and Long Short-Term Memory (LSTM) prediction models. Simulation results on real-world datasets demonstrate the effectiveness and superiority of the proposed hybrid model.
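
The offline/online split described above can be illustrated with a minimal sketch. It is not the proposed implementation: plain k-means stands in for two-level clustering, nearest-centroid assignment stands in for the CNN classifier, and per-cluster least-squares regressors stand in for the LSTM predictors. The transfer learning of sample features is omitted. All function names (`assign`, `kmeans`, `fit_linear`, `fit_offline`, `predict_online`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def assign(X, centroids):
    """Return the index of the nearest centroid for each row of X."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(X, k, n_iter=50, seed=0):
    """Plain k-means (assumed stand-in for two-level clustering)."""
    rng = np.random.default_rng(seed)
    # Fancy indexing copies, so updating centroids does not modify X.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        labels = assign(X, centroids)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids, assign(X, centroids)


def fit_linear(X, y):
    """Least-squares fit with intercept (assumed stand-in for an LSTM)."""
    A = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def fit_offline(X, y, k):
    """Offline stage: split the history into k categories, one model each."""
    centroids, labels = kmeans(X, k)
    fallback = fit_linear(X, y)  # used only if a category ends up empty
    models = []
    for j in range(k):
        mask = labels == j
        models.append(fit_linear(X[mask], y[mask]) if mask.any() else fallback)
    return centroids, models


def predict_online(x, centroids, models):
    """Online stage: recognize the sample's category, then use its model."""
    x = np.asarray(x, dtype=float)
    j = int(assign(x[None, :], centroids)[0])
    return j, float(np.append(x, 1.0) @ models[j])
```

Calling `fit_offline(X, y, k)` on a historical feature matrix `X` and power targets `y` returns the category centroids and one model per category; `predict_online(x, centroids, models)` then returns the recognized category index and the forecast for a new sample `x`. The point of the design is that each regressor only has to fit samples that resemble one another, while the routing step decides which regressor applies.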