The SCADA system installed in each wind farm provides the real-time and historical data required for fault diagnosis. However, labeling these data is time-consuming and labor-intensive, while the performance of failure prediction and diagnosis depends on the volume of labeled failure data. Therefore, a fault diagnosis and prediction scheme for wind turbines based on active learning is proposed. First, historical data from a selected wind turbine are screened by residual analysis to build the initial training data set. Then a classifier is pre-trained by an improved active learning method, which combines a query-by-committee (QBC) sampling strategy with a random forest (RF) learner to label the data and thereby detect wind turbine faults. Afterward, the pre-trained classifier is applied to other wind turbines in the same or different wind farms to determine their operational status by labeling their operational data. Finally, the labeled data from these different sources are fed back to retrain the classifier and improve its performance. SCADA data from wind turbines of different manufacturers in different wind farms are used as test cases, and the simulation results show the effectiveness and feasibility of the proposed scheme.
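
As a rough illustration of the QBC sampling step, and not the authors' implementation, the sketch below ranks unlabeled samples by how strongly a committee of classifiers disagrees on them, so that the most ambiguous samples are sent for labeling first. It assumes vote entropy as the disagreement measure and assumes that the committee members could be, for example, the individual trees of a random forest; the abstract specifies neither. The function names `vote_entropy` and `select_queries` and the example votes are hypothetical.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Disagreement on one sample: entropy of the committee's label votes."""
    n = len(votes)
    return -sum((c / n) * math.log(c / n) for c in Counter(votes).values())


def select_queries(committee_votes, k):
    """Return the indices of the k samples the committee disagrees on most."""
    scores = [vote_entropy(v) for v in committee_votes]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Hypothetical votes from a 4-member committee on 4 unlabeled SCADA samples
# (illustrative values only, not data from the study).
committee_votes = [
    ["normal", "normal", "normal", "normal"],  # unanimous
    ["normal", "fault", "normal", "fault"],    # evenly split
    ["fault", "fault", "fault", "normal"],     # mild disagreement
    ["fault", "fault", "fault", "fault"],      # unanimous
]

# Samples chosen for manual labeling in this round.
queries = select_queries(committee_votes, k=2)
print(queries)
```

In an active learning loop of this kind, the selected samples would be labeled, added to the training set, and the learner retrained before the next round of queries.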