A challenge in gas turbine fault diagnosis is that labeled fault samples are rare, far fewer than normal samples. Conventional data augmentation techniques generate fault samples in the original data space, so the synthetic fault samples overlap heavily with normal samples. To address this issue, a feature-level data augmentation method, named feature-level SMOTE, is developed by integrating a deep Siamese multi-head self-attention network (DSMHSA) with the synthetic minority over-sampling technique (SMOTE) to reduce inter-class imbalance and overlap simultaneously. First, the DSMHSA maps the original data into a feature space with better inter-class separability, in which samples from different classes lie far apart. Second, SMOTE generates synthetic fault samples in this well-separated feature space to balance the dataset. Finally, the effectiveness of feature-level SMOTE in imbalanced fault diagnosis is evaluated through two case studies: a real gas turbine fault dataset and a public robot execution failures dataset. Specifically, its average balanced accuracy is 90.38% on the gas turbine dataset, an improvement of 9.67%, 13.94%, and 12.39% over OUPS, A-SUWO, and NRAS, respectively.
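
The second step, SMOTE-style interpolation applied to feature vectors instead of raw measurements, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `smote_in_feature_space` and the `embed` parameter are hypothetical, and `embed` is only a placeholder for a trained feature extractor such as the DSMHSA, which is not reproduced here. With no extractor supplied, the samples are used as they are.

```python
import math
import random


def smote_in_feature_space(minority, n_new, k=5, embed=None, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    `embed` is a hypothetical stand-in for a learned feature extractor
    (an assumption, not the DSMHSA itself); identity is used when None.
    """
    rng = random.Random(seed)
    # Map every minority (fault) sample into the feature space.
    feats = [list(embed(x)) if embed else list(x) for x in minority]
    if len(feats) < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, len(feats) - 1)

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(feats))
        base = feats[i]
        # The k nearest minority neighbours of the base sample, excluding itself.
        neighbours = sorted(
            (j for j in range(len(feats)) if j != i),
            key=lambda j: math.dist(base, feats[j]),
        )[:k]
        other = feats[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        # The new point lies on the segment between base and its neighbour.
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy usage: three 2-D "fault" feature vectors, four synthetic samples.
fault_features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_samples = smote_in_feature_space(fault_features, n_new=4, k=2)
```

Because each synthetic point lies on a segment between two minority samples, the better the feature space separates the classes, the less likely such a segment is to pass through a region occupied by normal samples.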