Insufficient annotated samples, coupled with class imbalance, largely restrict the wide application of deep learning (DL)-based approaches to microstructure recognition and quantification. In this work, we present a micrograph augmentation approach that uses a hybrid deep generative model to generate SEM image-annotation pairs and thereby build a large-scale, well-balanced augmentation dataset. In this method, a generator is first established to produce the desired annotations, and a translator is then trained to translate these synthetic annotations into high-quality SEM images. The proposed method is applied to an extremely small and imbalanced additively manufactured (AM) steel dataset, containing only one SEM image-annotation pair with a very low martensite/austenite (MA) fraction, to significantly augment the initial dataset and achieve a more balanced distribution of phase fraction. The effectiveness of the method is demonstrated by the improved extensibility of the microstructure recognition model to unseen micrographs when synthetic data are used in training. Furthermore, the impact of the proportion of synthetic data on model performance, and the underlying reasons why synthetic data improve the extensibility of trained models, are discussed in detail.
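To make the notion of a "balanced distribution of phase fraction" concrete, the following is a minimal sketch, not the authors' implementation, of how the MA fraction of an annotation mask could be measured and how a set of annotations could be binned by that fraction. The label convention (1 for MA, 0 for everything else), the function names, and the equal-width binning are all assumptions made for illustration.

```python
from typing import List, Sequence

# Hypothetical label convention (not taken from the paper):
# 0 = matrix / other phases, 1 = martensite/austenite (MA).
MA_LABEL = 1

Mask = Sequence[Sequence[int]]


def phase_fraction(annotation: Mask, label: int = MA_LABEL) -> float:
    """Return the fraction of pixels in a 2D annotation mask equal to `label`."""
    total = sum(len(row) for row in annotation)
    if total == 0:
        raise ValueError("annotation is empty")
    hits = sum(1 for row in annotation for px in row if px == label)
    return hits / total


def fraction_histogram(annotations: Sequence[Mask], n_bins: int = 5) -> List[int]:
    """Count annotations per equal-width MA-fraction bin over [0, 1].

    A flat histogram suggests a balanced dataset; a histogram piled into
    the first bin suggests the low-MA imbalance described in the text.
    """
    counts = [0] * n_bins
    for ann in annotations:
        frac = phase_fraction(ann)
        # Clamp so that a fraction of exactly 1.0 falls in the last bin.
        idx = min(int(frac * n_bins), n_bins - 1)
        counts[idx] += 1
    return counts
```

Under these assumptions, comparing `fraction_histogram` for the original annotations with the same histogram after adding synthetic annotations gives a quick check of whether augmentation has spread the dataset across a wider range of MA fractions.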