In data-driven optimization of process conditions at the laboratory scale, only small data sets are available, and they must be used effectively. At the same time, process development frequently involves inserting new operations or modifying existing ones, so the available data sets are related but not of exactly the same type. In this study, we focus on predicting the quality of the interface between an insulator and GaN as a semiconductor, with potential application to GaN power semiconductor devices. The interface quality was represented by the interface state density, Dit, and the operation inserted into the process was ultraviolet (UV)/O3-gas treatment. Our retrospective evaluation of model-building approaches for predicting Dit from process conditions revealed that, for UV/O3-treated interfaces, data from interfaces without the treatment improved predictive performance. No such improvement was observed when using a data set with Si as the semiconductor. As a modeling method, automatic relevance vector-based Gaussian process regression with a prior distribution on the length-scale parameters exhibited relatively high predictive performance and gave reasonable prediction uncertainty, which reflected the distance to the training data set. This property is a prerequisite for a potential application of Bayesian optimization. Furthermore, the hyperparameters of the prior distribution on the length-scales could be optimized by leave-one-out cross-validation.
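The modeling idea summarized above can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes a squared-exponential kernel with one length-scale per input, a Gaussian prior on the log length-scales (mean `mu`, width `sigma`), a crude random search in place of a proper optimizer, a fixed noise level, and synthetic data standing in for process conditions and Dit. All function names and numerical values are hypothetical.

```python
import numpy as np

def ard_kernel(A, B, ell):
    """Squared-exponential kernel with one length-scale per input dimension."""
    d = (A[:, None, :] - B[None, :, :]) / ell
    return np.exp(-0.5 * np.sum(d ** 2, axis=2))

def neg_log_posterior(log_ell, X, y, noise, mu, sigma):
    """Negative log marginal likelihood plus an assumed Gaussian prior on log length-scales."""
    K = ard_kernel(X, X, np.exp(log_ell)) + noise ** 2 * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    nll = 0.5 * y @ alpha + np.sum(np.log(np.diag(L))) + 0.5 * len(X) * np.log(2 * np.pi)
    return nll + 0.5 * np.sum(((log_ell - mu) / sigma) ** 2)

def fit_map(X, y, noise, mu, sigma, n_draws=200, seed=0):
    """Crude MAP estimate of the length-scales by random search (chosen for brevity)."""
    rng = np.random.default_rng(seed)
    cands = rng.normal(mu, 1.5, size=(n_draws, X.shape[1]))
    scores = [neg_log_posterior(c, X, y, noise, mu, sigma) for c in cands]
    return np.exp(cands[int(np.argmin(scores))])

def predict(Xtr, ytr, Xte, ell, noise):
    """Posterior mean and standard deviation of the latent function at Xte."""
    K = ard_kernel(Xtr, Xtr, ell) + noise ** 2 * np.eye(len(Xtr))
    Ks = ard_kernel(Xte, Xtr, ell)
    mean = Ks @ np.linalg.solve(K, ytr)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mean, np.sqrt(np.maximum(var, 0.0))

def loo_rmse(X, y, noise, mu, sigma):
    """Leave-one-out RMSE, refitting the length-scales on each fold."""
    errs = []
    for i in range(len(X)):
        keep = np.arange(len(X)) != i
        ell = fit_map(X[keep], y[keep], noise, mu, sigma)
        m, _ = predict(X[keep], y[keep], X[i:i + 1], ell, noise)
        errs.append(m[0] - y[i])
    return float(np.sqrt(np.mean(np.square(errs))))

# Synthetic stand-in data: only the first input influences the target.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(15, 2))
y = np.sin(4.0 * X[:, 0]) + 0.05 * rng.normal(size=15)
y = y - y.mean()  # centered because the sketch uses a zero-mean GP

noise = 0.1                     # assumed, fixed noise standard deviation
prior_sigmas = [0.3, 1.0, 3.0]  # hypothetical candidate prior widths
scores = {s: loo_rmse(X, y, noise, mu=0.0, sigma=s) for s in prior_sigmas}
best_sigma = min(scores, key=scores.get)

# Refit with the selected prior width and compare uncertainty near and far from the data.
ell = fit_map(X, y, noise, mu=0.0, sigma=best_sigma)
_, std_near = predict(X, y, X[:1], ell, noise)
_, std_far = predict(X, y, np.array([[10.0, 10.0]]), ell, noise)
print(best_sigma, ell, std_near[0], std_far[0])
```

In this sketch the predictive standard deviation at a point far from the training inputs approaches the prior value, while at a training point it stays below the noise level. This is the distance-dependent uncertainty that Bayesian optimization relies on.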