Abstract

Data incompleteness is a serious challenge in real-world machine-learning tasks. Nevertheless, it has received little attention in symbolic regression (SR). Missing data exacerbates data shortage, especially in domains where little data is available, which in turn limits the learning ability of SR algorithms. Transfer learning (TL), which aims to transfer knowledge across tasks, is a potential solution because it can compensate for the lack of knowledge. However, this approach has not been adequately investigated in SR. To fill this gap, this work proposes a multitree genetic programming-based TL method that transfers knowledge from complete source domains (SDs) to related, incomplete target domains (TDs). The proposed method transforms the features from a complete SD to an incomplete TD. However, a large number of features complicates the transformation process. To mitigate this problem, we integrate a feature selection mechanism to eliminate unnecessary transformations. The method is evaluated on real-world and synthetic SR tasks with missing values to cover different learning scenarios. The results show that the proposed method is not only effective but also more efficient to train than existing TL methods. Compared with state-of-the-art methods, the proposed method reduced regression error by more than 2.58% and 4% on average on heterogeneous and homogeneous domains, respectively.

