Abstract

The problem of data classification or data fitting is widely applicable in a multitude of scientific areas, and for this reason, a number of machine learning models have been developed. However, in many cases, these models present problems of overfitting and cannot generalize satisfactorily to unknown data. Furthermore, in many cases, many of the features of the input data do not contribute to learning, or there may even be hidden correlations between the features of the dataset. The purpose of the proposed method is to significantly reduce data classification or regression errors through the usage of a technique that utilizes the particle swarm optimization method and grammatical evolution. This method is divided into two phases. In the first phase, artificial features are constructed using grammatical evolution, and the progress of the creation of these features is controlled by the particle swarm optimization method. In addition, this new technique utilizes penalty factors to limit the generated features to a range of values to make training machine learning models more efficient. In the second phase of the proposed technique, these features are exploited to transform the original dataset, and then any machine learning method can be applied to this dataset. The performance of the proposed method was measured on some benchmark datasets from the relevant literature. Also, the method was tested against a series of widely used machine learning models. The experiments performed showed a significant improvement of 30% on average in the classification datasets and an even greater improvement of 60% in the data fitting datasets.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call