Abstract

In practice, data sets used for classification often contain missing values, which complicates the classification task. To address this, this paper proposes a method for imputing incomplete data based on interval values derived from the K nearest neighbors. The method uses the Euclidean distance between each incomplete sample and the complete samples to find the K complete samples closest to the incomplete one. The values these neighbors take on the missing attribute are then used to construct a nearest-neighbor interval that stands in for the missing value. The data set is thereby converted into an interval-valued data set. Using an interval-valued distance, the incomplete data can then be classified with the K-nearest neighbor algorithm. Experimental results show that, under certain circumstances, the improved K-nearest neighbor algorithm based on interval-valued imputation is more efficient than the traditional K-nearest neighbor algorithm with zero-value, median, or mean imputation.
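
The pipeline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions the abstract does not specify: missing values are marked with None, the Euclidean distance to complete samples is computed over the observed attributes only, the interval is taken as the minimum and maximum of the neighbors' values, and the interval-valued distance is the Euclidean distance over lower and upper endpoints. The function names knn_interval_impute, interval_distance, and classify are hypothetical.

```python
import math
from collections import Counter


def knn_interval_impute(incomplete, complete, k):
    """Turn one incomplete row into an interval-valued row.

    Missing entries (None) become (min, max) of the k nearest complete rows'
    values on that attribute; observed entries become degenerate intervals.
    """
    observed = [j for j, v in enumerate(incomplete) if v is not None]

    # Assumption: distance uses only the attributes that are observed.
    def dist(row):
        return math.sqrt(sum((incomplete[j] - row[j]) ** 2 for j in observed))

    neighbors = sorted(complete, key=dist)[:k]
    result = []
    for j, v in enumerate(incomplete):
        if v is None:
            vals = [row[j] for row in neighbors]
            result.append((min(vals), max(vals)))  # assumed interval rule
        else:
            result.append((v, v))
    return result


def interval_distance(a, b):
    """Assumed interval distance: Euclidean over lower and upper endpoints."""
    return math.sqrt(sum((al - bl) ** 2 + (au - bu) ** 2
                         for (al, au), (bl, bu) in zip(a, b)))


def classify(sample, train, labels, k):
    """Majority vote among the k nearest interval-valued training rows."""
    order = sorted(range(len(train)),
                   key=lambda i: interval_distance(sample, train[i]))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy data, invented purely for illustration.
complete = [[1.0, 2.0, 3.0], [1.1, 2.1, 3.5], [5.0, 6.0, 7.0], [5.2, 6.1, 7.4]]
labels = ["a", "a", "b", "b"]

sample = knn_interval_impute([1.05, None, 3.2], complete, k=2)
train = [[(v, v) for v in row] for row in complete]
print(sample)                               # [(1.05, 1.05), (2.0, 2.1), (3.2, 3.2)]
print(classify(sample, train, labels, 3))   # a
```

In this sketch the missing second attribute is replaced by the interval (2.0, 2.1) spanned by the two nearest complete rows, so the uncertainty about the missing value is kept rather than collapsed to a single number as in zero, median, or mean imputation.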
