Kernel regression is a nonparametric method of analysis based on smoothing. Smoothing has become synonymous with the nonparametric methods used to estimate functions. Its purpose is to remove variability in the data that carries no information, so that the underlying characteristics of the data appear clearly. Kernel regression has a flexible functional form, and its calculations are easy to adapt. The estimator most commonly used to estimate the regression function in kernel regression is the Nadaraya-Watson estimator. This study aims to show how to estimate data using nonparametric regression with the Gaussian and Epanechnikov kernels and the Nadaraya-Watson estimator, with the bandwidth selected by four methods: the "Rule of Thumb" bandwidth, Unbiased Cross Validation, Biased Cross Validation, and Complete Cross Validation. The MSE values reported for the Epanechnikov and Gaussian kernels were obtained using the optimal bandwidth for each kernel. The MSE produced by the Epanechnikov kernel is very close to that produced by the Gaussian kernel, so the two kernel functions can be said to yield almost the same MSE. The plots of the estimates for the Epanechnikov and Gaussian kernels at their optimal bandwidths are also very close, so using a different kernel function, each with its own optimal bandwidth, produces nearly the same estimated regression curve. These results support the view expressed by Hastie and Tibshirani that, in kernel regression, the selection of the smoothing parameter (bandwidth) is much more important than the choice of kernel function.
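As a rough illustration of the comparison described above, the following is a minimal sketch of the Nadaraya-Watson estimator with Gaussian and Epanechnikov kernels, written in Python with NumPy. It is not the study's code: the synthetic sine data, the function names, and the bandwidths `h_gauss` and `h_epan` are assumptions chosen for illustration, and the bandwidths are fixed by hand rather than produced by any of the four selection methods named above.

```python
import numpy as np


def gaussian_kernel(u):
    """Gaussian kernel: the standard normal density."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


def epanechnikov_kernel(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)


def nadaraya_watson(x_eval, x, y, h, kernel):
    """Nadaraya-Watson estimate at x_eval:
    sum_i K((x0 - x_i) / h) * y_i / sum_i K((x0 - x_i) / h).
    Returns NaN where no observation receives positive weight."""
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Scaled distances between every evaluation point and every observation.
    u = (x_eval[:, None] - x[None, :]) / h
    w = kernel(u)
    denom = w.sum(axis=1)
    numer = (w * y[None, :]).sum(axis=1)
    out = np.full(x_eval.shape, np.nan)
    np.divide(numer, denom, out=out, where=denom > 0)
    return out


# Synthetic data (assumption): noisy sine curve, not the data used in the study.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 2.0 * np.pi, 200))
y = np.sin(x) + rng.normal(0.0, 0.3, 200)

# Hand-picked illustrative bandwidths (assumption), one per kernel.
h_gauss = 0.3
h_epan = 0.66

fit_gauss = nadaraya_watson(x, x, y, h_gauss, gaussian_kernel)
fit_epan = nadaraya_watson(x, x, y, h_epan, epanechnikov_kernel)

# In-sample MSE of each fit against the observed responses.
mse_gauss = np.mean((y - fit_gauss) ** 2)
mse_epan = np.mean((y - fit_epan) ** 2)
print(f"Gaussian MSE:     {mse_gauss:.4f}")
print(f"Epanechnikov MSE: {mse_epan:.4f}")
```

The Epanechnikov kernel has compact support, so an evaluation point farther than `h` from every observation receives no weight at all; the sketch returns NaN in that case instead of dividing by zero.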