The value difference metric (VDM) is a widely used distance function designed for nominal attributes. Research has indicated that the definition of VDM follows naturally from a simple probabilistic model, naive Bayes (NB), which assumes that all attributes are conditionally independent given the class. Several techniques have been proposed to further improve the performance of NB; among them, local learning has proved particularly effective. Because VDM is closely related to NB, in this paper we propose a local learning method for VDM. The resulting distance function is called the local value difference metric (LVDM). When LVDM computes the distance between a test instance and each training instance, the conditional probabilities in VDM are estimated by counting only over the neighborhood of the test instance rather than over all the training data. A modified decision tree algorithm is proposed to determine the neighborhood of the test instance. Experimental results on 43 datasets from the University of California at Irvine (UCI) repository show that the proposed LVDM significantly outperforms VDM in terms of the class-probability estimation performance of distance-based learning algorithms.
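To make the difference between VDM and LVDM concrete, the sketch below follows the standard VDM definition, in which the per-attribute distance is the sum over classes c of |P(c|x_a) - P(c|z_a)|^q (q is commonly 1 or 2). This is only a minimal illustration under assumed choices: the exponent q = 2, the absence of smoothing, and the function names `conditional_probs`, `vdm_distance`, and `lvdm_distance` are all assumptions for exposition, not the paper's implementation, and the neighborhood is simply passed in rather than being selected by the paper's modified decision tree.

```python
# Minimal sketch of VDM and the local estimation idea behind LVDM (assumptions:
# nominal attributes as hashable values, q = 2, no smoothing).
from collections import defaultdict


def conditional_probs(X, y, classes):
    """Estimate P(c | attribute a = v) by counting over the given instances."""
    counts = defaultdict(lambda: defaultdict(float))  # (a, v) -> class -> count
    totals = defaultdict(float)                       # (a, v) -> total count
    for xi, yi in zip(X, y):
        for a, v in enumerate(xi):
            counts[(a, v)][yi] += 1.0
            totals[(a, v)] += 1.0
    return {key: {c: cc.get(c, 0.0) / totals[key] for c in classes}
            for key, cc in counts.items()}


def vdm_distance(x, z, probs, classes, q=2):
    """VDM(x, z) = sum over attributes a and classes c of |P(c|x_a) - P(c|z_a)|^q."""
    d = 0.0
    for a, (va, vz) in enumerate(zip(x, z)):
        pa = probs.get((a, va), {})
        pz = probs.get((a, vz), {})
        for c in classes:
            d += abs(pa.get(c, 0.0) - pz.get(c, 0.0)) ** q
    return d


def lvdm_distance(x_test, x_train, neigh_X, neigh_y, classes, q=2):
    """Same formula as VDM, but the conditional probabilities are counted only
    over the neighborhood of the test instance (the paper determines this
    neighborhood with a modified decision tree; here it is simply supplied)."""
    local_probs = conditional_probs(neigh_X, neigh_y, classes)
    return vdm_distance(x_test, x_train, local_probs, classes, q=q)
```

As the sketch suggests, the only difference from plain VDM lies in the data used for counting: VDM pools all training instances when estimating the conditional probabilities, whereas LVDM re-estimates them for each test instance from that instance's neighborhood.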