Abstract

A high-quality distance function that measures the difference between instances is essential in many real-world applications and research fields. In instance-based learning, for example, the distance function plays the most important role. A large number of distance functions have been proposed. For nominal attributes, the Value Difference Metric (VDM) is one of the state-of-the-art and most widely used distance functions. However, VDM must estimate conditional probabilities, which reduces its efficiency when computing the distance between instances. Moreover, a practical issue arises in estimating these conditional probabilities: the denominators can be zero or very small, which makes the estimates either undefined or very large. An efficient distance function that measures the difference between two instances without this practical issue is therefore desirable. In this paper, we propose a novel distance function, the Frequency Difference Metric (FDM). FDM is based only on the joint frequencies of class labels and attribute values rather than on conditional probabilities. Extensive empirical studies show that FDM performs almost as well as VDM in terms of accuracy but significantly outperforms VDM in terms of efficiency. This work provides a very simple, efficient, and effective distance function that can be widely used in many real-world applications and research fields.
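The contrast between the two metrics can be illustrated with a minimal sketch. The abstract does not state either formula, so the following rests on assumptions: VDM is written in its commonly cited form (a sum over attributes and classes of absolute differences between estimated conditional probabilities), and FDM is assumed to replace those probabilities with raw joint counts of attribute value and class label. The function names `fit_counts`, `vdm`, and `fdm`, the exponent `q`, and the zero-denominator guard are illustrative choices, not taken from the paper.

```python
from collections import Counter


def fit_counts(X, y):
    """Count joint (attribute, value, class) and marginal (attribute, value) frequencies."""
    joint = Counter()
    marginal = Counter()
    for row, label in zip(X, y):
        for i, value in enumerate(row):
            joint[(i, value, label)] += 1
            marginal[(i, value)] += 1
    return joint, marginal, sorted(set(y))


def vdm(a, b, joint, marginal, classes, q=1):
    """Commonly cited VDM form: compares estimated P(class | attribute value)."""
    total = 0.0
    for i, (va, vb) in enumerate(zip(a, b)):
        na, nb = marginal[(i, va)], marginal[(i, vb)]
        for c in classes:
            # The division is the step the abstract flags: the denominator
            # can be zero or very small. Falling back to 0.0 is an assumption.
            pa = joint[(i, va, c)] / na if na else 0.0
            pb = joint[(i, vb, c)] / nb if nb else 0.0
            total += abs(pa - pb) ** q
    return total


def fdm(a, b, joint, classes):
    """Assumed FDM form: compares raw joint counts, so no division is needed."""
    return sum(
        abs(joint[(i, va, c)] - joint[(i, vb, c)])
        for i, (va, vb) in enumerate(zip(a, b))
        for c in classes
    )


# Hypothetical toy data: two nominal attributes and a binary class label.
X = [("red", "small"), ("red", "large"), ("blue", "small"), ("blue", "large")]
y = ["pos", "pos", "neg", "pos"]
joint, marginal, classes = fit_counts(X, y)

print(vdm(("red", "small"), ("blue", "small"), joint, marginal, classes))
print(fdm(("red", "small"), ("blue", "small"), joint, classes))
```

Under these assumptions, `fdm` needs only counter lookups and subtractions, while `vdm` also needs the marginal counts and a guarded division for every attribute and class, which is where the undefined or very large estimates described above would arise.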
