Abstract

Distance metrics are widely used in similarity estimation. In this paper we find that the most popular ones, the Euclidean and Manhattan distances, may not be suitable for all data distributions. We propose a general guideline for establishing the relation between a distribution model and its corresponding similarity estimation. Based on Maximum Likelihood theory, we propose new distance metrics, such as the harmonic distance and the geometric distance. Because the feature elements may come from heterogeneous sources and usually differ in their influence on similarity estimation, it is inappropriate to model the distribution as isotropic. We propose a novel boosted distance metric that not only finds the distance metric that best fits the distribution of the underlying elements but also selects the feature elements most important to similarity. The boosted distance metric is tested on fifteen benchmark data sets from the UCI repository and in two image retrieval applications. The proposed methods yield robust results in all experiments.
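
The link between a distribution model and a distance metric can be illustrated with the two standard cases. This is a minimal sketch, not the method of the paper: it assumes the element-wise differences between two feature vectors are independent, zero-mean noise, and it shows that the negative log-likelihood reduces to the squared Euclidean distance under a Gaussian model and to the Manhattan distance under a Laplacian (double exponential) model, up to constants. All function names and the parameters `sigma` and `b` are illustrative. The harmonic and geometric distances are not reproduced here because the abstract does not define them.

```python
import math


def squared_euclidean(x, y):
    """Sum of squared element-wise differences."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def manhattan(x, y):
    """Sum of absolute element-wise differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def gaussian_nll(x, y, sigma=1.0):
    """Negative log-likelihood of x - y under i.i.d. N(0, sigma^2) noise."""
    n = len(x)
    const = n * math.log(sigma * math.sqrt(2.0 * math.pi))
    return const + squared_euclidean(x, y) / (2.0 * sigma ** 2)


def laplace_nll(x, y, b=1.0):
    """Negative log-likelihood of x - y under i.i.d. Laplace(0, b) noise."""
    n = len(x)
    const = n * math.log(2.0 * b)
    return const + manhattan(x, y) / b


# Illustrative data: one query vector and two candidates.
query = [1.0, 2.0, 3.0]
candidates = [[1.5, 2.0, 2.0], [0.0, 4.0, 3.0]]

# Ranking by likelihood and ranking by distance pick the same candidate,
# because the two quantities differ only by a constant and a positive scale.
best_by_nll = min(candidates, key=lambda c: gaussian_nll(query, c))
best_by_l2 = min(candidates, key=lambda c: squared_euclidean(query, c))
```

Because the additive constant and the scale do not depend on the candidate, maximizing the likelihood and minimizing the corresponding distance give the same ranking. Under this reading, choosing a distance metric amounts to choosing a noise model, which is why a mismatch between the assumed and the actual distribution can degrade similarity estimation.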
