Abstract
Distance learning is an important notion and has played a critical role in the success of various machine learning algorithms. Any learning algorithm that requires dissimilarity or similarity measures has to assume some form of distance function, either explicitly or implicitly. Hence, in recent years a considerable amount of research has been devoted to distance learning. Despite great achievements in this field, a number of important issues need to be explored further for real-world datasets that mainly contain categorical attributes. Based on these considerations, this research presents a Context-Based Distance Learning approach (CBDL) to advance the state of the art in distance metric learning for categorical datasets. CBDL is based on the idea that the distance between two values of a given categorical attribute can be estimated using information that inherently exists within a subset of attributes called the context. CBDL is composed of two main components: a context extraction component and a distance learning component. The context extraction component is responsible for extracting the relevant subset of the feature set for a given attribute, while the distance learning component learns the distance between each pair of values based on the extracted context. For a comprehensive analysis, we conduct a wide range of experiments in both supervised and unsupervised settings in the presence of noise. Our experimental results reveal that CBDL is the distance learning approach of choice, offering comparable or better performance than state-of-the-art distance learning schemes according to the studied evaluation measures. © 2011 Wiley Periodicals, Inc.
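The general idea of a context-based distance can be illustrated with a minimal sketch. It is not the CBDL algorithm itself: the abstract does not give the formulas, so the sketch assumes that the context is already known (the context extraction step is skipped) and that two values are compared through the Euclidean distance between the conditional distributions of the context attributes. The function names `conditional_distribution` and `context_distance`, and the toy data, are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def conditional_distribution(rows, target, context_attr):
    """Estimate P(context_attr = c | target = v) from a list of dict rows."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[target]][row[context_attr]] += 1
    return {
        v: {c: n / sum(ctr.values()) for c, n in ctr.items()}
        for v, ctr in counts.items()
    }


def context_distance(rows, target, context, x, y):
    """Distance between values x and y of `target`, given a context.

    Assumption (not taken from the paper): the distance is the Euclidean
    distance between the conditional distributions of every context
    attribute, given x and given y.
    """
    total = 0.0
    for attr in context:
        dist = conditional_distribution(rows, target, attr)
        px, py = dist.get(x, {}), dist.get(y, {})
        for c in set(px) | set(py):
            total += (px.get(c, 0.0) - py.get(c, 0.0)) ** 2
    return sqrt(total)


# Toy data: "red" and "crimson" co-occur with the same shapes, "blue" does not.
rows = [
    {"color": "red", "shape": "round"},
    {"color": "red", "shape": "round"},
    {"color": "crimson", "shape": "round"},
    {"color": "blue", "shape": "square"},
]
print(context_distance(rows, "color", ["shape"], "red", "crimson"))
print(context_distance(rows, "color", ["shape"], "red", "blue"))
```

In this toy setting, values that behave alike with respect to the context attribute end up close together, while values with different co-occurrence patterns end up far apart, which is the intuition the abstract describes.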