Abstract

Dimensionality reduction is a fundamental task in data mining and machine learning. In many scenarios, examples in a high-dimensional space lie on low-dimensional manifolds, so learning the low-dimensional embedding is important. Well-known methods such as LPP and LE adopt a locality-preserving strategy: they construct an adjacency graph and use the graph Laplacian to project raw examples into a subspace, yielding the low-dimensional representation. In this paper, we propose a novel neighbors-based distance that measures the distance between two examples through their neighbors. More specifically, we create a virtual bridge point from the neighbors of each example and use it to link that example with the others: instead of computing the direct Euclidean distance between two examples, we derive their distance from their bridge points. The introduced metric shows high discriminative ability for examples on the boundary, which are notoriously hard to handle. Extensive classification and clustering experiments demonstrate that the proposed graph construction method achieves a substantial improvement despite its simple form.
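The bridge-point idea described above admits a compact illustration. Below is a minimal Python sketch, assuming the bridge point of each example is the centroid of its k nearest neighbors and that the distance between two examples is the Euclidean distance between their bridge points; the function names, the parameter k, and the centroid choice are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np
from scipy.spatial.distance import cdist

def bridge_points(X, k=5):
    """For each example, form a 'bridge point' from its k nearest
    neighbors. Averaging the neighbors is an illustrative choice;
    the paper's exact neighborhood-vector construction may differ."""
    D = cdist(X, X)                      # pairwise Euclidean distances
    np.fill_diagonal(D, np.inf)          # exclude self from the neighbor search
    idx = np.argsort(D, axis=1)[:, :k]   # indices of the k nearest neighbors
    return X[idx].mean(axis=1)           # (n, d) array of bridge points

def neighbors_based_distance(X, k=5):
    """Distance between two examples measured through their bridge
    points rather than directly between the raw examples."""
    B = bridge_points(X, k)
    return cdist(B, B)                   # (n, n) neighbors-based distance matrix
```

In this sketch, the resulting distance matrix would replace the plain Euclidean distances when building the adjacency graph for a method such as LE or LPP.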

Highlights

  • The ability to learn low-dimensional representations embedded in high-dimensional space plays an important role in data mining and machine learning

  • RELATED WORKS: We discuss graph construction methods and several classical manifold learning methods, including Laplacian eigenmaps, locality preserving projections, and Isomap, which we equip with our defined graph for comparison

  • ADJACENT GRAPH CONSTRUCTION: A discriminative and informative graph is important for many applications; in such a graph, the weight of an edge reflects the similarity between examples (see the sketch below)
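
To make the locality-preserving construction concrete, here is a minimal sketch of the standard k-nearest-neighbor adjacency graph with heat-kernel edge weights, as used by methods such as LE and LPP; the function name and the parameters k and t are illustrative assumptions rather than the paper's specific settings.

```python
import numpy as np
from scipy.spatial.distance import cdist

def knn_heat_kernel_graph(X, k=5, t=1.0):
    """Standard locality-preserving construction: connect each example
    to its k nearest neighbors and weight each edge with the heat
    kernel exp(-||x_i - x_j||^2 / t), so weights reflect similarity."""
    D = cdist(X, X)
    np.fill_diagonal(D, np.inf)          # no self-loops
    idx = np.argsort(D, axis=1)[:, :k]   # k nearest neighbors per example
    n = X.shape[0]
    W = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    W[rows, cols] = np.exp(-D[rows, cols] ** 2 / t)
    return np.maximum(W, W.T)            # symmetrize the adjacency matrix
```

Under the paper's approach, the Euclidean distances in this construction would be replaced by the neighbors-based distances sketched earlier.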


Summary

INTRODUCTION

The ability to learn low-dimensional representations embedded in high-dimensional space plays an important role in data mining and machine learning. One encouraging observation when learning the embedded low-dimensional manifold is that, under our new metric, hard examples are pushed away from their locations in high-dimensional space. The contributions of this paper can be summarized as follows: a) we propose a new distance metric that incorporates neighbor information; this new metric handles hard examples better than the Euclidean distance; b) we provide a theoretical analysis, in Theorem 1, of why the new method works; c) we present classification and clustering results on four public datasets, showing that our method outperforms the baseline methods.

RELATED WORKS
MANIFOLD LEARNING METHODS FOR DIMENSIONALITY REDUCTION
MOTIVATION AND DEFINITION
ALGORITHM
ANALYSIS OF NEIGHBORS-BASED DISTANCE
MODIFICATION OF THE NEIGHBORHOOD VECTOR
EXPERIMENTS
CONCLUSION