Abstract

ISOMAP is a promising technique for dimensionality reduction and data visualization. It is commonly used as a preprocessing step to avoid the “curse of dimensionality” and, in light of the No Free Lunch (NFL) theorem, to help select more suitable algorithms or improve the performance of the algorithms used in the data mining process. ISOMAP has only one parameter, the neighborhood size, and its success depends heavily on that choice. However, how to select a suitable neighborhood size efficiently remains an open problem. An unsuitable neighborhood size introduces shortcut edges into the neighborhood graph. Based on this distinctive feature, this paper presents an efficient method that selects a suitable neighborhood size according to the decrement of the sum of all shortest path distances. In contrast with the straightforward method based on residual variance, our method only requires running the first part of ISOMAP (the shortest path computation) incrementally, which makes it less time-consuming while yielding the same results. Experimental results verify the feasibility and robustness of the method.
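The idea of watching the sum of all shortest path distances as the neighborhood size grows can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the exact selection rule, so the rule used here (take the neighborhood size just before the largest single drop in the sum) is an assumption. The function names, the `c_shaped_arc` toy data, and the use of a symmetric k-nearest-neighbor graph are likewise hypothetical choices made for illustration. For simplicity the sketch recomputes all shortest paths from scratch for each candidate size, whereas the abstract describes an incremental computation.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix as a list of lists."""
    return [[math.dist(p, q) for q in points] for p in points]


def shortest_path_sum(dist, k):
    """Sum of all-pairs shortest path distances in the symmetric k-NN graph.

    Returns math.inf if the graph is disconnected for this k.
    """
    n = len(dist)
    graph = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        graph[i][i] = 0.0
        others = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        for j in others[:k]:
            # Keep an edge if either endpoint lists the other as a neighbor.
            graph[i][j] = graph[j][i] = dist[i][j]
    # Floyd-Warshall all-pairs shortest paths (recomputed from scratch).
    for m in range(n):
        row_m = graph[m]
        for i in range(n):
            via = graph[i][m]
            if via == math.inf:
                continue
            row_i = graph[i]
            for j in range(n):
                if via + row_m[j] < row_i[j]:
                    row_i[j] = via + row_m[j]
    return sum(sum(row) for row in graph)


def select_neighborhood_size(points, candidate_ks):
    """Assumed rule: return the k just before the largest drop in the sum."""
    dist = pairwise_distances(points)
    ks = list(candidate_ks)
    sums = [shortest_path_sum(dist, k) for k in ks]
    best_k, best_drop = None, 0.0
    for k_prev, s_prev, s_next in zip(ks, sums, sums[1:]):
        if math.isinf(s_prev):
            continue  # graph not yet connected; no meaningful decrement
        drop = s_prev - s_next
        if drop > best_drop:
            best_k, best_drop = k_prev, drop
    return best_k, dict(zip(ks, sums))


def c_shaped_arc(n=20, coverage=0.9):
    """Hypothetical toy data: n evenly spaced points on an open circular arc."""
    step = 2 * math.pi * coverage / (n - 1)
    return [(math.cos(i * step), math.sin(i * step)) for i in range(n)]


best_k, sums = select_neighborhood_size(c_shaped_arc(), range(2, 8))
print(best_k, sums)
```

On this toy arc, the two end points become neighbors once the neighborhood size reaches 3, which adds a shortcut edge across the gap and sharply reduces the sum, so the assumed rule settles on a size of 2. Real data will rarely show such a clean jump, and a threshold on the decrement may be more appropriate than taking the single largest drop.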
