Abstract

High-dimensional data are pervasive in this bigdata era. To avoid the curse of the dimensionality problem, various dimensionality reduction (DR) algorithms have been proposed. To facilitate systematic DR quality comparison and assessment, this paper reviews related metrics and develops an open-source Python package pyDRMetrics. Supported metrics include reconstruction error, distance matrix, residual variance, ranking matrix, co-ranking matrix, trustworthiness, continuity, co-k-nearest neighbor size, LCMC (local continuity meta criterion), and rank-based local/global properties. pyDRMetrics provides a native Python class and a web-oriented API. A case study of mass spectra is conducted to demonstrate the package functions. A web GUI wrapper is also published to support user-friendly B/S applications.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call