Abstract

The theory of uncorrelated linear discriminant analysis (ULDA), together with an algorithm, is introduced and applied to metabolomics data. ULDA is a supervised method for feature extraction (FE), discriminant analysis (DA) and biomarker screening based on the Fisher criterion function. Whereas principal component analysis (PCA) searches for directions of maximum variance in the data, ULDA seeks linear combinations of the variables, called uncorrelated discriminant vectors (UDVs), that maximize the separation among classes in terms of the Fisher criterion. The performance of ULDA is evaluated and compared with that of PCA, partial least squares discriminant analysis (PLS-DA) and target projection discriminant analysis (TP-DA) on two datasets, one simulated and one real, the latter from a metabolomic study. ULDA showed better discriminatory ability than PCA, PLS-DA and TP-DA. The shortcomings of PCA, PLS-DA and TP-DA are attributed to interference from linear correlations in the data. PLS-DA and TP-DA performed successfully on the simulated data, but PLS-DA was slightly inferior to ULDA on the real data. ULDA successfully extracted optimal features for discriminant analysis and revealed potential biomarkers. Furthermore, in cross-validation, the classification model obtained by ULDA showed better predictive ability than those obtained by PCA, PLS-DA and TP-DA. In conclusion, ULDA is a powerful tool for revealing discriminatory information in metabolomics data.
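
As a rough illustration of the idea, and not the authors' algorithm, the Fisher criterion is commonly written as J(w) = (w^T S_b w) / (w^T S_w w), where S_b and S_w are the between-class and within-class scatter matrices. The Python/NumPy sketch below finds directions that maximize this ratio by solving the generalized eigenproblem S_b w = lambda S_w w through a Cholesky whitening of S_w. Directions obtained this way give projected features that are mutually uncorrelated over the data they were fitted on, which is the property the UDVs are named for. The function name `fisher_uncorrelated_vectors`, the `ridge` term (added so that S_w is invertible, as it would not be when variables outnumber samples) and the synthetic three-class data are all assumptions made for this example.

```python
import numpy as np


def fisher_uncorrelated_vectors(X, y, ridge=1e-8):
    """Return directions maximizing the Fisher ratio, plus their ratios.

    Hypothetical helper for illustration only; `ridge` is an assumed
    regularizer that keeps the within-class scatter invertible.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    n_features = X.shape[1]
    grand_mean = X.mean(axis=0)

    Sw = np.zeros((n_features, n_features))  # within-class scatter
    Sb = np.zeros((n_features, n_features))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        class_mean = Xc.mean(axis=0)
        centered = Xc - class_mean
        Sw += centered.T @ centered
        d = (class_mean - grand_mean)[:, None]
        Sb += Xc.shape[0] * (d @ d.T)

    Sw += ridge * np.eye(n_features)

    # Whiten with Sw = L L^T, then solve an ordinary symmetric eigenproblem.
    L = np.linalg.cholesky(Sw)
    L_inv = np.linalg.inv(L)
    M = L_inv @ Sb @ L_inv.T
    M = (M + M.T) / 2.0  # remove rounding asymmetry
    eigvals, V = np.linalg.eigh(M)

    # Keep the (number of classes - 1) directions with the largest ratios.
    order = np.argsort(eigvals)[::-1][: len(classes) - 1]
    W = L_inv.T @ V[:, order]  # map back to the original variable space
    return W, eigvals[order]


# Synthetic data: three classes, five variables (made up for this sketch).
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 30)
X = rng.normal(size=(90, 5))
X[y == 1, 0] += 3.0
X[y == 2, 1] += 3.0

W, ratios = fisher_uncorrelated_vectors(X, y)
scores = X @ W  # extracted features, one column per direction
```

Each column of `scores` is one extracted feature. With three classes at most two such directions carry discriminatory information, which is why only two are kept here.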
