Linear discriminant analysis (LDA) is a powerful supervised dimensionality reduction method for analysing high-dimensional data. However, LDA cannot use locality information in data, so its performance degrades dramatically on multimodal data. A number of LDA variants have been proposed to exploit locality information, including subclass-based LDAs. We identify a problem with these variants: subclasses are selected on a within-class basis, without considering other classes. This causes the loss of important information at class boundaries. In this paper, we present a novel variant of subclass-based LDA, Global Subclass Discriminant Analysis (GSDA). Unlike other subclass-based LDAs, GSDA selects subclasses from global clusters that may cross class boundaries, thus utilising both within-class and between-class information. More specifically, GSDA applies an effective clustering algorithm to the whole dataset to construct global clusters. It then applies a local structure refining strategy to these global clusters to construct subclasses. Finally, GSDA learns a representative data subspace by simultaneously maximising inter-subclass distance and minimising intra-subclass distance. GSDA is extensively evaluated on a wide range of public datasets against state-of-the-art LDA algorithms. Experimental results demonstrate its superiority in terms of accuracy and run time.
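The three stages described above (global clustering, subclass construction, subspace learning) can be illustrated with a minimal sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: plain k-means stands in for the unspecified clustering algorithm, the refining step is approximated by splitting each global cluster by class label, and the subspace is found with a standard Fisher-style scatter criterion computed on subclass labels. The function names `kmeans`, `global_subclasses` and `subclass_projection` are hypothetical.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (assumed stand-in for the paper's clustering step)."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centre.
        dist = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centres[j] = X[assign == j].mean(axis=0)
    return assign


def global_subclasses(X, y, k, seed=0):
    """Cluster the whole dataset, then split each global cluster by class.

    Assumption: every (cluster, class) pair that occurs becomes one subclass.
    `y` must hold non-negative integer class labels.
    """
    clusters = kmeans(X, k, seed=seed)
    pair_id = clusters * (int(y.max()) + 1) + y  # unique id per (cluster, class)
    _, sub = np.unique(pair_id, return_inverse=True)
    return sub


def subclass_projection(X, sub, n_components, reg=1e-6):
    """Fisher-style projection using subclass labels instead of class labels."""
    d = X.shape[1]
    mean = X.mean(axis=0)
    Sw = np.zeros((d, d))  # intra-subclass scatter
    Sb = np.zeros((d, d))  # inter-subclass scatter
    for s in np.unique(sub):
        Xs = X[sub == s]
        mu = Xs.mean(axis=0)
        Sw += (Xs - mu).T @ (Xs - mu)
        diff = (mu - mean)[:, None]
        Sb += len(Xs) * (diff @ diff.T)
    Sw += reg * np.eye(d)  # small ridge term keeps Sw invertible
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(-evals.real)[:n_components]
    return evecs[:, order].real
```

Given a data matrix `X` and integer labels `y`, calling `global_subclasses(X, y, k)` followed by `subclass_projection(X, sub, n_components)` returns a projection matrix `W`, and `X @ W` gives the reduced representation. The number of global clusters `k` is a free parameter in this sketch.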