Abstract
High-dimensional data usually require dimension reduction before classification. Linear discriminant analysis (LDA) is one of the most widely used dimension reduction methods. LDA seeks a projection that maximizes the ratio of the between-class scatter to the total data scatter in the projected space, so it requires a label for every training sample. In real applications, however, labeled data are scarce while unlabeled data are abundant, which makes LDA hard to apply. In this paper, we propose a novel method named semi-supervised linear discriminant analysis (SLDA), which trains on a limited number of labeled data together with a large quantity of unlabeled data, so that LDA can be adapted to the case where only a few labeled data are available. Let F denote the calculated class indicator matrix of the training data and Y the true labels of the labeled data. The objective function contains two parts: the first is the LDA criterion, which is a function of the projection W and the class indicator matrix F; the second is the difference between the true labels and the calculated labels of the labeled data. As far as we know, the objective function has no closed-form solution. To solve this problem, we develop an iterative algorithm that calculates the class indicator matrix and the projection alternately. The convergence of the proposed iterative algorithm is proved and confirmed by experiments. Experimental results on eight datasets show that SLDA outperforms traditional LDA and some state-of-the-art semi-supervised algorithms.
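To make the LDA criterion mentioned above concrete, the following is a minimal sketch of fully supervised LDA only: it maximizes the ratio of between-class scatter to total scatter by solving a generalized eigenproblem. It is not the SLDA algorithm, whose objective and update rules are not given in this abstract. The function name `lda_projection`, the ridge term `reg`, and the Cholesky-based solver are assumptions chosen for illustration.

```python
import numpy as np


def lda_projection(X, y, n_components, reg=1e-6):
    """Illustrative supervised LDA (hypothetical helper, not the paper's SLDA).

    X: (n_samples, n_features) data matrix; y: (n_samples,) class labels.
    Returns W with shape (n_features, n_components) and the matching
    between-class / total scatter ratios, largest first.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    d = X.shape[1]
    mean = X.mean(axis=0)

    # Total scatter: S_t = sum_i (x_i - mean)(x_i - mean)^T
    Xc = X - mean
    St = Xc.T @ Xc

    # Between-class scatter: S_b = sum_k n_k (mean_k - mean)(mean_k - mean)^T
    Sb = np.zeros((d, d))
    for c in np.unique(y):
        Xk = X[y == c]
        diff = (Xk.mean(axis=0) - mean)[:, None]
        Sb += Xk.shape[0] * (diff @ diff.T)

    # Assumption: a small ridge keeps S_t positive definite.
    L = np.linalg.cholesky(St + reg * np.eye(d))

    # Whiten with L so that S_b w = lambda S_t w becomes a symmetric problem
    # M v = lambda v, where M = L^{-1} S_b L^{-T} and w = L^{-T} v.
    A = np.linalg.solve(L, Sb)
    M = np.linalg.solve(L, A.T)
    M = (M + M.T) / 2.0  # remove round-off asymmetry

    evals, V = np.linalg.eigh(M)  # eigenvalues in ascending order
    top = np.argsort(evals)[::-1][:n_components]
    W = np.linalg.solve(L.T, V[:, top])
    return W, evals[top]
```

Because the total scatter is the sum of the within-class and between-class scatter, each returned ratio lies between 0 and 1. A semi-supervised variant along the lines described in the abstract would additionally treat the class indicator matrix as unknown for unlabeled samples and alternate between updating it and the projection.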