We developed a new theory of discriminant analysis (Theory1) that physicians can use for practical medical diagnosis. Only the Revised IP Optimal-LDF (RIP) obtains the minimum number of misclassifications (MNM). In theory, RIP can discriminate linearly separable data (LSD). It discriminated 169 two-class microarrays and found that all 169 MNMs are zero, so every array is LSD. It can also split a high-dimensional array into many small LSD subsets, each with fewer than n genes (n being the number of patients); these subsets are candidates for multivariate oncogenes. We thereby completed a new theory of high-dimensional gene data analysis (Theory2). A 100-fold cross-validation (Method1) can rank all candidates by their importance for diagnosis. Thus, physicians who first use Theory2 as a screening method can start their medical studies from a small, correctly chosen set of candidates. This paper analyzes four arrays in detail and proposes four principles for correctly choosing cancer and normal patients.
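To make the link between MNM and LSD concrete, the following minimal sketch counts the misclassifications of a linear discriminant function and checks a toy two-class data set for linear separability. It is not RIP: it substitutes a plain perceptron, which finds a separating hyperplane when the data are LSD but does not minimize the number of misclassifications otherwise. The function names, the toy data, and the convention that a case lying on the hyperplane counts as misclassified are all assumptions made for illustration.

```python
def score(w, b, x):
    """Value of the linear discriminant function w . x + b for one case."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def misclassified(w, b, X, y):
    """Number of cases with y_i * (w . x_i + b) <= 0.

    Assumption: a case exactly on the hyperplane counts as misclassified.
    """
    return sum(1 for xi, yi in zip(X, y) if yi * score(w, b, xi) <= 0)


def perceptron(X, y, max_epochs=1000):
    """Plain perceptron (a stand-in for illustration, not RIP).

    Stops early once a full pass makes no update, which happens only
    when every case is on the correct side of the hyperplane.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(max_epochs):
        updated = False
        for xi, yi in zip(X, y):
            if yi * score(w, b, xi) <= 0:
                w = [wj + yi * xj for wj, xj in zip(w, xi)]
                b += yi
                updated = True
        if not updated:
            break
    return w, b


# Hypothetical toy data: two "genes" per patient, classes coded +1 / -1.
X_lsd = [(2.0, 1.0), (3.0, 2.0), (-1.0, -1.0), (-2.0, 0.5)]
y_lsd = [1, 1, -1, -1]
w, b = perceptron(X_lsd, y_lsd)
errors_lsd = misclassified(w, b, X_lsd, y_lsd)

# XOR-style data is not linearly separable, so no hyperplane reaches zero errors.
X_xor = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
y_xor = [1, 1, -1, -1]
w2, b2 = perceptron(X_xor, y_xor)
errors_xor = misclassified(w2, b2, X_xor, y_xor)
```

Under these assumptions, a zero error count on the first data set shows that it is LSD (its MNM is zero), while the second data set always leaves at least one case misclassified. For data that are not LSD, the perceptron's error count is only an upper bound on the MNM, which is why an optimal method such as RIP is needed there.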