Abstract
Revised IP-OLDF (optimal linear discriminant function by integer programming) is a linear discriminant function that minimizes the number of misclassifications (NM) of training samples by integer programming (IP). However, IP requires a large computation (CPU) time. In this paper, a method to reduce CPU time by using linear programming (LP) is proposed. In the first phase, Revised LP-OLDF is applied to all cases, and the cases are categorized into two groups: those that are classified correctly by support vectors (SVs) and those that are not. In the second phase, Revised IP-OLDF is applied to the cases misclassified by SVs. This method is called Revised IPLP-OLDF. In this research, it is evaluated whether the NM of Revised IPLP-OLDF is a good estimate of the minimum number of misclassifications (MNM) obtained by Revised IP-OLDF. Four kinds of real data—Iris data, Swiss bank note data, student data, and CPD data—are used as training samples. Four kinds of 20,000 re-sampled cases generated from these data are used as evaluation samples. In total, these data yield 149 models over all combinations of independent variables. The NMs and CPU times of the 149 models are compared between Revised IPLP-OLDF and Revised IP-OLDF. The following results are obtained: 1) Revised IPLP-OLDF significantly reduces CPU time. 2) For the training samples, all 149 NMs of Revised IPLP-OLDF are equal to the MNMs of Revised IP-OLDF. 3) For the evaluation samples, most NMs of Revised IPLP-OLDF are equal to the NMs of Revised IP-OLDF. 4) The generalization abilities of both discriminant functions are concluded to be high, because the differences between the error rates of the training and evaluation samples are almost within 2%. Therefore, Revised IPLP-OLDF is recommended for the analysis of big data instead of Revised IP-OLDF. Next, Revised IPLP-OLDF is compared with LDF and logistic regression by 100-fold cross-validation using 100 re-sampled samples.
The mean error rates of Revised IPLP-OLDF are remarkably lower than those of LDF and logistic regression.
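The two-phase decomposition described in the abstract can be sketched in a few lines. This is a minimal pure-Python illustration under stated assumptions, not the authors' implementation: the coefficient vector `b` from Revised LP-OLDF is assumed to be already computed, labels are assumed to be ±1, and the IP step of phase 2 is only indicated by the returned subset, since solving Revised IP-OLDF requires an IP solver.

```python
def discriminant(b, x):
    """f(x) = t(b) * x + 1, the discriminant form used in the paper."""
    return sum(bj * xj for bj, xj in zip(b, x)) + 1.0

def split_by_svs(b, cases, labels):
    """Phase 1: categorize cases by the support vectors f(x) = +/-1.

    Cases with y * f(x) >= 1 lie on the correct side of (or on) the SVs
    and are fixed; the remaining cases form the smaller subset on which
    Revised IP-OLDF is solved in phase 2.
    """
    fixed, to_phase2 = [], []
    for x, y in zip(cases, labels):
        if y * discriminant(b, x) >= 1.0:
            fixed.append((x, y))
        else:
            to_phase2.append((x, y))
    return fixed, to_phase2

# Toy one-feature data and a hypothetical LP-OLDF coefficient vector.
cases = [[2.0], [3.0], [-2.0], [-0.5]]
labels = [1, 1, -1, -1]
b = [1.0]  # f(x) = x + 1
fixed, hard = split_by_svs(b, cases, labels)
```

Because phase 2 runs IP only on `hard`, which is typically much smaller than the full training sample, the overall CPU time drops while the phase-1 classifications are retained.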
Highlights
In this paper, four linear discriminant functions by mathematical programming (MP) are introduced
Revised IPLP-optimal linear discriminant function (OLDF) is defined in two phases as follows: in the first phase, Revised LP-OLDF is applied to all cases, and the cases are categorized into two groups: those that are classified correctly by SVs and those that are not
All NMs obtained by Revised IPLP-OLDF are the same as the minimum number of misclassifications (MNM) of Revised integer programming (IP)-OLDF
Summary
Four linear discriminant functions by mathematical programming (MP) are introduced, which clarifies the relation between discriminant functions and NMs. If the training data consist of n cases and p features, the n linear equations Hi(b) = t(xi) ∗ b + 1 = 0 divide the p-dimensional coefficient space into a finite number of convex polyhedra. A case xi in the data space corresponds to the linear equation Hi(b) = 0 in the discriminant coefficient space, and a point bj in the coefficient space corresponds to the discriminant function fj(x) = t(bj) ∗ x + 1. If an LDF corresponds to an interior point bj, it is theoretically free from the unresolved problem; this is confirmed by checking that the number of cases with |f(x)| ≤ 10^−6 is zero. Revised IP-OLDF resolves the problems of discriminant theory, but it requires more CPU time because it is solved by IP. Revised IPLP-OLDF is compared with Fisher's LDF and logistic regression by 100-fold cross validation using 100 re-sampling samples [18, 19]
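The two checks described above can be expressed directly. The following is a minimal pure-Python sketch, assuming labels yi ∈ {+1, −1} and the paper's discriminant form f(x) = t(b) ∗ x + 1; the sign convention y ∗ f(x) ≤ 0 for a misclassification is an assumption chosen so that a case lying exactly on the hyperplane, the unresolved problem, is counted as misclassified.

```python
def discriminant(b, x):
    """f(x) = t(b) * x + 1, as defined in the summary."""
    return sum(bj * xj for bj, xj in zip(b, x)) + 1.0

def nm_and_on_boundary(b, cases, labels, eps=1e-6):
    """Return (NM, number of cases with |f(x)| <= eps).

    NM counts misclassified cases (y * f(x) <= 0).  The second count
    must be zero for the function to be free from the unresolved
    problem, i.e. no case may lie on the discriminant hyperplane.
    """
    nm = sum(1 for x, y in zip(cases, labels)
             if y * discriminant(b, x) <= 0.0)
    on_boundary = sum(1 for x in cases
                      if abs(discriminant(b, x)) <= eps)
    return nm, on_boundary

# Toy one-feature data: the last case falls exactly on f(x) = 0,
# so it is both misclassified and flagged by the boundary check.
cases = [[1.0], [2.0], [-3.0], [-1.0]]
labels = [1, 1, -1, -1]
nm, on_b = nm_and_on_boundary([1.0], cases, labels)
```

In this example nm is 1 and on_b is 1: the case at x = −1 gives f(x) = 0, illustrating exactly the situation that the |f(x)| ≤ 10^−6 check is designed to detect.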
Journal: Statistics, Optimization & Information Computing