Abstract

In this paper, we present a new method for the Iris data classification problem based on the distribution of the training instances. First, we select from the training instances the two attributes of the Iris data that are best suited to classification, that is, the two attributes whose value distributions overlap least across the three species (Setosa, Versicolor, and Virginica). Then, for each species, we calculate the average value and the standard deviation of each of these two attributes. We also calculate the overlapping areas between species, which are formed by the values of these two attributes in the training instances, the average attribute values, and the standard deviations of each species. To classify a testing instance, we calculate the difference between its values of these two attributes and the values of these two attributes of each species of the training instances, and we assign the testing instance to the species with the smallest difference. The proposed method achieves a higher average classification accuracy rate than the existing methods.
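
The classification step described above can be illustrated with a minimal sketch. The abstract does not give the exact difference measure, say which two attributes are selected, or explain how the overlapping areas enter the computation, so everything specific here is an assumption: the sketch takes the two attributes as already chosen, uses the sum of absolute deviations from each species mean scaled by that species' standard deviation as a stand-in difference, and leaves the overlapping areas out. The names fit, difference, and classify are hypothetical, and the training values are toy numbers for illustration, not results from the paper.

```python
from statistics import mean, stdev

def fit(train):
    """Compute per-species (mean, std) for each of the two selected attributes.

    train maps a species name to a list of (attr_a, attr_b) pairs.
    Each species needs at least two training instances for stdev().
    """
    stats = {}
    for species, rows in train.items():
        columns = list(zip(*rows))  # one tuple of values per attribute
        stats[species] = tuple((mean(c), stdev(c)) for c in columns)
    return stats

def difference(x, species_stats, eps=1e-9):
    """Assumed difference measure (not taken from the paper):
    sum over attributes of |value - mean| / std for one species."""
    return sum(abs(v - m) / max(s, eps) for v, (m, s) in zip(x, species_stats))

def classify(x, stats):
    """Return the species with the smallest difference to instance x."""
    return min(stats, key=lambda species: difference(x, stats[species]))

# Toy training values for two attributes (illustrative only).
train = {
    "Setosa": [(1.4, 0.2), (1.3, 0.3), (1.5, 0.2)],
    "Versicolor": [(4.7, 1.4), (4.5, 1.5), (4.9, 1.5)],
    "Virginica": [(6.0, 2.5), (5.1, 1.9), (5.9, 2.1)],
}
stats = fit(train)
predicted = classify((4.6, 1.5), stats)
```

Scaling by the standard deviation is one way to let tightly clustered species count deviations more heavily than widely spread ones; a method that also uses the overlapping areas, as the abstract describes, would refine this difference rather than replace the overall structure.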
