Abstract

This study investigates the errors of misclassification associated with the Edgeworth Series Distribution (ESD) with a view to assessing the effects of sampling from a non-normal distribution. The effects of applying a normal classificatory rule when the underlying distribution is actually a persistent non-normal distribution were examined. This was achieved by comparing the errors of misclassification for the ESD with those for the normal distribution (ND), using small sample sizes at every level of the skewness factor. The simulation procedure for the experiment was implemented in R using a numerical inverse-interpolation method to generate a uniformly distributed random variable N. A configuration size of 1000 was obtained for the two training samples drawn at every level of the skewness factor (λ3) in the range (0.00625, 0.4). This was repeated for different small sample sizes, comparing the errors of misclassification of the ESD with those of the ND. The simulation results showed that, as the skewness factor (λ3) increases, the optimum probability of misclassification E12E decreases and E21E increases, while the optimum total probability of misclassification remains stable. Furthermore, E12E ≥ E12N and E21E ≥ E21N at every level of λ3. Thus, the total probabilities of misclassification are not greatly affected by the skewness factor, which asserts that the normal classification procedure is robust against departure from normality.
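
A minimal sketch, not the authors' code, of the simulation step described above: it assumes the first-order Edgeworth expansion of a standard normal density with skewness factor λ3, f(x) = φ(x)[1 + (λ3/6)(x³ − 3x)], so that F(x) = Φ(x) − (λ3/6)φ(x)(x² − 1), and generates ESD variates in R by numerically inverting F at uniform random variates. The function names esd_cdf and resd are hypothetical, introduced only for illustration.

esd_cdf <- function(x, lambda3) {
  # CDF of the assumed first-order Edgeworth expansion
  pnorm(x) - (lambda3 / 6) * dnorm(x) * (x^2 - 1)
}

resd <- function(n, lambda3) {
  # Inverse-interpolation step: solve F(x) = u for each uniform variate u
  u <- runif(n)
  sapply(u, function(ui) {
    uniroot(function(x) esd_cdf(x, lambda3) - ui,
            lower = -10, upper = 10)$root
  })
}

set.seed(1)
sample1 <- resd(1000, lambda3 = 0.4)   # one training configuration of size 1000 at lambda3 = 0.4

Note that an Edgeworth expansion of this form can take small negative values in the tails as λ3 grows, a well-known feature of the approximation; the sketch above ignores this detail.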

Highlights

  • The aim of this study is to investigate the errors of misclassification associated with the Edgeworth Series Distribution (ESD).

  • This implies that equations (41) and (42) are functions of the skewness factor (λ3) in the range (0.00625, 0.4).

  • We have investigated the effect of sampling from a persistent non-normal distribution by examining the normal classificatory rule when the underlying distribution is an Edgeworth Series Distribution (ESD).



Introduction

The importance of studying discrimination and classification problems with a view to assessing the effects of departure from the usual assumption of normality cannot be overemphasized. We are concerned with the existence of two or more groups and a sample of observations from each of the groups. We are required to design a rule, based on measurements from these observations, for allocating a new observation to the correct population when we do not know from which of the two populations it emanates [1, 22]. Classification is concerned with the prediction or allocation of observations to groups for which a sample of observations is given. The problem is to classify the observations into groups that are as distinct as possible [16].
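
For concreteness, a minimal sketch (not the authors' procedure) of the kind of normal-theory classification rule whose robustness is examined in this study, written for two univariate populations with a common variance. The function name classify_normal is hypothetical, and the cut-off at zero assumes equal prior probabilities and equal costs of misclassification.

classify_normal <- function(x, sample1, sample2) {
  # Anderson-type W statistic built from the two training samples
  xbar1 <- mean(sample1)
  xbar2 <- mean(sample2)
  n1 <- length(sample1); n2 <- length(sample2)
  s2 <- ((n1 - 1) * var(sample1) + (n2 - 1) * var(sample2)) / (n1 + n2 - 2)
  w <- (x - (xbar1 + xbar2) / 2) * (xbar1 - xbar2) / s2
  ifelse(w >= 0, 1, 2)   # allocate to population 1 if W >= 0, otherwise to 2
}

# An error of misclassification can then be estimated as the proportion of
# test observations from population 1 that the rule allocates to population 2:
# err <- mean(classify_normal(test1, sample1, sample2) == 2)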
